lavv17 / lftp

sophisticated command line file transfer program (ftp, http, sftp, fish, torrent)
http://lftp.yar.ru
GNU General Public License v3.0

Allow quick pre-allocation of files on Windows, probably via sparse files #719

Open ksa-real opened 6 months ago

ksa-real commented 6 months ago

Right now, downloading files onto NTFS on Windows effectively involves a double write. The file is first pre-allocated and filled with zeroes, which means actually writing those zeroes and is slow for large files; only then does the actual download happen.

The ask is to allow creating such files as sparse. Preallocation then becomes instant, and writes run at the same speed as for normally pre-allocated files (probably even more efficiently on SSDs, as no trim is necessary). I've tried the DOS commands from this answer: https://superuser.com/a/314319/1699093. I haven't checked the C/POSIX equivalents yet. I did check that such files, unlike compressed NTFS files, are supported by systems with incomplete NTFS support (e.g. Linux routers).

What do you think?

lavv17 commented 6 months ago

This is a good idea, but it is probably better to disable preallocation on Windows; I'm not sure whether pre-allocation on sparse files actually works.

Alexander.

On Fri, 29 Dec 2023 at 13:35 Sergei Kuzmin @.***> wrote:

Right now downloading files onto NTFS in Windows effectively involves double write. Initially the file is pre-allocated and filled with zeroes, which involves actual write of zeroes and is slow for large files, and then the actual download happens.

The ask is to allow creating such files as sparse. Preallocation becomes instant. Write is the same speed as for normal pre-allocated files (probably even more efficient for SSD as no trim is necessary). I've tried the dos commands from here https://superuser.com/a/314319/1699093. Haven't checked c/posix equivalents yet. I checked that such files are supported by systems with incomplete NTFS support (e.g. Linux routers) opposing to compressed NTFS files.

What do you think?


ksa-real commented 6 months ago

To be clear, by preallocation I mean steps like the ones in the answer linked above:

Windows cmd.exe script:

Type NUL > sparsefile
FSUtil Sparse SetFlag sparsefile
FSUtil Sparse SetRange sparsefile 0 0x40000000
FSUtil File SetEOF sparsefile 0x40000000

After these commands we end up with a file of any desired size that is logically filled with zeroes but occupies virtually no disk space. During parallel downloading (--use-pget-n N), I assume Windows will recognize the sequential access and allocate large contiguous blocks for each stream. The allocation can be checked with fsutil file layout c:\some\file.
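For comparison, here is a minimal Python sketch of the POSIX-side analogue of the steps above. It is an illustration only, not lftp code: the helper names `preallocate_sparse` and `allocated_bytes` are made up for this example, and it assumes a filesystem that leaves a hole when a file is extended with truncate. On NTFS a file only becomes sparse once the sparse flag has been set (the `FSUtil Sparse SetFlag` step above), which this sketch does not do.

```python
import os


def preallocate_sparse(path, size):
    """Create `path` and set its logical size to `size` bytes.

    No zeroes are written explicitly; on filesystems with sparse-file
    support the extended range is a hole that reads back as zeroes.
    """
    with open(path, "wb") as f:
        f.truncate(size)


def allocated_bytes(path):
    """Return the bytes actually allocated on disk, or None if unknown.

    st_blocks counts 512-byte units and is not available on Windows,
    hence the getattr fallback.
    """
    blocks = getattr(os.stat(path), "st_blocks", None)
    return None if blocks is None else blocks * 512
```

Calling `preallocate_sparse("sparsefile", 0x40000000)` and then comparing `os.path.getsize("sparsefile")` with `allocated_bytes("sparsefile")` shows the same effect as the script above: a 1 GiB logical size with, on a sparse-capable filesystem, little or no space allocated.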

I did some experiments, writing random bytes into the sparse file. PowerShell script:

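# Open the existing sparse file for writing (FileMode.Open does not truncate it)
$filePath = "C:\path\to\your\sparsefile"
$fileStream = [System.IO.File]::Open($filePath, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Write)
# Fill a 1 MB buffer with random bytes
$randomBytes = New-Object Byte[] 1MB
$random = New-Object Random
$random.NextBytes($randomBytes)
# Write at the current position (offset 0 here; call $fileStream.Seek() first to target another location)
$fileStream.Write($randomBytes, 0, $randomBytes.Length)
$fileStream.Close()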

After writing to a specific location, I checked what the allocation looked like. I haven't checked how sectors/clusters are actually allocated on the disk.
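The same kind of experiment can be sketched in Python for a POSIX system. This is a rough illustration under assumptions, not what lftp does: `write_random_at` and `allocated_bytes` are hypothetical helper names, and `st_blocks` reports only the total allocation, not the extent layout that `fsutil file layout` shows.

```python
import os


def write_random_at(path, offset, length):
    """Overwrite `length` bytes at `offset` with random data and return it."""
    data = os.urandom(length)
    # "r+b" updates the existing file in place without truncating it
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
    return data


def allocated_bytes(path):
    """Return the bytes allocated on disk (512-byte units), or None if unknown."""
    blocks = getattr(os.stat(path), "st_blocks", None)
    return None if blocks is None else blocks * 512
```

Calling `write_random_at` at a few offsets, one per simulated download stream, and comparing `allocated_bytes` before and after each write gives a rough view of how much of the file has actually been backed by storage.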

Do you mean that there is some explicit action by lftp to fill the file with zeros which can be avoided?