Open ksa-real opened 6 months ago
This is a good idea. But it is probably better to disable preallocation on Windows; I'm not sure that preallocation on sparse files actually works.
Alexander.
On Fri, 29 Dec 2023 at 13:35 Sergei Kuzmin wrote:
Right now, downloading files onto NTFS in Windows effectively involves a double write. First the file is pre-allocated and filled with zeroes, which means actually writing the zeroes and is slow for large files; only then does the actual download happen.
The ask is to allow creating such files as sparse. Preallocation becomes instant. Writes are the same speed as for normal pre-allocated files (probably even more efficient on SSDs, as no TRIM is necessary). I've tried the DOS commands from here: https://superuser.com/a/314319/1699093. I haven't checked the C/POSIX equivalents yet. I checked that such files are supported by systems with incomplete NTFS support (e.g. Linux routers), as opposed to compressed NTFS files.
What do you think?
To be clear, by preallocation I mean actions like those described at the link above:
Windows cmd.exe script:
Type NUL > sparsefile
FSUtil Sparse SetFlag sparsefile
FSUtil Sparse SetRange sparsefile 0 0x40000000
FSUtil File SetEOF sparsefile 0x40000000
After these commands we end up with a file of any desired size, logically filled with zeroes but occupying virtually no disk space. During parallel downloading (--use-pget-n N), I assume Windows will recognize the sequential access and allocate large contiguous blocks for each stream. The allocation can be checked with fsutil file layout c:\some\file.
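As a rough POSIX-side counterpart to the commands above (the C/POSIX equivalents have not been checked in this thread), here is a minimal Python sketch that extends an empty file to its final length without writing any data, then compares the logical size with the space actually allocated. The helper names `preallocate_sparse` and `allocated_bytes` are made up for illustration. `st_blocks` exists only on POSIX systems, where it is usually counted in 512-byte units. This is not lftp's code, and on Windows a plain truncate does not set the NTFS sparse flag the way `FSUtil Sparse SetFlag` does.

```python
import os
import tempfile


def preallocate_sparse(path, size):
    """Create `path` with logical length `size` without writing any data.

    On most POSIX filesystems the extended range becomes a hole,
    so no disk blocks are allocated for it.
    """
    with open(path, "wb") as f:
        f.truncate(size)


def allocated_bytes(path):
    """Approximate on-disk allocation (POSIX only; assumes 512-byte st_blocks units)."""
    return os.stat(path).st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparsefile")
        preallocate_sparse(path, 0x40000000)  # same 1 GiB size as in the cmd.exe script
        print("logical size   :", os.path.getsize(path))
        print("allocated bytes:", allocated_bytes(path))
```

If the filesystem supports holes, the allocated figure should stay far below the logical size, which is the POSIX analogue of what `fsutil file layout` shows on NTFS.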
I did some experiments by writing random bytes into the sparse file. PowerShell script:
$filePath = "C:\path\to\your\sparsefile"
$fileStream = [System.IO.File]::Open($filePath, [System.IO.FileMode]::Open, [System.IO.FileAccess]::Write)
$randomBytes = New-Object Byte[] 1MB
$random = New-Object Random
$random.NextBytes($randomBytes)
$fileStream.Write($randomBytes, 0, $randomBytes.Length)
$fileStream.Close()
After writing to a specific location, I checked what the allocation looked like. I haven't checked how sectors/clusters are actually allocated on the disk.
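The same kind of experiment can be sketched for a POSIX system: write 1 MiB of random bytes at some offset inside a sparse file, then check that the logical size is unchanged and that the untouched ranges still read as zeroes. The function names, the 64 MiB file size and the 32 MiB offset are arbitrary choices for illustration, not anything taken from lftp, and the allocation figure again assumes 512-byte `st_blocks` units.

```python
import os
import tempfile


def write_random_at(path, offset, length):
    """Overwrite `length` bytes at `offset` in an existing file with random data."""
    data = os.urandom(length)
    with open(path, "r+b") as f:  # r+b: write in place without truncating the file
        f.seek(offset)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    return data


def read_at(path, offset, length):
    """Read `length` bytes starting at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def allocated_bytes(path):
    """Approximate on-disk allocation (POSIX only; assumes 512-byte st_blocks units)."""
    return os.stat(path).st_blocks * 512


if __name__ == "__main__":
    size = 64 * 1024 * 1024
    offset = 32 * 1024 * 1024
    chunk = 1024 * 1024
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparsefile")
        with open(path, "wb") as f:
            f.truncate(size)  # logical size only, no data written
        write_random_at(path, offset, chunk)
        print("logical size   :", os.path.getsize(path))
        print("allocated bytes:", allocated_bytes(path))
        print("hole reads zero:", read_at(path, 0, 4096) == bytes(4096))
```

On a filesystem with hole support, only the written range (rounded to the allocation unit) should end up backed by disk blocks. How those blocks are laid out physically is not visible from this sketch.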
Do you mean that there is some explicit action by lftp to fill the file with zeros which can be avoided?