jlouis / etorrent

Erlang Bittorrent Client
BSD 2-Clause "Simplified" License

Add the ability to preallocate files on disk. #56

Closed by jlouis 13 years ago

jlouis commented 13 years ago

Some (old) file systems, like FreeBSD's base file system, are quite bad at handling files that are written "a little bit at a time": when they save the data, they fragment it to the point where it is not even funny.

To mitigate this problem, one can start by filling the file on disk with zeros, so the blocks are already allocated by the time we write real data into them. Newer Erlang releases might get access to fallocate(), which will help, but only on Linux. This behavior may also be the right one on Windows, so implementing it should help there as well.

ghost commented 13 years ago

How is this different from what is done by etorrent_fs_checker:fill_file?

jlouis commented 13 years ago

etorrent_fs_checker:fill_file/2 just moves the file position to the end and writes a '0' there. This creates a sparse file, i.e., one where most of the file is never allocated on disk.
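To make the sparse-file effect concrete, here is a minimal sketch of the "seek to the end and write one byte" idea. It is not the etorrent code: the module `sparse_sketch` and the function `make_sparse/2` are hypothetical names, and writing a zero byte (rather than some other byte) is an assumption.

```erlang
-module(sparse_sketch).
-export([make_sparse/2]).

%% Hypothetical helper, not etorrent_fs_checker:fill_file/2 itself.
%% Extends Path to Size bytes by seeking to the last byte and writing
%% a single zero byte there. On file systems with sparse-file support,
%% the bytes before that position are typically not allocated on disk.
make_sparse(Path, Size) when is_integer(Size), Size > 0 ->
    %% [read, write] opens (or creates) the file without truncating it.
    {ok, Fd} = file:open(Path, [read, write, raw, binary]),
    try
        {ok, _Pos} = file:position(Fd, Size - 1),
        ok = file:write(Fd, <<0>>)
    after
        file:close(Fd)
    end.
```

Calling `sparse_sketch:make_sparse("piece.bin", 1048576)` should give a file whose reported size is 1 MiB, while tools such as `du` may show far less space actually in use.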

Different file systems handle these files differently. Some cope pretty well with being supplied the file contents a little bit at a time. Others can't, FreeBSD's for instance. The file ends up fragmented into thousands of pieces, with very bad disk I/O times as the result (a disk which can read some 30 MB/s straight off is reduced to a 3 MB/s snail).

The best solution is to create a knob which makes us explicitly fill the file with zeros by writing them to disk. There is some old code somewhere in the repository that does this effectively, so it should be somewhat easy... just dig that up.
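As a rough illustration of what such a knob would do, the sketch below writes real zeros over the whole length of the file. It is not the old code from the repository: `prealloc_sketch`, `fill_zeros/2`, and the 1 MiB chunk size are all assumptions made for the example.

```erlang
-module(prealloc_sketch).
-export([fill_zeros/2]).

%% 1 MiB per write; an arbitrary choice for this sketch.
-define(CHUNK, 1048576).

%% Hypothetical helper: (re)creates Path and writes Size zero bytes to it,
%% so the file system has to allocate every block up front.
%% Note that [write] without read truncates an existing file.
fill_zeros(Path, Size) when is_integer(Size), Size >= 0 ->
    {ok, Fd} = file:open(Path, [write, raw, binary]),
    try
        write_zeros(Fd, Size, binary:copy(<<0>>, ?CHUNK))
    after
        file:close(Fd)
    end.

%% Write full chunks while they fit, then one final partial chunk.
write_zeros(_Fd, 0, _Zeros) ->
    ok;
write_zeros(Fd, Left, Zeros) when Left >= byte_size(Zeros) ->
    ok = file:write(Fd, Zeros),
    write_zeros(Fd, Left - byte_size(Zeros), Zeros);
write_zeros(Fd, Left, Zeros) ->
    ok = file:write(Fd, binary:part(Zeros, 0, Left)).
```

Reusing one preallocated zero binary keeps the loop cheap; the cost is that every byte of the file really is written once before any torrent data arrives.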

In the future, we hope to get access to the fallocate() call, which fdmana here at GitHub provided a patch for. This is perfect for operating systems which support preallocating files up front.
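For reference, later Erlang/OTP releases expose this kind of preallocation as `file:allocate/3`. The sketch below assumes such a release is in use. The wrapper `preallocate/2` and the module name are hypothetical, and what the call does underneath depends on the platform.

```erlang
-module(allocate_sketch).
-export([preallocate/2]).

%% Hypothetical wrapper: asks the OS to reserve Size bytes for Path,
%% starting at offset 0. Returns ok or {error, Reason}; on platforms
%% without preallocation support the call is expected to fail, in which
%% case the caller can fall back to writing zeros by hand.
preallocate(Path, Size) when is_integer(Size), Size > 0 ->
    {ok, Fd} = file:open(Path, [read, write, raw, binary]),
    try
        file:allocate(Fd, 0, Size)
    after
        file:close(Fd)
    end.
```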

jlouis commented 13 years ago

I have some code for this one, so my FreeBSD testing doesn't kill me. I expect a merge of it soon to 'next'.

jlouis commented 13 years ago

The code has been tested a bit and seems to work. I am now considering adding a little wait between some of the writes, because we can currently utterly destroy disk I/O for everyone else by hogging the whole I/O elevator and never giving other applications a chance. This is not what we want. We would rather use time to our advantage.
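One way the "little wait" could look is sketched below, purely as an illustration. The module `throttle_sketch`, the function `fill_zeros_throttled/4`, and the `ChunkSize` and `PauseMs` parameters are assumptions for the example, not what was merged.

```erlang
-module(throttle_sketch).
-export([fill_zeros_throttled/4]).

%% Hypothetical helper: writes Size zero bytes to Path in chunks of at
%% most ChunkSize bytes, sleeping PauseMs milliseconds after each write
%% so other applications get a turn at the disk.
%% Note that [write] without read truncates an existing file.
fill_zeros_throttled(Path, Size, ChunkSize, PauseMs)
  when is_integer(Size), Size >= 0,
       is_integer(ChunkSize), ChunkSize > 0,
       is_integer(PauseMs), PauseMs >= 0 ->
    {ok, Fd} = file:open(Path, [write, raw, binary]),
    try
        loop(Fd, Size, binary:copy(<<0>>, ChunkSize), PauseMs)
    after
        file:close(Fd)
    end.

loop(_Fd, 0, _Zeros, _PauseMs) ->
    ok;
loop(Fd, Left, Zeros, PauseMs) ->
    %% The last chunk may be shorter than the others.
    N = min(Left, byte_size(Zeros)),
    ok = file:write(Fd, binary:part(Zeros, 0, N)),
    timer:sleep(PauseMs),
    loop(Fd, Left - N, Zeros, PauseMs).
```

Larger pauses trade a slower preallocation for less pressure on the disk; suitable values would have to be found by measurement.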

jlouis commented 13 years ago

This beast is on 'master' now.