qbittorrent / qBittorrent

qBittorrent BitTorrent client
https://www.qbittorrent.org

Slow Seed/Upload Speed #12535

Closed · ACFiber closed this issue 4 years ago

ACFiber commented 4 years ago

Please provide the following information

qBittorrent version and Operating System

qBittorrent 4.2.3 x64 on Windows 10

What is the problem

qBittorrent's download performance is amazing. It easily maxes out my 1 Gbps down fiber connection. The problem is its poor seeding performance. I did a little experiment to double-check whether I was wrong. I have a 500 Mbps upload connection and tried seeding a torrent as the only seeder. My friend then tried to download the same torrent at his house just down the road, using the same Internet provider. He has a 1 Gbps down connection as well and a decent PC running a 3900X just like mine, and we are both using M.2 SSDs. He started downloading the torrent and connected to me just as expected, but the speed I was seeding at was SO LOW. I was seeding at around 5-6 MiB/s and it would never go above that. I do not have any speed limits set up at all. I then tried sending him the same file through FileZilla and instantly maxed out my upload speed, so I'm more than sure this is not a connection or hardware related problem.

What is the expected behavior

qBittorrent should max out my upload speed (500 Mbps), since I was only seeding to one client (my friend).

I would love to see this problem fixed, as qBittorrent is a truly amazing and nearly perfect piece of software which I always use.

ACFiber

FranciscoPombal commented 4 years ago

3 questions/requests:

ACFiber commented 4 years ago

Thanks for your suggestions Francisco,

I do happen to have 16 GiB of RAM. The torrent was a public torrent and yes I was the only seeder.

FranciscoPombal commented 4 years ago

@ACFiber try to set your cache manually to some high value then, like I suggested above.

If the problem persists, then please:

  • Post settings file + logs (the relevant log from when the transfer happened, or at least the most recent one you have). Make sure to redact any potentially sensitive data in these files before posting.

and we'll go from there.

FranciscoPombal commented 4 years ago

@ACFiber 4.2.4 is now released. You can simply upgrade to it and use disk cache set to auto instead of trying the workaround mentioned above:

qBittorrent 4.2.3 has a known bug (in libtorrent) with setting the disk cache on Windows for systems with >= 16 GiB RAM. If this is your case, try setting the disk cache manually to 128 MiB or above (256 MiB, 512 MiB, ...), and see if the problem persists. This bug is now fixed in libtorrent, and version 4.2.4 will have no problems with auto disk cache on Windows with >= 16 GiB RAM.

verdy-p commented 4 years ago

There's a very likely cause of this problem when downloading large files: if fragments are not downloaded sequentially and a fragment near the end of the file arrives early, writing that fragment forces the filesystem to extend the file, causing excessive I/O to write the many null bytes in between.

So even if the source peer could read and deliver content at a much higher speed without any problem, it is in fact the downloading client that CANNOT request more data from the source, because that client is already too busy waiting for the completion of a very large volume of (unnecessary) writes.

Even if files are downloaded sequentially, writing them progressively to the filesystem, with the file size growing as you go, also causes unnecessary stress to the filesystem due to the many successive allocations of extra space.

You'll then observe that very large files (e.g. one or several dozen gigabytes) download very slowly at the start (except for a few megabytes that the client can download into memory buffers without having to flush them immediately to disk), only because the filesystem is too busy, its I/O queue waiting for the completion of writing many null sectors.

(If you look at the system performance monitor, you'll see that ALL disks used by the filesystem volume you are writing the downloaded file to are busy at 100% of their I/O time for several minutes. During all that time the network download speed falls to 0 from ALL sources, even though they are still connected and ready to service your requests. If the downloading client stays busy for too long waiting for the completion of its writes, these sources will disconnect because the client was silent to them; the downloading client will then have to reconnect later to these sources or find new ones, and may even be denied reconnection for quite a long time, because the client "abused" a connection slot on the source peer for too long.)

Once this "initialization" with null sectors is done in the client, the download restarts at normal speed (if there are sources still connected; otherwise the client has to find new sources). This is clearly not a limitation of the network bandwidth or of the sources, but purely a local limitation of how downloaded files are written to disk (with excessive use of the disk bandwidth and excessive flushing of the filesystem cache by this large amount of written bytes, even though these bytes are all zeroes!).

It makes NO sense at all to NOT preallocate these files (whatever their actual size), only to discover a bit later that a file cannot fit in the remaining free space because qBittorrent attempted to write a fragment too far away from the start. Preallocation should be done early, even before trying to download the first fragment (from any position in the file)!

One way to solve it would be to first allocate the whole file size on disk, but NOT by using file writes of null bytes (which is very slow, notably on RAID arrays, which instantly experience a huge spike of activity), but by using the filesystem call that preallocates the file size: the filesystem allocates enough clusters on disk but tracks that these clusters are still not filled with significant data, so reading from them just returns a buffer filled with zeroes without even having to read from disk, which also saves a lot of space in the filesystem cache.

The Win32 API includes a function that can be used to preallocate space "implicitly" filled with zeroes: the allocation is committed to disk, but the actual clusters are not written on NTFS, which can track "preallocated clusters" that must be read as zeroes (or that will be filled passively at slow speed and low priority on FAT32/vFAT/exFAT). Using this API significantly improves performance when writing files of known size, and has the additional benefit of greatly reducing fragmentation: only one or a few very small entries in the file's cluster map are needed. (On FAT32/vFAT/exFAT, the FAT map can also be written very fast and allocated without excessive fragmentation, but a background system task fills these clusters asynchronously and also intercepts attempts to read them before they are actually flushed with zeroes; on flash storage with NTFS, sectors filled with zeroes are not written at all, as opposed to the case where you just use normal file writes with arbitrary byte counts.)

The same remark applies to any other torrent downloader on any OS and filesystem: as much as possible, you should preallocate the storage space, using the best API of each OS that does not require file writes with buffers filled with zeroes. Such an API also exists on Linux: normally all you have to do is use an "lseek()" call or similar, with a flag indicating that it must go "past the current end of file" instead of returning an error (the OS may still return an error if the request cannot be satisfied because of a lack of free space on the filesystem, or because the write quota allowed for the current user or process is exhausted); then you can write to the file from that position (the file just has to be opened in a "random access" mode, not a "sequential access" mode). A minimal sketch of such preallocation on Linux is given below.
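
For illustration, here is a small sketch of that kind of preallocation on Linux, assuming a POSIX environment; the function name, target path, and size are hypothetical, and this is not qBittorrent's actual disk I/O code:

```cpp
// Sketch only: preallocate a download target on Linux before requesting any
// fragment. Names and error handling are illustrative, not qBittorrent code.
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>
#include <cstdint>
#include <cstdio>

// Returns 0 on success, or an errno value if the space cannot be reserved.
int preallocate(const char* path, std::int64_t size)
{
    int fd = ::open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0) return errno;

    // posix_fallocate() reserves real clusters up front, so later writes
    // cannot fail with ENOSPC and no gigabytes of zeroes go through the
    // write queue.
    int err = ::posix_fallocate(fd, 0, size);
    if (err != 0)
    {
        // Fallback: ftruncate() only extends the logical size; on filesystems
        // with sparse-file support the "hole" is not written at all.
        err = (::ftruncate(fd, size) == 0) ? 0 : errno;
    }

    ::close(fd);
    return err;
}

int main()
{
    // Hypothetical 50 GB target, as in the OSM planet example below.
    int err = preallocate("planet.osm.pbf", 50LL * 1000 * 1000 * 1000);
    if (err != 0) std::fprintf(stderr, "preallocation failed: %d\n", err);
    return err == 0 ? 0 : 1;
}
```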

You can observe this problem when trying to download any "planet file" from the OpenStreetMap database dumps (their individual file size is now about 50 gigabytes): a fast start for a brief time, immediately followed by several minutes during which the disks of the target filesystem are used at 100% of their bandwidth. You can still be connected to a dozen sources, but the reception speed drops very rapidly to 0 bytes/second, even though your network bandwidth is completely unused and all the remote peers are just waiting for your local torrent downloader to request more data.

zecanard commented 4 years ago

If I’m understanding this correctly, this would explain why, on my previous computer, qBittorrent would hang for a very long time (10 minutes or more on very large torrents, e.g. 50+ GiB), even with “Pre-allocate disk space” unchecked. The torrent would be added very quickly to qBittorrent, and everything would keep running just fine. But the moment that new torrent started receiving data, qBittorrent would freeze with high disk activity. Once the freeze ended, it wouldn’t happen again for that torrent (although if I added a new one, it would freeze again).

That was on a 7200-RPM disk running APFS (which supports sparse files). It may even have led to premature/excessive wear and tear, since that drive died after just 5 years, without particularly heavy disk utilization outside of torrenting. My current all-SSD setup does not exhibit such issues because it’s fast enough to blaze through that initial allocation.

verdy-p commented 4 years ago

Even on a fast SSD (M.2), it makes NO sense at all to NOT preallocate these large files: your Mac is still writing gigabytes of null sectors before it can fill them with useful data. And during that time, you've completely and unnecessarily flushed your entire filesystem cache, causing notable performance degradation for any other competing I/O from any other process using the same storage volume.

Not preallocating these files puts a lot of stress on the SSD and accelerates its wear-out, because these sectors are actually written twice: the first time with massive fills of zeroes (unless the SSD hardware or the filesystem in the OS specially handles sectors that are intended to be all zeroes), the second time with the actual downloaded data.

Yes, we need preallocation of downloaded files (and this MUST be the default when downloading any file larger than about 4 megabytes, or whenever downloads are not performed sequentially and the random order is not limited to a maximum "sliding window" of 4 megabytes).

If your system supports "sparse files" (APFS, NTFS, some Linux filesystems) this should be used as much as possible; if it doesn't and you just use normal file writes with buffers filled with zeroes, you won't gain anything. Preallocating files with the correct API will make use of the "sparse files" support in the filesystem; it will be very fast (even on rotating hard disks and RAID arrays), will preserve the performance and efficiency of the filesystem caches (and of other competing processes on the system), will preserve the lifetime of your physical volumes, and will allow much faster downloads of torrents, even in random order from any starting position.

But the random order should still use a "sliding window" whose size should not exceed a few megabytes. DO NOT move the sliding window before you've downloaded and written all bytes at the start and end of the window; continue downloading from the remaining "holes" in that window instead.

You may want to extend this window size only if all space in the "sliding window" is already requested from connected peer sources, all of those sources have a non-zero download rate (averaged over the last minute), and you still have new candidate sources. Be conservative in how you extend the size, and do not allow new peers to start from any random position in the file, but only on file fragments that are closest to the current sliding window.

A tuning parameter could also let qBittorrent users change the maximum download window size (I suggest 4 megabytes, which would be fine for RAID arrays of rotating disks with parity; it could go up to 64 megabytes on a single SSD, or on arrays of SSDs using only striping). A sketch of this selection rule follows.
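
As an illustration of the policy being proposed here (not libtorrent's actual piece picker), the selection rule could look like the following sketch; the structure name and the window arithmetic are assumptions:

```cpp
// Illustrative sketch of the proposed "sliding download window" rule:
// a piece may only be requested while it lies inside a window that starts
// at the lowest still-missing piece and spans a user-tunable byte budget.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct WindowPicker
{
    std::int64_t piece_size;    // bytes per piece
    std::int64_t window_bytes;  // user-tunable, e.g. 4 MiB ... 64 MiB
    std::vector<bool> have;     // pieces already downloaded and written

    // Lowest piece index that is still missing: the start of the window.
    int first_missing() const
    {
        for (std::size_t i = 0; i < have.size(); ++i)
            if (!have[i]) return static_cast<int>(i);
        return -1; // download complete
    }

    // The window does not slide until every hole before its end is filled.
    bool may_request(int piece) const
    {
        int const start = first_missing();
        if (start < 0 || have[static_cast<std::size_t>(piece)]) return false;
        int const window_pieces = static_cast<int>(
            std::max<std::int64_t>(1, window_bytes / piece_size));
        return piece >= start && piece < start + window_pieces;
    }
};
```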

Using mixed disks (an SSD front cache with rotating-disk backends) or any other caching technique will not change anything: consider only the slowest storage part, and take into account the need to synchronize and read from all other disks if you're using RAID with parity and your application is not performing I/O that fills ALL the space of the same RAID stripe. A typical RAID stripe size is 16 megabytes per disk, so if your RAID has 4 storage disks plus one or two parity disks, your application would have to queue its writes so that it completely fills 64 megabytes at once; then nothing needs to be read back from the disks, and the parity disks can be written at the same time from the existing contents of the write buffers for the storage disks.

These are very technical and internal problems, hard to track in an application (they depend intimately on how the storage is physically organized), so the best qBittorrent can do is let the user specify this download window size and experiment to find which value gives the best performance without causing excessive disk usage at the start of downloads.

Note: this download window size is a total for ALL files that you download concurrently, even from distinct torrents, as long as they are downloaded to the same local target disks. If you download multiple torrents, or multiple files grouped in the same torrent, each downloaded file uses an independent part of that same window.

FranciscoPombal commented 4 years ago

I don't think it is normal for a download to hang for multiple minutes at the start with preallocation disabled. I have never experienced that even with 200 GiB+ files on 7200 RPM drives.

As for the fact that most torrent workloads are random I/O-heavy, that shouldn't matter for a single torrent being downloaded to basically any SSD, least of all a modern M.2 SSD. The SSD will still be fast enough to not be the bottleneck.

I do agree that qBittorrent could do preallocation better when it is enabled, though. I think this has been discussed in the past (very similar things to what @verdy-p said in his post), but I can't find that discussion now.

verdy-p commented 4 years ago

I experience it constantly when starting to download ~50 GB files onto my big RAID array (which uses groups of 14 "fast" disks, organized as 2 for parity, 2 for spare, and 10 for storage striped in subgroups of 2, plus an SSD frontend and a large SDRAM cache dedicated to each group of 14 disks + SSD). Several of these RAID arrays are mounted together with striping in the OS, and on top of them I have created the logical volumes for the filesystems.

Each time I can see that a complete group of 14 disks is completely busy for several minutes with qBittorrent's attempt to start downloading any fragment from the end of large torrent files without preallocating their space. This is a stupid behavior.

So please:

This global "sliding download window" size is NOT an internal buffer to allocate in memory; it's a functional parameter used only to determine which file fragments you can safely start downloading from all sources at the same time. The actual fragments you first request from each source peer (or that they will accept to serve in a single operation) will usually be much smaller than these 16 megabytes (which is also a good default size for striping/mirroring/parity bands in RAID storage).

With a larger global "sliding download window" size, you allow a larger number of concurrent source peers (in the rare case where you can effectively download concurrently from hundreds of available sources; typical downloads, however, never use more than about a dozen effective sources at once, and only with fragments of at most a few kilobytes on the first attempt, before some of these peers agree to grow the fragment size delivered in one operation if they are not over-solicited, so the window will almost never be completely busy).

This window does not put any limit on the number of target peers to which you can serve any file fragment that you have already downloaded or that you are sharing directly (they only need read access to this file content); the "sliding download window" only limits the write access used to save the file content.

verdy-p commented 4 years ago

So the summary tag "network" for this issue is simply wrong.

In fact this is absolutely not a networking problem, but a strictly local "file" I/O problem (for writes only), and only on the downloading peers (it does not concern the uploading peers at all).

And the initial suggestion of increasing the in-memory cache size for the local filesystem is also wrong (it does not solve the problem at all, except for downloading small files not exceeding a few megabytes).

A workaround for users, however, is to download large torrents onto an SSD or a RAM disk with enough free space to fit the full files, and once the download completes, transfer these files (by sequential copy) to other storage.

You can instruct qBittorrent to download by default into such a fast SSD or RAM disk, but there's still no option in qBittorrent to move the completely downloaded file to another location. (Such an option already exists in the downloaders used by web browsers, which can use a fast temporary storage space for partially completed downloads before moving the complete file to the target destination.)

If you perform the move manually (for example with File Explorer), you first need to stop sharing the file you've just downloaded, and this is a bad strategy for keeping torrents alive on the network with good resharing ratios.

We need support in qBittorrent for a command that lets a user move a shared file to another location while continuing to share it, or change the target of a download after it has started without cancelling the partial download (this may be needed if the initial target has insufficient space available)...

FranciscoPombal commented 4 years ago

@verdy-p We don't have sufficient info from the OP to determine whether or not they are experiencing the same issue, and due to the same issues you are describing. Frankly, at this point, you should open a separate issue report to talk about your problem with preallocation specifically. If it then turns out that these issues are related, we'll take appropriate action. But for now, this could very well be simply related to the disk cache bug in 4.2.3 that was fixed in 4.2.4, or due to some network performance reason.

verdy-p commented 4 years ago

This is VERY easy to see: start downloading a torrent larger than 128 megabytes. The first 128 megabytes can be downloaded instantly (independently of the fragment positions in the files), then the download stalls as soon as the 128 MB write queue (the standard for processes on Windows) is full, if any fragment in the write queue falls far away from the start of the file: because that fragment is located gigabytes away from the start, the output queue is filled by gigabytes of null sectors that must be written before that fragment can be written after them (null sectors written needlessly only to extend the file size, their content meaningless since they lie in parts of the file that are still not downloaded and were not even requested yet).

If there's no support for sparse files, what can be done instead is to write the fragments to a temporary file in random order (in the same order as they are received), and then have another background job reorganize that temporary space and commit the data into the real file in the correct sequential order as soon as possible. While doing that, any committed data that was read from the random-order temporary file and written to the real file sequentially can be reused as temporary workspace for other fragments received in random order.

But I do think that no fragment should even start being downloaded before the whole target file has been allocated on disk (even if this means writing gigabytes of null bytes when there's no support for sparse files on the target storage volume): only when the file is correctly and fully allocated (and no error was returned for lack of storage space or exhaustion of user/process quotas on disk) can the normal download start and be committed directly to that preallocated space.

You'll see as well that just writing gigabytes of nulls causes the same heavy disk I/O usage (100% busy time on all disks of the storage volume) for the same long time (several minutes during which no download can effectively occur, except for fragments whose space is already preallocated and written on disk). But at least the download can start almost immediately, before the whole file is allocated and initialized, for fragments that are at the beginning of the file in the already-initialized region.

But starting to immediately download a fragment located gigabytes away from the start of a file is a bit stupid, as there's NO guarantee that there will even be enough storage space for it (qBittorrent does not perform any early check of the file size to see if there's enough free space, and the effective free space may also change without notice in the background due to other uses of the volumes: other files created by other concurrent torrent downloads, or files created by concurrent processes).

Preallocating space on disk is the ONLY safe way to go for any large download (any file larger than 128 megabytes), and downloading a fragment located outside the space that is really preallocated on the target volume is really stupid.

And no, the "disk cache bug" in 4.2.3 is unrelated. This long startup delay (for files larger than 128 megabytes) exists as well in qBittorrent 4.2.4 with that fix, and for exactly the same reasons I described. It is easy to reproduce if the download comes over fast fiber access (with gigabit/s download speed and frequently tens or hundreds of megabits/s from the available torrent sources).

You cannot observe this problem if your storage is a fast SSD but your Internet connection is a DSL line or a mobile link with slow download speed, because then the bottleneck is not the storage disk but the network: your disk will still be fast enough. You will see it if you have a fast Internet connection and fast torrent sources, but considerably slower storage (or if your storage is on a shared filesystem with strict usage quotas).

And this problem exists even if the target volume is a fast SSD (the only difference being that an SSD is much faster at writing gigabytes than a hard disk or a RAID array with mirroring and/or parity disks, so the delay on an SSD is shorter), but this bug also accelerates wear on the SSD due to the doubled total volume of writes: first to initialize sectors with nulls, almost needlessly, if the target volume is on a filesystem without sparse-file support, then to write the effective downloaded data.

I have given suggestions here to avoid that, the best one (working on any type of physical disks and filesystems) being to commit fragments that are downloaded in random order to a temporary space, in the same order as they were downloaded, for as long as their space in the real target file has not been allocated. To allocate this space for the final file, you can either

This temporary space then really works as a "cache on disk" (it can even be used to read fragments that you've downloaded and not yet committed to the target file, but that you can reshare instantly and without limitation to other peers). In all cases, this is a global cache (for all fragments of any file in any downloading torrent) whose size must remain below a maximum. If all of its existing space is used by uncommitted fragments, and the temporary space that grows sequentially on demand has reached a size where any further expansion causes delays above a couple of seconds, you should stop increasing its size. Use that space in MRU order: write new fragments first to the areas that were used by the most recently committed fragments, in order to maximize the efficiency of the filesystem cache for that cache file.

This temporary space does not have to be very large, as it is reusable for any fragment (from any concurrently downloaded torrent) as soon as that fragment can be committed to the real files; reusing it will also maximize the hit rate of the filesystem cache, which will frequently save many reads/writes to it. The advantage of this temporary space is that it can be potentially much larger than the filesystem cache in memory, or than any cache or buffers in memory inside qBittorrent itself, and it allows better coexistence of qBittorrent with other concurrent processes on the system.

And committing data from the temporary space to the real file space can be done more gently, without necessarily eating 100% of the disk I/O for several minutes; it can be done by a separate background task in qBittorrent that just has to manage how fragments are allocated and mapped inside the temporary space, and it still allows downloads to start almost immediately. Such a commit to the real target files is always sequential; the final files are almost always written only once instead of twice, without delays imposed by the OS on the I/O write queue of the qBittorrent process. And qBittorrent no longer has to maintain excessively large caches in memory (every cache in memory is instead inside the OS managing the filesystem), so there's no risk that this in-memory cache gets "paged out" to another temporary space on disk, causing severe performance degradation for the other applications or system services on the same host. A sketch of the bookkeeping this would need is given below.
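
To make the bookkeeping concrete, here is a small sketch of the index such a staging area might need; the structure names and slot layout are purely hypothetical and not part of qBittorrent or libtorrent:

```cpp
// Sketch of the on-disk staging ("temporary space") idea described above:
// fragments received out of order are appended to a scratch file, and an
// in-memory index remembers where they sit until a background task commits
// them sequentially to the preallocated target. Purely illustrative.
#include <cstdint>
#include <deque>
#include <map>
#include <tuple>

struct FragmentKey
{
    int file_id;           // which download target
    std::int64_t offset;   // byte offset inside that target
    bool operator<(FragmentKey const& o) const
    { return std::tie(file_id, offset) < std::tie(o.file_id, o.offset); }
};

struct StagingIndex
{
    std::int64_t slot_size;                      // fixed-size slots in the scratch file
    std::deque<std::int64_t> free_slots;         // MRU order: reuse the hottest slots first
    std::map<FragmentKey, std::int64_t> where;   // fragment -> slot offset in scratch file

    // Record a fragment that was just written to the scratch file.
    void stage(FragmentKey key, std::int64_t slot) { where[key] = slot; }

    // Called by the background committer once the fragment has been copied
    // into its final position; the slot becomes reusable immediately.
    void committed(FragmentKey key)
    {
        auto it = where.find(key);
        if (it == where.end()) return;
        free_slots.push_front(it->second);       // most-recently-used first
        where.erase(it);
    }
};
```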

verdy-p commented 4 years ago

Final note: the "temporary space" used to initialize large files at the start of downloads does not have to be on the same filesystem as the target files. It is best to allocate it in the normal temporary space of the application's user (%TEMP%), which will usually be on a faster filesystem (a local SSD rather than a large target RAID). It will remain small, and it can be deleted at any time once all the file fragments it temporarily contains have been committed to the target files.

It can also remain persistent as long as qBittorrent is running, and it can be truncated to a lower minimum size (when it no longer contains any uncommitted fragment for downloads in progress), in order to keep it available instantly for any new download the user may want to start. It is global for the application, shared by every file that starts downloading via any torrent. It is not needed at all for downloading small files (below about 4 megabytes) that can be written directly to the target volumes, but it must still be used if the files downloaded concurrently total 128 megabytes or more, because your target disks will have their I/O bandwidth consumed by the same write queue for all these files, independently of their individual sizes.

The 128-megabyte limit given above is the current default maximum volume allowed for the I/O write queue used (on Windows) by the qBittorrent process. Its value may be tuned by process settings or OS-level quotas, or applied forcibly by the OS when it detects that the process is using 100% of the output bandwidth of a set of target disks (in which case the OS suspends the thread that wants to add new pending writes to the system queue, allowing only concurrent threads that don't want to write to the same target disks to continue working).

FranciscoPombal commented 4 years ago

Let's wait to hear back from @ACFiber and go from there.

verdy-p commented 4 years ago

Francisco Pombal wrote:

@verdy-p We don't have sufficient info from the OP to determine whether or not they are experiencing the same issue, and due to the same issues you are describing. Frankly, at this point, you should open a separate issue report to talk about your problem with preallocation specifically. If it then turns out that these issues are related, we'll take appropriate action. But for now, this could very well be simply related to the disk cache bug in 4.2.3 that was fixed in 4.2.4, or due to some network performance reason.

I have version 4.2.4 and still the same issue! This is definitely not a network performance issue (verified on a Gigabit fiber Internet access, for which I also have a measurement of its current use, and also by a test on a local network to make sure this is not related to a bottleneck somewhere in the routing).

I can clearly see the huge volume of null sectors being written into the new large target file, and this always happens immediately after downloading the first 128 megabytes: then everything stops downloading until the target file has nearly reached its maximum target size, because of the very early attempt to download (nearly instantly and with success) and save the last fragment. That save stalls in the I/O queue; other fragments can be downloaded and added to the output queue for the drive without delay, until the maximum output queue size is reached.

I consistently get more than 50 megabit/s (from torrents) from the first few peers, enough to download 128 megabytes in 5-10 seconds; then it stalls for about 45-50 minutes if the target file to create on my RAID is 50 gigabytes (it stalls for a bit more than 1h30 if the file to create is 100 gigabytes; the "stalled" delay is proportional to the total file size, while the same 128 megabytes is still downloaded in 5-10 seconds).

If I download the same 50 GB file to a local SSD (not the RAID), the delay to initialize the 50 GB file drops to about 2 minutes, but it is still present.

The delay is in fact roughly the same as if I were performing a local copy of a 50 GB file from an SSD to the same target RAID (with File Explorer, or with Robocopy, which is just a bit faster while keeping the system more responsive, but with the disks still used constantly at 100% of their queuing capacity); the delay matches the same volume of data to write to the target volume. I can reproduce the same delay on Linux when writing the same volume of data to the same target disks with a simple "dd" command taking /dev/zero as input (so this is also not a bottleneck caused by reading from disk, but only a bottleneck for writing that quantity of data to the target disks).

Note: my RAID takes about 20 hours to synchronize a newly added 1 TB disk into the same disk group, so it requires about 1 hour to write 50 GB. Very large files (especially collections of large files, including multiple versions) are usually not downloaded to an SSD but to hard disks or disk arrays, and to store many versions of such big files, hard disks are typically mounted in a RAID array. Even if the RAID array includes a frontal SSD cache, or a large frontal DRAM cache backed by the local SSD, the DRAM or SSD caches will not help at all: above some large volume of writes they are rapidly no longer used, and you hit the maximum capacity of the slowest components, the individual hard disks, independently of the topology of these disks in the storage cluster. All you can do to improve the speed is increase the number of disk groups using independent disks and then stripe their total capacity; if the array includes mirroring or parity disks, the maximum write performance drops by about 30%, but if you double the number of volume groups, the performance grows by nearly 100%, as expected on a well-behaved array.

This bug in qBittorrent is clearly a bottleneck caused by excessive local I/O to disks that cannot absorb such a massive volume of writes and be ready to do anything else for a long time, and not a bottleneck of the network. It is very easy to reproduce, in fact, and I don't understand why you can't see it, unless your Internet access is really slow and you did not actually try to download a very large file of tens of gigabytes, like the OSM planet file.

See this RSS feed, updated with a new version of the OSM planet files added each week:

https://osm.cquest.org/torrents/rss.xml

Or the following magnet link for trackers of the last planet dump file (50.46 GB):

magnet:?xt=urn:btih:c69b14293048f1cca2f906d4caa9f52d793a7c96&dn=planet-200420.osm.pbf&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce&tr=udp%3a%2f%2ftracker.torrent.eu.org%3a451%2fannounce&tr=http%3a%2f%2ftracker.computel.fr%3a80%2fannounce&tr=http%3a%2f%2ftracker.cquest.org%3a6969%2fannounce&ws=https%3a%2f%2fftp.spline.de%2fpub%2fopenstreetmap%2fpbf%2fplanet-200420.osm.pbf&ws=https%3a%2f%2fftp5.gwdg.de%2fpub%2fmisc%2fopenstreetmap%2fplanet.openstreetmap.org%2fpbf%2fplanet-200420.osm.pbf&ws=https%3a%2f%2fftpmirror.your.org%2fpub%2fopenstreetmap%2fpbf%2fplanet-200420.osm.pbf&ws=https%3a%2f%2fplanet.openstreetmap.org%2fpbf%2fplanet-200420.osm.pbf&ws=https%3a%2f%2fplanet.passportcontrol.net%2fpbf%2fplanet-200420.osm.pbf

If I download the same file only sequentially from the same remote server using HTTP, and not using a torrent to download it in random order (including the last fragment downloaded very early but committed to disk too early, with the implicit initialization of the large volume before it), this delay disappears because there's no initial huge volume of null sectors needlessly written, and I see an almost constant download speed at a decent rate (however slower on average, since there's a single source instead of multiple sources used in parallel with torrents), but never an overlong period of 10 minutes or more at 0 bit/s followed by a forced disconnection from the remote source, like what occurs systematically with seeding peers for torrents (so this also proves it is not a bottleneck somewhere in the Internet routing).

If I use the same magnet with another torrent client that performs only sequential downloads for large files (never starting from random positions far from the start of the file), or that preallocates the target file on the filesystem before starting to download it, there is NO stalling for overlong periods after downloading the first few megabytes, like there is in qBittorrent.

The "solution" of using only sequential downloads is bad for the health of torrents for large files (like the OSM planet files, or Linux ISO images that are typically several gigabytes): it results in a very slow (and unstable) growth of the number of seeding peers capable of distributing such large files. With sequential download, the end of the file will only be available from the initial seeder, as most other peers will disconnect as soon as they've finished their download, or after a very short grace time, without reaching the goal of a minimum redistribution ratio of what they downloaded. With sequential downloads, the initial source(s) remain single points of failure: if they disconnect, none of the remaining peers performing sequential downloads will be able to complete their downloads; they all depend on the initial seeder, which alone has the rest of the file but may no longer be available, or may not be able to satisfy the demands of all the other peers trying to get these missing parts.

Randomizing the download order of fragments should be the goal for all peers in a healthy torrent network, allowing fast growth of the distribution, with many more seeding peers connected and participating (including all the peers that are downloading and still don't have the full file content, but that will instantly help redistribute any fragments they have already downloaded, reducing the workload of the initial sources): the random download order removes the huge dependency on the initial seeders. This is especially important for the distribution of popular large files that are frequently updated but come from sources with limited network capacity (like most Linux distributions, or providers of open databases): when a new update is announced by these providers, many people will attempt to download these large files simultaneously, and that's where torrent-based distribution can considerably accelerate and improve the distribution (without forcing providers to install and maintain a server with large capacity, which can be very costly for them, notably in open source and open data projects).

And an obvious optimization of the random order is to select first (if possible) the fragments held by the lowest number of seeding peers: once you've downloaded those fragments, your peer can instantly redistribute them to other peers while it remains busy downloading other fragments. The health of the network then grows rapidly, with all parts of the file quickly covered by a minimum number of participants, even if most participants are peers with incomplete files and downloads in progress.

If multiple fragments are available from the same number of sources, you should prefer downloading from the sources that have the lowest completion level of the same file (these are only the preferred strategies, which can be bypassed after a first attempt with these ideal sources; the initial seeders, or other sources with the full file, are the last to try for these fragments). If multiple sources share the same fragment with the same availability and the same completion level, you can pick among them randomly before connecting.

Once you complete the download of a fragment from a source and are still connected to it, you may keep the connection and download any other fragment that this source has, ideally again starting with the fragments that have the lowest availability among the other peers you know, and in random order otherwise.

The random download order (possibly optimized as described in the previous paragraphs) only works well if you have preallocated the file on the target storage. I see absolutely no reason not to perform this preallocation first (and report an error to the user if that space cannot be allocated due to lack of free space, write access restrictions, or exhaustion of storage quotas) before downloading any fragment.

You could even choose to download into a file that already exists on the target volume (including read-only files), without deleting its content first.

That preexisting file would first be verified fragment by fragment to detect corruption or missing parts; its size would be adjusted as appropriate (preallocating the extra space needed, or truncating the existing file) if the target file is not read-only, and then the download of missing/corrupted fragments would start if needed. (Read-only files can't be updated: if the target file is read-only, it contains a different version, and the download can only go to another target file; qBittorrent can then open a file selection dialog to let the user pick a new location or cancel the download.)

This lets anyone join an existing torrent with a local copy of a file they already have, lets a user update an older version of the file with a newer version that replaces its existing local content, and lets a user repair a locally damaged file without having to delete it and download it entirely again.

This last possibility also minimizes the local storage space requirement for large (non-readonly) files that need frequent updates, and will minimize the fragmentation level on the target volume.

ACFiber commented 4 years ago

Hello everyone, sorry for my late reply.

The problem seems to have improved with version 4.2.4. I am aware that version 4.2.5 is already out and I will be doing some testing with it as well.

FranciscoPombal commented 4 years ago

@ACFiber thanks for getting back to us. As soon as you know more, please post the results. In particular, it would be interesting to know if you experience the problem with disk cache set to auto.

ACFiber commented 4 years ago

@FranciscoPombal If I'm not mistaken, doesn't disk cache only matter for writing information (downloading)?

FranciscoPombal commented 4 years ago

@ACFiber It's the size for the read and write cache (search for cache_size here: https://libtorrent.org/single-page-ref.html).

cache_size is the disk write and read cache. It is specified in units of 16 kiB blocks.
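
For reference, a hedged sketch of how that setting maps to bytes when configured through libtorrent's settings_pack (libtorrent 1.2.x-style API; the 512 MiB figure is just an example value, not a recommendation):

```cpp
// Sketch: setting libtorrent's cache_size explicitly. The value is given in
// 16 KiB blocks, so 32768 blocks = 512 MiB. Session setup is minimal and
// only for illustration.
#include <libtorrent/session.hpp>
#include <libtorrent/settings_pack.hpp>

int main()
{
    lt::settings_pack pack;
    // 512 MiB expressed in 16 KiB blocks: 512 * 1024 / 16 = 32768
    pack.set_int(lt::settings_pack::cache_size, 32768);

    lt::session ses(pack);
    // ... add torrents here ...
    return 0;
}
```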

verdy-p commented 4 years ago

So you did not reply to this question: how is the allocation performed on disk? The qBittorrent docs say that the "full allocation" mode uses writes of zeroes. This is not necessary at all; the Win32 API has this function: SetEndOfFile https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setendoffile

It performs the full allocation instantly (when increasing the file size) or truncates the file.

If increasing the file size, no null sectors are written: the data between the old end of file and the new end of file is left in an "undefined" state.

But on NTFS volumes, which include special handling for sparsely written files, this file size extension can be done almost instantly by creating a sparse area marked as implicitly filled with zeroes if it is ever read; only the end of the cluster in the last sector containing the old end of file is written with zeroes, so any attempt to read between the old end of file and the new one will not reveal past contents of the newly allocated clusters (which could be a security issue, as they may contain data from old deleted files that were not blanked before deletion or truncation). https://docs.microsoft.com/en-us/windows/win32/fileio/sparse-files

Any file on NTFS (and on some other filesystems that support sparse files) can be converted to use sparse areas, filled with an implicit value (zeroes by default, but this may be changed): https://docs.microsoft.com/en-us/windows/win32/fileio/sparse-file-operations

The Win32 API documentation says explicitly:

"When you perform a read operation from a zeroed-out portion of a sparse file, the operating system may not read from the hard disk drive. Instead, the system recognizes that the portion of the file to be read contains zeros, and it returns a buffer full of zeros without actually reading from the disk."

And this works in ALL applications using basic file I/O, without being aware of sparse file capabilities of the filesystem.

"To determine whether a file system supports sparse files, call the GetVolumeInformation function and examine the FILE_SUPPORTS_SPARSE_FILES bit flag returned through the lpFileSystemFlags parameter. "

When you use SetEndOfFile() on a file, the Win32 API uses this FILE_SUPPORTS_SPARSE_FILES capability to determine how to prefill the new area:

"When a write operation is attempted where a large amount of the data in the buffer is zeros, the zeros are not written to the file. Instead, the file system creates an internal list containing the locations of the zeros in the file, and this list is consulted during all read operations. When a read operation is performed in areas of the file where zeros were located, the file system returns the appropriate number of zeros in the buffer allocated for the read operation. In this way, maintenance of the sparse file is transparent to all processes that access it, and is more efficient than compression for this particular scenario."

So even if the application fills the gap by "writing" zeroes, and the file has been prepared so that regions of zeroes can be converted to sparse areas, nothing is written to the clusters: only the internal list of locations inside the file descriptor is updated (this list starts in the MFT in the NTFS file record, possibly augmented by extended attributes stored elsewhere in a separate stream if the NTFS record is too small to hold the list of locations), along with the ends of clusters not explicitly set to zeroes. This considerably accelerates the I/O operations for creating large files.

Of course, if your filesystem is FAT32/exFAT (which have no native support for storing information about sparse areas), the only way to extend a file is to write zeroes; that's why the Win32 API says that the space added for the new file size may have "undefined" content: Win32 does not reset the contents of these sectors allocated from the free area of the FAT volume, and the application should then initialize this area itself:

  1. First try FSCTL_SET_SPARSE with DeviceIoControl() (https://docs.microsoft.com/fr-fr/windows/win32/api/winioctl/ni-winioctl-fsctl_set_sparse); it may fail with an error on filesystems without sparse file support.

  2. Use FSCTL_SET_ZERO_DATA (https://docs.microsoft.com/fr-fr/windows/win32/api/winioctl/ni-winioctl-fsctl_set_zero_data) to add a sparse region filled with zeroes. The response time will be almost immediate on a filesystem that supports sparse files, provided you converted the file to a sparse file; otherwise it will generate a lot of writes on disk and will take a time proportional to the difference between the old and new file sizes to cover the new region (so this can only be used with the "full allocation" mode of qBittorrent, it should be done even before starting to download any fragment of the file, and it should be cancellable at any time).

"If you use the FSCTL_SET_ZERO_DATA control code to write zeros (0) to a non-sparse file, zeros (0) are written to the file. The system allocates disk storage for all of the zero (0) range, which is equivalent to using the WriteFile function to write zeros (0) to a file."

If the requested region cannot be fully allocated (FSCTL_SET_ZERO_DATA returned an error, such as lack of storage space or exhaustion of user quotas on disk), you can drop the file and inform the user: nothing has been downloaded, and you haven't wasted network resources from any source peer.
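
Putting the SetEndOfFile and FSCTL_SET_SPARSE pieces discussed above together, a minimal Win32 sketch could look like the following; it is illustrative only, with hypothetical function names, and is not qBittorrent's actual allocation code:

```cpp
// Sketch of the preallocation path described above (Win32, illustrative only):
// mark the file sparse where the volume supports it, then set the final size
// without pushing any zeroes through the I/O write queue.
#include <windows.h>
#include <winioctl.h>
#include <cstdint>

// Helper: query FILE_SUPPORTS_SPARSE_FILES for the target volume root,
// e.g. L"D:\\".
bool volume_supports_sparse(const wchar_t* volume_root)
{
    DWORD flags = 0;
    if (!GetVolumeInformationW(volume_root, nullptr, 0, nullptr,
                               nullptr, &flags, nullptr, 0))
        return false;
    return (flags & FILE_SUPPORTS_SPARSE_FILES) != 0;
}

// Returns true if the file now has its full logical size reserved.
bool preallocate_sparse(HANDLE file, std::int64_t size, bool sparse_supported)
{
    DWORD bytes = 0;
    if (sparse_supported)
    {
        // Step 1: convert the file to a sparse file (default payload = TRUE).
        DeviceIoControl(file, FSCTL_SET_SPARSE, nullptr, 0,
                        nullptr, 0, &bytes, nullptr);
    }

    // Step 2: extend the file; on NTFS the new range reads back as zeroes
    // without any clusters actually being written.
    LARGE_INTEGER li;
    li.QuadPart = size;
    if (!SetFilePointerEx(file, li, nullptr, FILE_BEGIN)) return false;
    return SetEndOfFile(file) != 0;
}
```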

But once the full allocation is successful, you can start downloading data and storing it. (Note that if the target file is effectively a sparse file, large areas full of zeroes in the downloaded file will actually not be written to disk; they will remain sparse holes. This also saves storage space, because sparse holes are not allocated from free clusters; they exist only "virtually" in the internal location map of the filesystem's file record.) Writes to disk will only be performed for actually downloaded data: you'll see downloads still starting very fast at maximum network speed, before being paced down by the I/O write queue if writing to storage is slower than the download from the network.

We should then no longer observe long delays at the start of a download caused by writing tons of zeroes that completely fill the write queue to the target disks and block further writes to the same disk until those zero writes are complete. (Note that writes to files are ordered: there's no way in Win32 to bypass this order by instructing it to write newer data before older data still not written, even if that older data is only made of zeroes; the zero writes already in the queue cannot be performed after your later attempt to write other data, or that data would later be overwritten by the uncommitted but still-pending zero writes in the queue. It's important to understand that a write of zeroes may already be committed at the file level (with success returned to the application) while the filesystem continues with another lower-level write queue at the filesystem level, or lower down at the disk level: a filesystem may be created inside a lower system, such as a disk array, or in a partition of a storage space also used by other concurrent filesystems with their own queues, and committing to a filesystem is not synchronized with the last commits to the actual disks on which the volume was mapped and then used to create the filesystem.)

Wolfenstein98k commented 4 years ago

Just jumping in to add that I have had qBittorrent for years and I have never once seen it upload more than a few MBs, even if I download many GBs.

I have a typical hardware arrangement, and I have manually reset all settings repeatedly, re-installed, etc., but I simply cannot seed with qBittorrent, and it's been this way for years, across hardware.

I'm happy to provide whatever info anyone requests if they make it clear what's being requested. But I'm positive the issue isn't a bung setting on my end, as I have done my damnedest both to carefully set them as recommended and to totally reset them, and I've never been able to budge the upload speed.

FranciscoPombal commented 4 years ago

Original problem is apparently resolved (most likely it was due to the 16 GiB RAM bug that has long since been fixed), and there has been no follow up from the OP, so I'm closing this.

@Wolfenstein98k

I will say the problem is most likely on your end, but still. Please open a new issue with all the info requested in the template, plus answers to the following questions: