qbittorrent / qBittorrent

qBittorrent BitTorrent client
https://www.qbittorrent.org
Other
28.62k stars 4.01k forks source link

QB moves multiple complete torrents at a time, leading to super high IO contention. #9407

Closed CyrusNajmabadi closed 4 years ago

CyrusNajmabadi commented 6 years ago

Please provide the following information

qBittorrent version and Operating System

QB: 4.1.2 OS: Win10Pro 1803

What is the problem

I have my qb set to download torrents to a very fast SSD array. When complete, those torrents are then moved to a much slower HD array. However, QB is now moving many completed torrents over at hte same time. This leads to huge contention on the HD array, tanking moving speed. i.e. instead of being able to move at 100+ MB/s, the contention makes the overall transfer speed go at 15-25 MB/s.

QB didn't use to work this afaict. It used to move the completed torrents over one at a time.

thalieht commented 6 years ago

The cause was identified https://github.com/qbittorrent/qBittorrent/pull/9373#issuecomment-414839983

CyrusNajmabadi commented 6 years ago

@thalieht the conversation you linked to seems to be about "checking" torrents, not "moving" torrents . I can imagine the same issue is causing both problems. However, i wanted to actually check to make sure that this really is the same problem.

Thanks!

CyrusNajmabadi commented 6 years ago

An example of three of the moves happening at the same time, totally tanking perf to an abysmal 30ish mb/s:

image

thalieht commented 6 years ago

Oops didn't notice the key word. I just moved 2 queued torrents (auto_managed) at the same time from one disk to another and Resource monitor showed activity on the Read column for both of them so i suppose this does not have the same cause as when checking.

CyrusNajmabadi commented 6 years ago

and Resource monitor showed activity on the Read column for both of them

Ok great. So it's not just me. :)

As the original post, this is fairly brutal give how bad HDDs will perform when there is a lot of contention on them.

CyrusNajmabadi commented 6 years ago

Is there any chance that this can be looked at. The regression is super severe and has made qb unusable on my machine. To put it in perspective, here's the backlog of files to move in the qb queue:

image

The list goes far beyond that, but this is all i could fit in a screen. Effectively, because of the excessively slow IO (due to copying multiple files at a time to HDDs), my system is now nearly 10 days behind on actually writing out the data that i've downloaded to the final destination.

Copying multiple files at the same time to the same location makes no sense for HDDs as it means that instead of being able to just go at full throughput speed to a single place in the drive, the drive has to constantly jump back and forth to different locations, decimating the throughput rate.

Seeker2 commented 6 years ago

May be similar to this issue: https://github.com/qbittorrent/qBittorrent/issues/9120#issuecomment-399684532

CyrusNajmabadi commented 6 years ago

Is there any chance this can be looked at? The performance here is still terrible. I basically can only download one torrent at a time since qbittorrent just absolutely hammers my disks when doing the final move. The contention means writes of only a few MB per second. Downloading multiple torrents is impossible since the download rate far surpasses the disk write speed during these moves.

CyrusNajmabadi commented 6 years ago

Hi Qbittorrent/@sledgehammer999. I tried reporting this issue over at libtorrent in case there was maybe something that could be done over there. From @arvidn's perspective:

It sounds like this is primarily a qBT issue. libtorrent could implement some queuing mechanism and detection of which volume is being copied to, but it seems like a stretch of the scope of the library.

libtorrent currently just does what it's told. If qBT asks it to move 3 torrents, they will be moved, and 3 I/O threads will be used doing it (if available). I think the simplest way to address this would be for qBT to have an option to not move torrents in parallel.

Would it be possible to not have qbittorrent try to queue so many moves simultaneously? As moving is going to be fundamentally throughput bound, i can't imagine how doing things in parallel ever really helps anything. Even if you're lucky, and your storage medium experiences no contention slowdown, the best you'll ever do is move things the same speed as you would have if you were doing things serially. Except that each individual file will still take longer to move than doing things one at a time.

Note: this used to not be a problem. I'm not sure if QB changed something in 4.x to do things in parallel with moves. But it would be nice if it could be rolled back. Thanks!

JamesG269 commented 5 years ago

This is still in an issue in 4.1.5. Selecting many files, and moving them to another drive, results in 3 simultaneous copies at a time according to Resource Monitor. After those complete, it starts 3 more. This results in transferring the files at 40-50MB/s instead of the 80-90MB/s the drives are capable of when doing one single copy.

iheartcsharp commented 5 years ago

Looks like libtorrent 1.2 will fix this: https://github.com/arvidn/libtorrent/releases/tag/libtorrent-1_2-RC2

arvidn commented 5 years ago

@iheartcsharp which change are you referring to specifically in that list? off the top of my head, I don't think libtorrent-1.2 changes this situation.

Also, libtorrent-1.2.1 has been released, that's a link to an old pre-release.

ctlaltdefeat commented 5 years ago

@arvidn "move files one-by-one when moving storage for a torrent" does this not refer to what OP means?

Also, what libtorrent version is qbittorrent on right now?

arvidn commented 5 years ago

I think that refers to how an individual torrent is moved. Instead of moving its entire directory, only files that are actually part of the torrent are moved. I.e. if some other files were put there, they won’t be moved.

Multiple torrents can still be moved in parallel.

iheartcsharp commented 5 years ago

I think that refers to how an individual torrent is moved. Instead of moving it’s entire directory, only files that are actually part of the torrent are moved. I.e. if some other files were put there, they won’t be moved.

Multiple torrents can still be moved in parallel.

Hmmm...I see what you are saying. So any idea/plans on limiting 1 torrent being moved at a time? It can just be a simple global limit for any torrents that are being moved.

arvidn commented 5 years ago

So any idea/plans on limiting 1 torrent being moved at a time? It can just be a simple global limit for any torrents that are being moved.

I don't have any plans to build that into libtorrent itself. It seems out of scope for the core library. I'm open to be convinced otherwise though, if someone has any ideas of how it could be implemented and also simplify the API/model at the same time. Just tacking on more complexity to the library does not seem like the right approach.

JamesG269 commented 5 years ago

Not a c++ programmer, but could you do something like just have an array of the drives' usage as source/destinations in moving and checking, and queue further operations on that drive? Here's some pseudo C# I wrote to illustrate:

int[] drives_used = int[26];

Threads_to_use = 4;
Threads_used = 0;
move_files()
{
    int i = 0;
    while (move_queue.count > 0)
    {
        while (Threads_used == Threads_to_use)
        {
            await Task.Delay(100);
        }
        if (drives_used[move_queue[i].file1.drive] == 0 && drives_used[move_queue[i].file2.drive] == 0)
        {
            Threads_used++;
            drives_used[move_queue[i].file1.drive]++;
            drives_used[move_queue[i].file2.drive]++;           
            Task.Run(() => { move_file(move_queue[i]); });
            drives_used[move_queue[i].file1.drive]--;
            drives_used[move_queue[i].file2.drive]--;
            Threads_used--;
            move_queue.remove(move_queue[i]);           
        }
        i++;        
        if (i == move_queue.length)
        {
            await Task.Delay(100);
            i = 0;
        }
    }
}
CyrusNajmabadi commented 5 years ago

Is there any chance this can happen? Moving torrents remains utterly brutal. It can take hours at these sorts of speeds. Any sort of change in torrent location can literally take days/weeks as it churns so slowly through all these files at a 4-5x slower rate than what we'd get if one file would happen at a time.

yaxv commented 5 years ago

I concur with the rest. It would be nice if files were moved one at a time.

EDIT: Then again, I suppose that a temporary solution is to move the files manually and update their new locations in Qbittorrent afterwards.

CyrusNajmabadi commented 4 years ago

@arvidn

I don't have any plans to build that into libtorrent itself. It seems out of scope for the core library. I'm open to be convinced otherwise though, if someone has any ideas of how it could be implemented and also simplify the API/model at the same time. Just tacking on more complexity to the library does not seem like the right approach.

How do you currently move files? Hard to give an idea on how it should be done without knowing that information.

CyrusNajmabadi commented 4 years ago

@arvidn can you explain how you check files one at a time?

arvidn commented 4 years ago

I think it should be covered here.

The short answer is:

arvidn commented 4 years ago

How do you currently move files? Hard to give an idea on how it should be done without knowing that information.

Who is "you" in this question? I don't think I follow, files are essentially moved by calling rename() and if that fails, the file is copied and the original is deleted.

if "you" is qBT, then it calls [torrent_handle::move_storage()](http://libtorrent.org/reference-Core.html#move_storage()) on the torrent handle.

arvidn commented 4 years ago

I don't think there's any disagreement that it would be better if torrents would be moved one at a time. There's just a shortage of engineers to make that happen (just like everywhere else in the world).

CyrusNajmabadi commented 4 years ago

Ok. So if you have a mechanism to only check one torrent at a time, can you not apply that same mechanism to moving one torrent at a time?

I.e. add an active_moving_limit value and use it in the same way you use active_checking_value

arvidn commented 4 years ago

sure I could. I don't think that's in scope for libtorrent though, but I'm willing to be convinced otherwise

CyrusNajmabadi commented 4 years ago

Can you explain the reasoning for why 'checking' (an extremely heavy io op that needs to process entire files) has support for being serial... But 'moving' (a very similar op in nature) does not?

I guess I don't see you made it possible to limit one but not the other. Having either happen in parallel is just not good at all...

arvidn commented 4 years ago

Sure!

CyrusNajmabadi commented 4 years ago

Moving is not a fundamental or inherent part of bittorrent.

Interesting. I got confused then. I thought it was something provided by your library, not something that qbittorrent was doing.

Moving can be done irrespective of the other state the torrent is in; while downloading, uploading or checking

That seems highy speculative and really not a valuable case to optimize for (esp. as it comes with such a high detriment). On most systems, if i'm moving a torrent, that's going to completely saturate the IO system. i.e. moving will happen using as much resources of the system as possible. Having that contend with other operations seems to just be harmful.

. I've sometimes been tempted to enable uploading and downloading for the portion that has been checked already, but it's not trivial)

Please don't do this. Like 'moving', the 'checking' operation completely saturates IO and will perform extremely badly if it needs to contend with other IO ops. This contention can literally lead to 3-5x worse performance in practice (esp. on spinny rust systems), for what looks to be extremely marginal gain.

If you want to support overlapping IO ops with these highly costly ops, i would only think that would be sensible as an opt-in system for users with systems that aren't so highly impacted. On normal systems, where concurrent IO just kills sequential high-throughput IO, it seems like it would be very bad.

--

Heck, we can see that with all the reports so far about how bad even something like concurrent checking was for people. It literally made things a minimum of 20x worse on my system. Who is benefiting at all from that sort of an approach?

arvidn commented 4 years ago

Interesting. I got confused then. I thought it was something provided by your library, not something that qbittorrent was doing.

libtorrent provides a function to move a torrent, and a notification for when it's completed. It has to be done via libtorrent (otherwise it would be a lot more complicated to stop, remove torrent, move storage, re-add torrent). doing it via libtorrent ensures all file operations interleave correctly with other bittorrent related file operations. qBittorrent could add additional logic to only call the move function for one torrent at a time. But it would require more lines of code, and some work to get done.

That seems highy speculative and really not a valuable case to optimize for (esp. as it comes with such a high detriment). On most systems, if i'm moving a torrent, that's going to completely saturate the IO system. i.e. moving will happen using as much resources of the system as possible. Having that contend with other operations seems to just be harmful.

I'm not optimizing for that case. I'm saying it's not a fundamental or inherent part of bittorrent, so I optimize for simplicity in libtorrent. I'm arguing that the additional logic (which, again, I don't think anyone is arguing against) should live in qbt, not in libtorrent. If you think it should live in libtorrent, please make arguments in favour of that.

I don't think it's useful to argue in favour of this logic existing in the first place, I don't think anyone is disagreeing with that.

Please don't do this. Like 'moving', the 'checking' operation completely saturates IO and will perform extremely badly if it needs to contend with other IO ops. This contention can literally lead to 3-5x worse performance in practice (esp. on spinny rust systems), for what looks to be extremely marginal gain.

Spinning rust is on its way out though, and disks that are near the speed of DRAM are in their way in.

If you want to support overlapping IO ops with these highly costly ops, i would only think that would be sensible as an opt-in system for users with systems that aren't so highly impacted. On normal systems, where concurrent IO just kills sequential high-throughput IO, it seems like it would be very bad.

It's not concurrent I/O that's the problem. Under normal conditions you definitely want a pretty deep I/O queue. The issue is with sequential I/O, which both moving and checking are. On spinning disks, that's a lot faster and suffers from cross-traffic of other disk I/O.

It is opt-in already, you can set the number of disk I/O threads to 1. (even to 0, to run all disk I/O in the network thread, but that's not a good idea).

arvidn commented 4 years ago

Heck, we can see that with all the reports so far about how bad even something like concurrent checking was for people. It literally made things a minimum of 20x worse on my system. Who is benefiting at all from that sort of an approach?

Everyone benefits from concurrent disk I/O in bittorrent, since the bulk of the disk access is random, you want the operating system and disk driver to have access to as many disk ops as possible, to optimize sorting. This is especially true for spinning disks, but with solid state disks too, as the throughput is so high, even small latencies can lead to large bandwidth delay products.

You don't see "bug reports" from people benefitting from "concurrent" I/O though, as it just works. You can't generalize the issue of concurrently running multiple sequential disk reads (i.e. checking and moving) to conclude all concurrent disk I/O is bad. I would argue that the "steady state" bittorrent logic hugely benefits from concurrent disk I/O.

as for concurrently checking torrents, I agree that it's not a good idea in general. libtorrent allows clients to do that, but it's not done by default.

CyrusNajmabadi commented 4 years ago

Everyone benefits from concurrent disk I/O in bittorrent, since the bulk of the disk access is random,

Not for checking/moving. That's the point of these issues. The same problems caused by concurrent checking also apply to concurrent moving. both are hugely sequential tasks that get absolutely obliterated by concurrent IO happening at the same time.

I'm not talking about the normal execution behavior when doing stuff other than this massive heavy sequential IO. During those times, concurrent IO is absolutely the right choice (And no one is asking that you change that).

as for concurrently checking torrents, I agree that it's not a good idea in general.

The same problems that arise with concurrent checking arise with concurrent moving. I'm not sure why there seems to be a disconnect here. Effectively, both have to saturate the IO system doing a full sweep of the entire file sequentially. They're practically the same in terms of the impact on the system. If anything, moving is worse because it will saturate one system with the read IO and another with the write IO. During this time, concurrent IOs from other moves simply host perform entirely as almost no systems can adequately handle this happening concurrently.

I'm really not getting how it was appropriate for the entire-file, sequential IO saturation of checking to be done serially... but the same is not true for moving. They're practically equivalent in the need here, and the problems that are incurred if they are done concurrently.

CyrusNajmabadi commented 4 years ago

If you think it should live in libtorrent, please make arguments in favour of that.

Right. I do think it shoudl live in libtorrent for two reasons:

  1. It would benefit all clients, not just a single client like qbittorrent.
  2. It matches and fits into the data model libtorrent already provides. i.e. concurrency for io where it is appropriate, and serial for where it isn't (i.e. checking). The only odd thing out currently is moving. Where it fits the same model as checking, but is put in the same codepaths as concurrency.
CyrusNajmabadi commented 4 years ago

I would argue that the "steady state" bittorrent logic hugely benefits from concurrent disk I/O.

I'm not arguing anothing about hte 'steady state' logic. I would not change that. I would change the logic specifically around when operations are happening that are known to absolutely saturate/thrash the IO system. Clearly this was done for checking. It's known that checking multiple files simultaneously does more harm than good. Moving is similar. Moving multiple files at a time does more harm than good most of the time.

Don't even think of them as checking or moving. Think of htem as both just an io op that can take a huge amount of time, and which doesn't work well when other IO is happening. In that case, both are just different user facing ops of the same sort of system-impacting behavior.

arvidn commented 4 years ago

It would benefit all clients, not just a single client like qbittorrent.

The same argument could be made for having a GUI, as long as you're assuming all clients are qbt and deluge (which is not a safe assumption). I think there are much more important (and closer to the core of bittorrent) that needs attention in libtorrent. In fact, adding a queue for moving torrents is pretty far down my list of things that need to be fixed in libtorrent. Especially given that I don't have a reason to believe it's a lot of work to do on the client side.

It matches and fits into the data model libtorrent already provides. i.e. concurrency for io where it is appropriate, and serial for where it isn't (i.e. checking).

But that isn't the model libtorrent provides right now. Here's the basics of queuing, and here's the torrent states.

I recognize that I am in the qBT project, so I don't want to hi-jack it. But if anyone want to address me on this topic, please focus on arguments of why such logic should be part of libtorrent, as opposed to qbt. It's possible that there are other people that are not convinced that torrents should not be moved concurrently, but I'm not one of them.

@CyrusNajmabadi your encouragement to think of it in certain ways are not such arguments as far as I can tell.

CyrusNajmabadi commented 4 years ago

@CyrusNajmabadi your encouragement to think of it in certain ways are not such arguments as far as I can tell.

I'm not sure what to say then. Both of the points here https://github.com/qbittorrent/qBittorrent/issues/9407#issuecomment-574822042 seem like reasonable cases ot make. They're not an argument in htat "this is right and that is wrong" but an argument of "this would be maximally beneficial" and "this is relevant to do for the same reasons that other changes were relevant to do".

--

Let me turn it around. libtorrent provides both checking and moving right? and in the case of checking, it (or you) seems to have acknowledged several times over that concurrency here was actually detrimental. It seems odd to me that hte library exposes both pieces of functionality, acknowedges that concurrency is not good, but then only supports avoiding concurrency for one sort of op.

It seems like a strange design. If moving were not part of libtorrent at all, i would understand this. but it looks like it is your lib that provides this functinoality. So it seems like a reasonable argument to me that if it provides this functionality it should do so in a manner that is not really bad for perf.

Now, if libtorrent didn't do anything wrt sequential/concurrent IO, i could see the arguments of either "this is out of scope" or "this is really not something libtorrent is capable of". But that isn't hte case afaict. It does support this concept and ensured that IO could be appropriated queued for perf. It just has chosen to only support it for one type of op, even though the library expose both ops.

CyrusNajmabadi commented 4 years ago

Moving is not a fundamental or inherent part of bittorrent.

You say this. but you also support moving as part of libtorrent. So i would say that if that's the case that libtorrent should strive to provdie that functionality in as low an impact fashion as possible.

Otherwise, it would likely be best to not offer this functionality and put it entirely on the onus of the client if they're hte ones that are going to need to do it effectively anyways. Right now it seems like the worst of all worlds. Clients likely think they can use libtorrent to handle IO well for all the ops it exposes. But it turns out that's only true for some of the ops.

arvidn commented 4 years ago

@CyrusNajmabadi it's a zero sum game. I don't grow more time to do more stuff just because something is a good idea. There's an incredibly large (and very easy to underestimate, I do it all the time) maintenance cost to adding features to libtorrent. Keeping features focused to the essentials is really important.

You seem to think that this is the most important thing that needs to be done in libtorrent, I'm suggesting that you only hold (or appear to hold) this view because of ignorance of other things that need to be done in libtorrent. Things that are more essential to bittorrent.

To be a bit more specific: You say "this would be maximally beneficial". Can you prove that? How many users are there of libtorrent, and what kinds of use cases do they depend on? Is this the one issue that most users of libtorrent suffer the most from?

One could implement this in a separate library (sort of like my old libtorrentwebui project on github), if the goal is to share code between projects. It's still not obvious that libtorrent need to have the functionality.

Otherwise, it would likely be best to not offer this functionality

Obviously any client is free to not use the torrent_handle::move_storage() function. It's well documented what it does (if you ask me at least). I don't see how removing this function would help you. It would simplify libtorrent, for sure, so it would help me. But as I explained, there are certain things that can't really be done from the client, since disk access needs to be interleaved correctly. Moving one torrent at a time is trivial in comparison, to do at the client level.

CyrusNajmabadi commented 4 years ago

You seem to think that this is the most important thing that needs to be done in libtorrent,

I never argued that at all. I was simply responding to your requests for an explanation of why i felt that libtorrent was the right place for this.

CyrusNajmabadi commented 4 years ago

To be a bit more specific: You say "this would be maximally beneficial". Can you prove that?

Yes. Or, at least 'prove' in the sense that the alternative is so highly unlikely as to not be interesting:

Doing this in qbitorrent means that only qbitorrent benefits. If other users of the lib exist that use it (but not through qbitorrent) they will not benefit. Once there is a single user outside of qbittorrent that would benefit from sequential moves, the above becomes true.

So for the alternate to be true, i would have to assume that every other client implements moving themselves in a sequential fashion. And even if that's true, my above statement still holds true because that means that all those clients had to implement their own queuing logic.

So it's maximally beneficial to still do it in libtorrent to benefit all those projects.

--

Note: "maximally beneficial" does not mean "more important than other work". it simply means "for this feature, there is hte largest bang/buck doing it in this one place".

CyrusNajmabadi commented 4 years ago

Moving one torrent at a time is trivial in comparison, to do at the client level.

I'm personally fine with that as well.

CyrusNajmabadi commented 4 years ago

@glassez Could you comment on the feasiblity from your perspective of qbittorrent queueing moves of files once complete? this is something i can potentially try to contribute towards. If i could get some information from the core devs about the architecture here and how we might go about serializing these long async operations we don't want to be concurrent. Thanks!

sgmihai commented 4 years ago

Please fix this already. I spent at least 10 hours manually managing and babysitting a large batch of torrents. I got up to 30-40 torrents moving all at once, causing the ssd to hdd moving speeds to stall hard, they got to be much slower than the actual download speed, which caused the temp ssd download drive to eventually fill, which caused all the torrents to stop, which needed pausing the torrents and rechecking them so they can start/finish (they would only start after a while, even if the disk space problem was solved). If no rechecking is done, some will be stuck at 99.9% until rechecking. What a trainwreck.

arvidn commented 4 years ago

Nobody is getting paid to do this work (at least as far as I know), so demands for fixing should ideally be backed by proposed patches. On the other hand, maybe someone ought to be paid for it, from the qbt donations.

sgmihai commented 4 years ago

I see that just their btc balance amounts to $110.000, and thus to say nobody gets paid for this work is a bit of an understatement.

arvidn commented 4 years ago

as far as I know, nobody has taken that money out to pay anyone. But, I'm not involved in qbt development, so I'm speculating.

arvidn commented 4 years ago

maybe it would be more effective to donate towards code bounties, than to just donate towards a general bucket.

sgmihai commented 4 years ago

If that's the case, they should just stop taking donations. Or tell the people it's for their retirement fund, and not supporting development. Seeing as this issue is at least 1.5 years old, and knowing of several other things that were repeatedly requested and took years to fix/implement, I am not very motivated to donate towards this project.

glassez commented 4 years ago

I see that just their btc balance amounts to $110.000, and thus to say nobody gets paid for this work is a bit of an understatement.

@sledgehammer999, can you comment?

xavier2k6 commented 4 years ago

@sgmihai

I see that just their btc balance amounts to $110.000, and thus to say nobody gets paid for this work is a bit of an understatement.

That amount is what was received in total since bitcoin was a donatable option & may or may not be accurate....market values fluctuate

Final Balance:2.83917644 = $24173.81 (approx - market values fluctuate)

If that's the case, they should just stop taking donations. Or tell the people it's for their retirement fund, and not supporting development.

Not supporting development???? - new release as of Dec 2019......Dev's active on issue tracker/even this thread & currently 50 WIP pull requests. Support is very much alive & development is very active!

Seeing as this issue is at least 1.5 years old, and knowing of several other things that were repeatedly requested and took years to fix/implement

Not all issues are qBittorrent related - they can be from different libraries that qBittorrent utilises like libtorrent/Qt/Boost/OpenSSL so it may be up to those authors/dev's to fix.

I am not very motivated to donate towards this project. Nobody is asking/forcing you to - it is (voluntary!) <-Just like the dev's working on this project is.

There are other alternative programs out there for torrenting.........

It's an opensource project so anyone is free to write the code themselves to create a fix/patch for a particular issue or feature request etc.