Closed FranciscoPombal closed 2 years ago
@Seeker2 @arvidn FYI
@FranciscoPombal @thalieht Could this be labeled with discussion and maybe core/meta/libtorrent/performance, or whatever you deem necessary yourselves.
Just did a quick test with qBittorrent 4.2.5 and libtorrent 1.2.6. Enabling this option did reduce the I/O wait, but bandwidth was also reduced by around 200 Mbps and load increased a bit:
Disabled
Enabled
Hardware: Intel 4-core (no HT), 16 GB RAM, Samsung 970 Plus NVMe (XFS)
@apexlir what program is that? Is it prometheus?
Telegraf / Influx / Grafana with a 5s interval
Interesting. In a way, you would expect that lower I/O wait would cause higher CPU usage (because it has to wait less), but the drop in bandwidth would be a bit of a mystery then. This option will probably make piece-picking a little more expensive, since it adds an additional constraint/bias, so that could possibly explain some of it.
I wouldn't expect an NVMe drive to see the greatest benefit from this option. I suspect that all of these writes just go straight into the page cache anyway, but it may contribute to fewer page faults, as writes are concentrated on fewer pages; once a page is flushed, it's unlikely to have to be faulted in again, I suppose.
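To make the bias concrete, here is a minimal illustrative sketch of the idea behind extent affinity (this is not libtorrent's actual implementation; the extent size and function names are made up): pieces are grouped into fixed-size extents, and the picker prefers pieces whose extent already has a download in progress, so writes cluster into fewer contiguous regions.

```python
# Illustrative sketch only, not libtorrent's real piece picker.
# Pieces are grouped into extents of PIECES_PER_EXTENT consecutive pieces.
PIECES_PER_EXTENT = 4  # assumption for illustration

def pick_piece(wanted, in_progress):
    """Prefer a wanted piece whose extent already has a piece in progress."""
    active_extents = {p // PIECES_PER_EXTENT for p in in_progress}
    biased = [p for p in wanted if p // PIECES_PER_EXTENT in active_extents]
    # Fall back to the plain choice when no wanted piece shares an active extent.
    return min(biased) if biased else min(wanted)

# With piece 5 in progress (extent 1), the picker favors piece 6 over piece 0,
# since piece 6 lives in the same extent as the in-progress piece.
print(pick_piece([0, 6, 9], [5]))  # 6
```

The extra set construction and filtering per pick is roughly the kind of added constraint that could account for slightly higher CPU cost when the option is on.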
@apexlir Thanks for the benchmarks.
The difference in the scale and labeling of the graphs is a bit unfortunate, since it makes it quite hard to compare them, especially when the differences are so small. That being said, the drop in CPU I/O wait is quite noticeable, and the drop in bandwidth is small (in relative terms, about 3-4%), but noticeable as well.
The CPU usage looks about the same to me, but again it's hard to compare. Logically it should be higher, because the piece picker logic is more complex when piece extent affinity is ON. I wonder if the CPU is actually becoming the bottleneck here.
Of course, I'll echo that one would expect hard drives to benefit the most from this setting. Furthermore, I would expect the performance uplift to be more significant if torrents are not downloading in sequential mode (I assume libtorrent's auto sequential mode kicked in for this benchmark run). It would be interesting to see both cases benched with hard drives, if someone is up for that.
@FranciscoPombal Perhaps it would be a good idea to add an editable table like the one below to the first post (so all information/benchmarks are in one place), with the set of information @arvidn requires to determine what is working right and what needs to be tweaked. It would also act as a guideline for those wishing to benchmark.
Perhaps a consensus could be reached on which tool is best to use on Windows/Linux/macOS, so that, in a way, everyone will be singing from the same hymn sheet!
Thoughts?
| P.E.A. ON/OFF | Torrent Size | Drive Type | Piece Size | I/O | Bandwidth |
|---|---|---|---|---|---|
| ON | 208 GiB | HDD | 256 KiB | ABC123 | ABC123 |
| ON | 340 GiB | SSD | 4 MiB | ABC123 | ABC123 |
| OFF | 320 GiB | SSHD | 8 MiB | ABC123 | ABC123 |
| OFF | 139 GiB | SSD | 16 MiB | ABC123 | ABC123 |
@xavier2k6 Unfortunately, to produce meaningful data for analyzing the more minute differences, I don't think such a table is sufficient. We need at least as much data and as many different metrics as @apexlir provided, with well-controlled environments and settings, and in a usable format for processing, like CSV.
The torrent size is also not that relevant, as long as it's big enough to exhaust whatever caches there are (this also depends on the amount of RAM a user has, but what I'm saying is there is generally no need to test with 100+ GiB torrents for this purpose).
However, I expect the differences between the more extreme scenarios (e.g. NVMe vs HDD) to be visible even with less rigorous testing.
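As a sketch of the kind of machine-readable output that would help, here is a minimal example using Python's stdlib `csv` module. The column names and all sample values are hypothetical, chosen only for illustration; real rows would come from whatever collector is used (e.g. Telegraf, as @apexlir did).

```python
import csv
import io

# Hypothetical samples: (piece_extent_affinity state, I/O wait %, throughput in Mbps).
# These numbers are made up purely to illustrate the format.
samples = [
    ("on", 12.0, 5400),
    ("off", 31.0, 5600),
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["pea", "iowait_pct", "throughput_mbps"])
writer.writerows(samples)
print(buf.getvalue().strip())
```

A flat file like this, with one row per sampling interval and a column identifying the run, is trivial to load into any plotting or stats tool for comparing runs.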
> but what I'm saying is there is generally no need to test with 100+ GiB torrents for this purpose
I wasn't suggesting using torrents of that size for testing purposes... it was just an example, a quickly edited table of what I had used in a previous issue/thread.
The table usage was only a suggestion for "summary purposes" so all the relevant gathered info could be in one place.
This was a quick and dirty bench, by no means very scientific, since I share the port and depend on 40 public seeders. CPU utilization was indeed up by 5%.
Did another run, this time on a software RAID-0 of WD Red drives (Btrfs):
RUN1: ON RUN2: OFF RUN3: ON
I think we can discard the first run, but there is still a 6% drop in throughput between run 2 and run 3:
CPU utilization was up by a few percent, but not very significantly this time. Curious to see the results on a standard 100 Mbps to 1 Gbps line.
I wonder whether SSDs benefit from piece_extent_affinity.
https://github.com/qbittorrent/qBittorrent/pull/11781 was merged, so as promised in https://github.com/qbittorrent/qBittorrent/issues/11436, here is the thread for testing the performance improvement of piece_extent_affinity.

Versions of qBittorrent with this option exposed in the advanced settings:

Libtorrent version required: 1.2.2 or later (if an older version is used, the option may still be available in qBittorrent's WebUI, but will have no effect).
@xavier2k6 @fusk-l Feel free to post methodology, tests and results here.