jesec / rtorrent

stable, high-performance and low resource consumption BitTorrent client
GNU General Public License v2.0
192 stars 38 forks source link

Optimize code paths of load_session_torrents #37

Closed jesec closed 2 years ago

jesec commented 2 years ago

co: jesec/libtorrent#5

Torrents / Time to Load (seconds) w/o this PR w/ this PR
100 0.0988322 0.100783
500 0.316298 0.2778484
1000 0.5230036 0.3384636
2000 1.1452586 0.488410
5000 5.4459174 0.8843604
10000 24.0999498 1.574791
15000 62.0685972 2.3090436
20000 122.2196932 3.118153
25000 194.3020272 3.9291368
30000 301.594044 4.7896626

image
main: batch checking after the whole session loaded

"receive_hashing_changed" is an O(n) operation, where n is the
number of entities in the download list. It is not efficient when
the function is triggered every time a torrent has been inserted
by "load_session_torrents", making the whole operation O(n^2).

Thus, delay checking until the whole session has been loaded.
view: add a fast path for newly added element

filter_download is often called after a new element is added.

Since new elements are always inserted to the back, it is more
efficient to check the last element first, so we have O(1) instead
of O(n) in this common case.
view: insert directly if sorting is not required

O(n) -> O(1)
main: only set view sorting and periodic resorting when not in daemon mode

Sorting is O(n*log(n)) and very expensive. RPC user (such as Flood
or ruTorrent) does not rely on rTorrent for sorted list.
download_factory: don't check for "removed-right-after-add" when loading session

As no command is attached when loading from session, it is
impossible for this edge case to occur.

"find" on the download list, a "std::list" linked list, is
an O(n) operation. With it, the session loading is O(n^2),
where n is the number of entities in the session directory.

As such, remove this unnecessary check to speed things up.
main: bypass queuing in "load_session_torrents"

It is more expensive to queue all the items and then dequeue and
process them one by one, due to O(n*log(n)) time complexity of
"priority_queue", where n is the number of tasks already in the
queue, and memory reallocation.

We do not need task queuing in the synchronous loading of session.
After all, we called "priority_queue_perform" right after queuing.

As such, call "f->set_immediate(true)", so at any time the queue
has at most 1 task, which speeds up session loading.
codecov[bot] commented 2 years ago

Codecov Report

Merging #37 (8273e52) into master (3082bc1) will decrease coverage by 0.00%. The diff coverage is 0.00%.

@@            Coverage Diff            @@
##           master     #37      +/-   ##
=========================================
- Coverage    8.33%   8.33%   -0.01%     
=========================================
  Files         128     128              
  Lines        7662    7666       +4     
=========================================
  Hits          639     639              
- Misses       7023    7027       +4     
Impacted Files Coverage Δ
src/control.cc 17.20% <ø> (+0.18%) :arrow_up:
src/core/download_factory.cc 0.00% <0.00%> (ø)
src/core/view.cc 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 3082bc1...8273e52. Read the comment docs.