When @glassez says "we" he means qBittorrent :)
This used to be the case, back when adding torrents was always done synchronously and a pointer to an `entry` could be passed in. The problem was that synchronous add is very expensive when adding lots of torrents, so the `entry` would have to be copied, and `entry` is very expensive to copy. A plain buffer is much better for that. However, today, with `bdecode_node`, it would be cheap to copy the parsed state along with the plain buffer.
There is a potential API challenge in making sure that the provided `bdecode_node` actually points into the resume buffer, and also that it's the root node of it (i.e. that it owns the parse state). Any ideas of how that could be solved? (For an introduction to `bdecode_node`, see http://blog.libtorrent.org/2015/03/bdecode-parsers/)
Is `lazy_entry` also expensive to copy?
yes, fairly. It still has the same problem though. It just points into a vector.
actually, it's a bit broader than that. If a `bdecode_node` or `lazy_entry` could be added to the `add_torrent_params` object, that object would not be copyable anymore, because the parsed structure would still point to the old buffer. Maybe there needs to be a type that combines `bdecode_node` and a `std::vector<char>`.
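To illustrate the problem: a minimal sketch of what such a combined object would look like if done naively (the type name is hypothetical, not libtorrent API), and why it cannot simply be made copyable.

```cpp
#include <vector>
#include <libtorrent/bdecode.hpp>

// hypothetical type, for illustration only
struct params_with_node
{
    std::vector<char> buf;          // raw bencoded resume data
    libtorrent::bdecode_node node;  // parse state pointing INTO buf
};

// A memberwise copy duplicates `buf`, but the copy's `node` still
// points into the ORIGINAL object's buffer. Once the original is
// destroyed, the copy's node dangles -- which is why the containing
// object could not safely stay copyable.
```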
IMO, the best solution is to use a copy-on-write strategy for classes like `libtorrent::entry`.
yeah, that would make a lot of sense. Although, even if copying `libtorrent::entry` is avoided, just creating it is pretty expensive (http://blog.libtorrent.org/2015/03/bdecode-parsers/)
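For reference, a minimal single-threaded sketch of the copy-on-write strategy being suggested, using a generic payload in place of `libtorrent::entry`'s variant (the `cow` class is hypothetical, not part of any library):

```cpp
#include <memory>
#include <utility>

template <typename T>
class cow
{
public:
    explicit cow(T v) : m_data(std::make_shared<T>(std::move(v))) {}

    // reads share the payload; copying a `cow` only bumps a refcount
    T const& read() const { return *m_data; }

    // writes detach (deep-copy) only when the payload is shared.
    // note: relying on use_count() makes this sketch single-threaded.
    T& write()
    {
        if (m_data.use_count() > 1)
            m_data = std::make_shared<T>(*m_data);
        return *m_data;
    }

private:
    std::shared_ptr<T> m_data;
};
```

Copying a `cow` is then just a refcount bump; the deep copy is deferred until (and unless) a mutation actually happens.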
The new bdecoding and an implementation of copy-on-write would complement each other perfectly. However, I have some thoughts about the new (and lazy) bdecoding:
Those points, although true from the user's point of view, are not true internally in libtorrent. `lazy_bdecode` came about for two reasons:
(1) loading large .torrent files would use more than twice the amount of RAM of the .torrent file itself (I count that as degrading performance). (2) copying the buffer before parsing it has a performance hit, especially when parsing potentially large .torrent files. It's also expensive when parsing incoming DHT packets, because there are many of them.
In the case of DHT packets, both the packet buffer and the decoded form of it are thrown away at the same time, when the packet has been handled and it's time to receive the next packet into the buffer.
That said, the way `torrent_info` deals with this is to combine a `bdecode_node` with a `boost::shared_array<>` pointing to the underlying buffer. It doesn't do copy-on-write unfortunately, but it copies the buffer and "rebases" the parsed state onto the new copy (there is built-in support for that). Given that .torrent files are immutable, it could probably not even bother copying the buffer, but just increment its refcount.
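The built-in rebasing support referred to here is, as far as I can tell, `bdecode_node::switch_underlying_buffer()`, which repoints an owning root node at a byte-identical copy of its buffer. A rough sketch of the copy-and-rebase pattern (the function name is illustrative):

```cpp
#include <cstring>
#include <boost/shared_array.hpp>
#include <libtorrent/bdecode.hpp>

// copy the underlying buffer and rebase the parse state onto the copy.
// `node` must be an owning root node, and the new buffer must hold
// identical contents -- that is the documented contract of
// switch_underlying_buffer().
void copy_and_rebase(libtorrent::bdecode_node& node
    , boost::shared_array<char>& buf, int const size)
{
    boost::shared_array<char> copy(new char[size]);
    std::memcpy(copy.get(), buf.get(), std::size_t(size));
    node.switch_underlying_buffer(copy.get());
    buf = copy;
}
```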
My first suggestion was only a prerequisite to the second. My main idea is to encapsulate the buffer in the `entry` class, but not to copy the source buffer into it, just to move it there. Although I have not studied the libtorrent code in detail, theoretically I see no reason why it can't be implemented here.
Hi @arvidn, do you think it is a good idea to simply add another "parallel" field `decoded_resume_data` of type `bdecode_node` to `add_torrent_params`? (This is similar to the existing `ti`, `url` and `info_hash` fields.) In terms of priority, `session_impl` should consider `decoded_resume_data` first if it is set (since it's already decoded).
something like that, yes. It probably has to be something a bit more sophisticated though. For synchronous `add_torrent()` it's fine, but for `async_add_torrent()` the `add_torrent_params` object has to be copied, and it would be nice to have a `bdecode_node` variant that could own the underlying buffer, perhaps held in a `shared_ptr`. I'm not sure what the most elegant approach is though. A new type that wraps `bdecode_node` + a buffer?
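A minimal sketch of such a wrapper, assuming the `bdecode()` overload that takes a character range and an `error_code` (the `owned_bdecode` name and its interface are hypothetical, not libtorrent API):

```cpp
#include <memory>
#include <utility>
#include <vector>
#include <libtorrent/bdecode.hpp>
#include <libtorrent/error_code.hpp>

// hypothetical wrapper that keeps the raw buffer and the parse state
// pointing into it alive together, so the pair can be copied or moved
// safely (e.g. inside add_torrent_params for async_add_torrent())
struct owned_bdecode
{
    // takes ownership of the buffer and parses it in place
    static owned_bdecode parse(std::vector<char> buf, libtorrent::error_code& ec)
    {
        owned_bdecode r;
        r.m_buf = std::make_shared<std::vector<char>>(std::move(buf));
        libtorrent::bdecode(r.m_buf->data()
            , r.m_buf->data() + r.m_buf->size(), r.m_root, ec);
        return r;
    }

    libtorrent::bdecode_node const& root() const { return m_root; }

private:
    // shared ownership of the buffer guarantees the root node's
    // internal pointers stay valid as long as any copy is alive
    std::shared_ptr<std::vector<char> const> m_buf;
    libtorrent::bdecode_node m_root;
};
```

Copies would share the buffer, though the root node's token table is still duplicated on copy, so in practice moving the wrapper (or holding it in a `shared_ptr`, as suggested above) may be preferable.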
Since we store some additional values in the fastresume data, we then decode that data to extract those values. If libtorrent accepted already-decoded data, it would help avoid decoding it twice.