squidneypoitier closed this issue 7 years ago
Thanks for the report, but this is totally a libtorrent "issue". qBittorrent doesn't do the decoding. In fact, all the BitTorrent stuff is handled by libtorrent. We just query it for info and set the appropriate settings for it to work. And as people already noted in the libtorrent issue you opened, non-UTF-8 data isn't actually compliant.
@sledgehammer999 Actually, I think this is really more of a qBittorrent issue than a libtorrent issue. As I mentioned in the thread over there, I think that libtorrent should expose a mechanism for detecting these sorts of errors and a mechanism for getting at the raw bytestring, but beyond that it would likely be a bad idea for something so low-level to try to get "smart" about encodings.
Regarding the compliance issue - they are not compliant with the current standard, but legacy torrents still exist among long-lived torrents. As I mentioned, Transmission handles these torrents just fine, likely because it uses some heuristic method to detect incorrect encodings.
My suggested solution is something in between what happens now and what Transmission is doing. I think that qBittorrent should detect when a torrent is improperly encoded, then in the "add torrent" dialog, it should show an "encoding warning" - possibly as a pop-up with a list of strings that failed to properly decode. Then the user can be presented with a list of alternate encodings (possibly ordered by whatever heuristic Transmission is using, like putting iso-8859-1 high on the list, possibly filtered by ones that decode the strings without issue). This will simultaneously solve the encoding problem and put users on notice that the torrents they are seeding / downloading are non-compliant (and prompt them to complain about it to whoever created the torrent, hopefully).
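The "filtered by ones that decode the strings without issue" step could look roughly like the Python sketch below. The candidate list and its ordering are assumptions made for illustration, not Transmission's actual heuristic, and the function names are hypothetical.

```python
# Hypothetical candidate list; the order is an assumption for illustration,
# not the heuristic Transmission uses.
CANDIDATE_ENCODINGS = ["iso8859_1", "iso8859_2", "cp1250", "cp1252", "shift_jis"]


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def encodings_that_decode(raw_strings, candidates=CANDIDATE_ENCODINGS):
    """Return the candidates that decode every given byte string without error.

    The input order of `candidates` is preserved, so a heuristic ranking
    applied beforehand survives the filtering.
    """
    usable = []
    for encoding in candidates:
        try:
            for raw in raw_strings:
                raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # this candidate cannot represent at least one string
        usable.append(encoding)
    return usable
```

Note that single-byte encodings such as iso8859_1 accept any byte sequence, so passing this filter only means a candidate is possible, not that it is correct. That is why the user would still need to pick from the list.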
In terms of implementation, this can already be done a bit hackily without any changes in libtorrent by doing what I've done above - load the file paths from libtorrent's interface, then load them again from the bdecode of the .torrent file and compare the two to make sure that they match. Ideally, libtorrent will expose some interface for detecting decoding errors and it would be unnecessary to do the hack.
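A simplified stand-in for that check is sketched below in Python. It does not call libtorrent at all: instead of comparing against the paths libtorrent reports, it bdecodes the .torrent bytes with a small hand-written decoder and flags any name or path component that is not valid UTF-8. Both `bdecode` and `undecodable_paths` are hypothetical helpers written for this sketch, and only the v1 `info` layout (`name`, `files`, `path`) is assumed.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at `pos`; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":  # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":  # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":  # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    if lead.isdigit():  # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencode at offset {pos}")


def undecodable_paths(torrent_bytes: bytes):
    """Return the raw name/path components that are not valid UTF-8."""
    meta, _ = bdecode(torrent_bytes)
    info = meta[b"info"]
    raw_strings = [info[b"name"]]
    # Single-file torrents have no "files" list, hence the default.
    for entry in info.get(b"files", []):
        raw_strings.extend(entry[b"path"])
    bad = []
    for raw in raw_strings:
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(raw)
    return bad
```

A non-empty result is the set of strings that an "encoding warning" dialog would need to show. The raw bytes are kept as they are so that alternate encodings can be tried on them afterwards.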
Versions
qBittorrent version and Operating System: 3.3.10, Arch Linux
libtorrent: 1.1.1.0
Qt: 5.7.1
What is the problem:
Cross-posting this from arvidn/libtorrent#1780, because this project is affected and likely something needs to be done here even if changes are made upstream.
Quoting from that thread for convenience:
Transmission properly decodes iso8859_2 strings, but qBittorrent fails on this score.
What is the expected behavior:
Copied from the libtorrent issue, here is an MWE using a torrent with filenames encoded in iso8859_2:
Here is the result; the proper behavior is on the left, qBittorrent's behavior is on the right:
Steps to reproduce:
The following Python script (Python 3) will create a minimally-working .torrent file that exhibits the improper behavior (also copied from the libtorrent thread):
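A minimal sketch of such a script is given here; it is not the original from the libtorrent thread. It builds a multi-file v1 .torrent in memory whose path components are encoded in iso8859_2 instead of UTF-8. The helper names, the `test_torrent` directory name and the placeholder tracker URL are all assumptions made for illustration.

```python
import hashlib


def bencode(value) -> bytes:
    """Bencode ints, bytes, str (as UTF-8), lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys must be byte strings, emitted in sorted order.
        items = sorted(
            ((k if isinstance(k, bytes) else k.encode("utf-8"), v)
             for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_legacy_torrent(file_contents, encoding="iso8859_2", piece_length=16384):
    """Build .torrent bytes whose file names use `encoding` rather than UTF-8.

    `file_contents` maps a file name (str) to that file's contents (bytes).
    """
    data = b"".join(file_contents.values())
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {
        "name": "test_torrent",  # hypothetical directory name
        "piece length": piece_length,
        "pieces": pieces,
        "files": [
            # Path components are passed as bytes, so they bypass UTF-8 encoding.
            {"length": len(body), "path": [name.encode(encoding)]}
            for name, body in file_contents.items()
        ],
    }
    # Placeholder tracker URL on the reserved .invalid TLD.
    return bencode({"announce": "http://tracker.invalid/announce", "info": info})
```

Writing the bytes returned by, for example, `make_legacy_torrent({"Żółć.txt": b"hello"})` to a file opened in `"wb"` mode gives a .torrent that can be loaded into a client to compare how the file name is displayed.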
Extra info(if any):
This issue may be related to #4479.