Open kirbyzhou opened 6 years ago
+1
This is really more of a bug in BitComet...
@ymte yes, BitComet hacks the protocol and generates those padding files. It sucks, but I still hope Transmission can ignore them.
+1
I hope a global ignore file can be added, similar to .gitignore, that the user can configure.
very good idea!
Just ignoring them will not give all the benefits a padding file has (which is to allow deduplication over different torrents / torrent versions with some identical files without having to actually transfer the padding over the wire), and would require user intervention that I'd expect rarely to be accurate in practice.
There are three aspects to it that could largely be addressed individually:
^\.?____padding.*$ regular expression)
Any updates on how this is going?
@chrysn sounds like we need something like a virtual file layer, and intercept all read/write/verify operations on padding files.
That'd be one way to implement it -- but (without knowing the code) I'd expect that it's only two or three locations in the code that'd need changing, and then a full VFS layer might be extraneous.
@chrysn To make it simple, what if we just "unselect" the matching files when adding the torrent?
Possibly. I'd be afraid that this still creates the files and downloads data (for it is needed to verify the piece), and that they'd show as missing during verification, but as I said I'm only commenting from an outsider's perspective.
Well, padding files are always filled with 0 bytes, so I don't think we would need to download them, since we always know what the content will be.
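To make the zero-fill idea concrete, here is a minimal Python sketch of verifying a piece without downloading its padding. It assumes the piece is made of real file data followed by padding up to the end of the piece, and that the padding really is all zero bytes, as claimed above. The function name is hypothetical and this is not how Transmission does it.

```python
import hashlib


def piece_hash_with_zero_padding(real_data: bytes, piece_length: int) -> bytes:
    """SHA-1 of a v1 piece whose tail is padding, with the zeros made up locally.

    Assumption: the piece is `real_data` followed by zero-filled padding
    that runs to the end of the piece.
    """
    if len(real_data) > piece_length:
        raise ValueError("real data is longer than one piece")
    # bytes(n) yields n zero bytes, so nothing has to come over the wire.
    padding = bytes(piece_length - len(real_data))
    return hashlib.sha1(real_data + padding).digest()
```

The result would be compared against the matching 20-byte entry in the torrent's "pieces" string; only the real file data would need to be requested from peers.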
This should be possible "in a clean way" if support for BEP 47 gets added and treats these __padding__ files as a special non-standard case.
Part of the issue is that BitComet's "padding" files don't actually comply with BEP 47, right? Any BEP 47-compliant client will send you all the null bytes you want if you are trying to download .pad files with a client like Transmission. None of those peers (libtorrent-based ones, for example) actually have the .pad files stored; all of them know what the rest of the piece that contains them should be, etc.
I assume the same is probably true for BitComet's __padding files, but without the users of several of the most common clients "at your disposal" should you try to request the rest of that piece with Transmission.
I have no idea how you'd be able to implement something to handle BitComet's version without actually implementing support for their "flavor" of padding entirely/separately.
Out of curiosity, I searched for __padding on my computer and actually found some kicking around from a few years ago. I also still had the torrent.
{
  "info" : {
    "files" : [
      {
        "ed2k" : "removed",
        "filehash" : "removed",
        "length" : 123,
        "path" : [
          "%b0%e5%9d%80.txt"
        ],
        "path.utf-8" : [
          "地址.txt"
        ]
      },
      {
        "length" : 262011,
        "path" : [
          "_____padding_file_0_%e5%a6%82%e6%9e%9c%e6%82%a8%e7%9c%8b%e5%88%b0%e6%ad%a4%e6%96%87%e4%bb%b6%ef%bc%8c%e8%af%b7%e5%8d%87%e7%ba%a7%e5%88%b0BitComet(%e6%af%94%e7%89%b9%e5%bd%97%e6%98%9f)0.85%e6%88%96%e4%bb%a5%e4%b8%8a%e7%89%88%e6%9c%ac____"
        ],
        "path.utf-8" : [
          "_____padding_file_0_如果您看到此文件,请升级到BitComet(比特彗星)0.85或以上版本____"
        ]
      },
Translated: _____padding_file_0_If you see this file, please upgrade to BitComet 0.85 or above____
A BEP 47 torrent, like what you'd get from a client based on libtorrent or something, has padding files that look just like the example in BEP 47 itself:
      {
        "attr" : "p",
        "length" : 123456,
        "path" : [
          ".pad",
          "0"
        ]
      },
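For comparison, here is a rough Python sketch of how a client could recognize both flavors when a torrent is added. It assumes the file list has already been decoded into dicts with string keys and values, like the dumps above. The helper names and the `_+padding_file_` pattern are my own assumptions: the sample BitComet name starts with five underscores, which the four-underscore regex quoted earlier in the thread would not match as written, so the sketch accepts any run of underscores.

```python
import re

# Assumed pattern for BitComet-style names; not taken from any client's source.
BITCOMET_PAD = re.compile(r"^\.?_+padding_file_")


def is_padding_entry(entry: dict) -> bool:
    """True for a BEP 47 pad file (attr contains 'p') or a BitComet-style one."""
    if "p" in entry.get("attr", ""):
        return True
    path = entry.get("path.utf-8") or entry.get("path") or []
    # Only the last path component (the file name) is checked.
    return bool(path) and BITCOMET_PAD.match(path[-1]) is not None


def wanted_indices(files: list) -> list:
    """Indices of the files a user would actually want selected."""
    return [i for i, entry in enumerate(files) if not is_padding_entry(entry)]
```

Unselecting alone would not stop the padding bytes from being needed to verify a piece shared with a real file, so this only covers the detection part.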
Improper or uninformed use of padding goes the other way, too. That is, with BEP 47 itself (and aligning every file to a piece boundary, which isn't explicitly required!), not just BitComet's BS:
Scenario 1: Making a torrent of something like a music album with 10 individual FLAC files of varying sizes, totaling ~500 MiB? Padding isn't going to be disastrous if you enable it, and it will let "someone" grab single files 3 years from now when only partial seeds are left. Partial seeds using a client that supports BEP 47 will never have to store anything "extra" and can even verify against their "incomplete" pieces.
Assuming you used 256 KiB or 512 KiB pieces like you ought to in order to land at 1,000-2,000 pieces, there's going to be at most 2.5 or 5 MiB of padding respectively, and in practice it will be less, closer to 1.25 or 2.5 MiB total. Definitely an inconvenience for clients that don't (yet?) support it, like Transmission, but only a minor one.
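The arithmetic behind those numbers can be sketched in a few lines of Python. This assumes every file is followed by a pad file that runs to the next piece boundary; the function names are made up for illustration, and the "typical" figure assumes file sizes land anywhere within a piece with equal likelihood.

```python
KIB = 1024
MIB = 1024 * KIB


def pad_after(file_size: int, piece_length: int) -> int:
    """Bytes of padding needed so the next file starts on a piece boundary."""
    return -file_size % piece_length


def upper_bound(n_files: int, piece_length: int) -> int:
    """Loose ceiling: each file needs less than one full piece of padding."""
    return n_files * piece_length


def typical(n_files: int, piece_length: int) -> int:
    """Rough expectation: about half a piece per file on average."""
    return n_files * piece_length // 2


for piece in (256 * KIB, 512 * KIB):
    print(piece // KIB, "KiB pieces:",
          upper_bound(10, piece) / MIB, "MiB at most,",
          typical(10, piece) / MIB, "MiB typical")
```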
Scenario 2: Making a torrent of 10,000 files of sizes varying all the way from 1 byte to 100 MiB, totaling 1 GiB? Maybe it's something like a git repo clone (9,999 tiny files) + a release build (the single 100 MiB file).
Well, it's a 1 GiB torrent, so I'll just use 512 KiB pieces. 1,000-2,000 pieces, perfect!
Got padding enabled? Congratulations, now there are potentially 2.5 GiB of ".pad files" in your 1 GiB torrent, making things miserable for anyone using a client that doesn't support BEP 47 and all the peers stuck sending them the null bytes they're begging for...
"The presence of padding files does not imply that all files are piece-aligned."
That's straight out of the spec itself, emphasis mine.
Note, "potentially" above - having e.g. qBittorrent align to piece boundary for all files (larger than 0 KiB) results in each individual (padded) piece representing only one file. I just tried it on a 900 MiB folder with 52 files ranging from ~10-30 MiB with and without padding, with 32 MiB pieces (don't do this). 51 padding files...torrent appears to be 1.605 GiB worth of data when including the padding. For the actual "Scenario 2" described, that insane 2.5 GiB number could be minimized by setting a more appropriate cut-off for the size of files that need to be padded - like the 512 KiB piece size itself as a bare minimum.
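To show what a cut-off does, here is a small Python model. It is a simplification under my own assumptions, not qBittorrent's or libtorrent's actual algorithm: a pad file is inserted in front of any file at or above the cut-off that would otherwise start mid-piece, and smaller files are simply packed back to back. The file sizes in the usage lines are invented.

```python
KIB = 1024
MIB = 1024 * KIB


def padding_with_cutoff(file_sizes, piece_length, min_aligned_size=0):
    """Total pad bytes when only files >= min_aligned_size get piece-aligned."""
    offset = 0
    total_pad = 0
    for size in file_sizes:
        misalignment = offset % piece_length
        if size >= min_aligned_size and misalignment:
            pad = piece_length - misalignment
            total_pad += pad
            offset += pad
        offset += size
    return total_pad


# 100 one-byte files followed by a single 100 MiB file, 512 KiB pieces.
sizes = [1] * 100 + [100 * MIB]
print(padding_with_cutoff(sizes, 512 * KIB))             # align everything
print(padding_with_cutoff(sizes, 512 * KIB, 512 * KIB))  # align only files >= one piece
```

With the cut-off set to the piece size, only the large file gets aligned, so the tiny files stop dragging a nearly full piece of padding behind each of them.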
There are really only a few use cases where BEP-47 makes sense for a v1 torrent in the first place, like season packs of a TV show or something. 10-20 files, never the potential for >1% overhead vs. the rest of the data, lets someone who was only missing an episode or two start (cross-)seeding immediately, etc.
Anywhere else for v1, or where it's required (v2/hybrid)...you have to hope whoever is making the torrent is actually thinking about what they're doing.
BitComet inserts dummy padding files with zero-filled content into its torrent files. These files are useless, and their filenames start with "_____padding_file_".
Smart BT clients such as Thunder/XunLei ignore these files while downloading.
For example: