rgrinberg / bencode

Bencode (.torrent file format) reader/writer in OCaml

Extract a subsection of the original bencoded input #15

Open kit-ty-kate opened 2 months ago

kit-ty-kate commented 2 months ago

The BitTorrent Protocol Specification specifies an info_hash value defined as:

The 20 byte sha1 hash of the bencoded form of the info value from the metainfo file. This value will almost certainly have to be escaped.

Note that this is a substring of the metainfo file. The info-hash must be the hash of the encoded form as found in the .torrent file, which is identical to bdecoding the metainfo file, extracting the info dictionary and encoding it if and only if the bdecoder fully validated the input (e.g. key ordering, absence of leading zeros). Conversely that means clients must either reject invalid metainfo files or extract the substring directly. They must not perform a decode-encode roundtrip on invalid data.

The bencode library doesn't currently seem to offer a way to keep the original bencoded substring corresponding to a part of the structure, so for now I'm simply re-encoding the decoded data.
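To make "extract the substring directly" concrete, here is a minimal sketch that uses only the OCaml standard library. It is not part of the bencode library's API: `read_string`, `skip_value`, `skip_items` and `raw_entry` are hypothetical names chosen for illustration. It assumes the whole metainfo file is already in memory as a string and does almost no validation; it only walks the structure far enough to find where the value of a top-level key starts and stops.

```ocaml
(* Hypothetical helpers, NOT part of the bencode library. *)

(* Reads the byte string "<len>:<bytes>" starting at index [i].
   Returns its contents and the index just past it. *)
let read_string s i =
  let colon = String.index_from s i ':' in
  let len = int_of_string (String.sub s i (colon - i)) in
  (String.sub s (colon + 1) len, colon + 1 + len)

(* Returns the index just past the bencoded value starting at [i]. *)
let rec skip_value s i =
  match s.[i] with
  | 'i' -> String.index_from s i 'e' + 1      (* integer: i<digits>e *)
  | 'l' | 'd' -> skip_items s (i + 1)         (* list or dictionary *)
  | '0' .. '9' -> snd (read_string s i)       (* byte string *)
  | _ -> failwith "malformed bencode"

(* Skips values until the closing 'e' of a list or dictionary.
   A dictionary is treated as a flat sequence of keys and values. *)
and skip_items s i =
  if s.[i] = 'e' then i + 1 else skip_items s (skip_value s i)

(* Returns the raw, untouched bytes of the value stored under [key]
   in the top-level dictionary of [s], or None if the key is absent. *)
let raw_entry key s =
  if String.length s = 0 || s.[0] <> 'd' then None
  else
    let rec scan i =
      if s.[i] = 'e' then None
      else
        let k, value_start = read_string s i in
        let value_stop = skip_value s value_start in
        if String.equal k key then
          Some (String.sub s value_start (value_stop - value_start))
        else scan value_stop
    in
    scan 1

(* The inner keys "b", "a" are deliberately out of order: the original
   bytes come back unchanged instead of being re-sorted by an encoder. *)
let () =
  assert (raw_entry "info" "d4:infod1:b1:x1:a1:ye1:a0:e"
          = Some "d1:b1:x1:a1:ye")
```

The returned string would then be passed as-is to whatever SHA-1 implementation is in use. On truncated or malformed input this sketch raises an exception (`Not_found`, `Invalid_argument` or `Failure`) rather than returning an error value.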

So given the above note, I'm wondering:

kit-ty-kate commented 2 months ago

After a short chat with @c-cube, I'm realising that the spec's mention of "fully-validated input" made me stray from what I really want to know:

Since it's not, my question then becomes only: