bittorrent / bittorrent.org


Does a bencode dictionary allow duplicate keys? #153

Open trim21 opened 4 months ago

trim21 commented 4 months ago

Dictionaries are encoded as a 'd' followed by a list of alternating keys and their corresponding values followed by an 'e'. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'} and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}. Keys must be strings and appear in sorted order (sorted as raw strings, not alphanumerics).

For example: d3:keyi1e3:keyi2ee, which is what {"key": 1, b"key": 2} (Python) would encode to.

trim21 commented 2 months ago

Apparently a dictionary should not have duplicate keys; otherwise sorting no longer yields a single encoding for the same content.

For example, {"key": 1, b"key": 2} can be encoded as either d3:keyi1e3:keyi2ee or d3:keyi2e3:keyi1ee.
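The ambiguity can be reproduced with a minimal encoder sketch. This is not taken from any real library: `bencode` is a hypothetical helper, and it assumes `str` keys are UTF-8 encoded and that no duplicate check is performed.

```python
def bencode(value):
    """Naive bencode encoder (hypothetical sketch, no duplicate-key check)."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        # Assumption: text strings are encoded as UTF-8 bytes.
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Normalise every key to bytes, then sort as raw byte strings.
        items = [
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        ]
        # list.sort is stable, so equal keys keep their insertion order.
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("unsupported type: %r" % type(value))


# "key" and b"key" are distinct dict keys in Python 3,
# but both become the byte string 3:key once encoded.
first = bencode({"key": 1, b"key": 2})
second = bencode({b"key": 2, "key": 1})
print(first)
print(second)
```

Both dictionaries compare equal in Python, yet the two outputs differ, because sorting cannot order two identical keys.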

the8472 commented 2 months ago

bencode itself has no concept of encoding, that's an artifact of the language you're using.

BEP 52 clarifies this.

trim21 commented 2 months ago

> bencode itself has no concept of encoding, that's an artifact of the language you're using.
>
> BEP 52 clarifies this.

That's why I'm asking: it doesn't clarify whether d3:keyi1e3:keyi2ee is valid bencode content.

trim21 commented 2 months ago

> bencode itself has no concept of encoding, that's an artifact of the language you're using.
>
> BEP 52 clarifies this.

Encoding is a concept of the programming language, but the bencode content itself is not.

the8472 commented 2 months ago

> {"key": 1, b"key": 2}

What I mean is the type distinction between string and binary is something that exists in the language. If that didn't exist you couldn't have duplicates there either.

But yes, the word "unique" could be inserted somewhere.

trim21 commented 2 months ago

> {"key": 1, b"key": 2}
>
> What I mean is the type distinction between string and binary is something that exists in the language. If that didn't exist you couldn't have duplicates there either.
>
> But yes, the word "unique" could be inserted somewhere.

The problem here is just like a URL query string: it is also a set of key-value pairs, but it allows duplicate keys. Being a key-value structure is not enough to imply uniqueness.

the8472 commented 2 months ago

The spec was written by python developers (python 2 back then) who I figure understood dictionary to mean unique keys.

And yes, other implementations also require unique keys.
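As a rough illustration of what requiring unique keys can look like in a decoder, here is a minimal sketch. It is not taken from any particular implementation: `decode` and `_parse` are hypothetical names, and the sketch skips other strictness checks (such as leading zeros in integers). Requiring each key to be strictly greater than the previous one enforces sorted order and uniqueness in a single comparison.

```python
def decode(data: bytes):
    """Decode one bencoded value, rejecting unsorted or duplicate dict keys."""
    value, pos = _parse(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _parse(data: bytes, pos: int):
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _parse(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        previous = None
        while data[pos:pos + 1] != b"e":
            if not data[pos:pos + 1].isdigit():
                raise ValueError("dictionary keys must be strings")
            key, pos = _parse(data, pos)
            # Strictly ascending raw byte order implies unique keys.
            if previous is not None and key <= previous:
                reason = "duplicate" if key == previous else "unsorted"
                raise ValueError("%s dictionary key: %r" % (reason, key))
            previous = key
            value, pos = _parse(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        if start + length > len(data):
            raise ValueError("truncated string")
        return data[start:start + length], start + length
    raise ValueError("invalid bencode at offset %d" % pos)


print(decode(b"d3:cow3:moo4:spam4:eggse"))
try:
    decode(b"d3:keyi1e3:keyi2ee")
except ValueError as exc:
    print("rejected:", exc)
```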

trim21 commented 2 months ago

> The spec was written by python developers (python 2 back then) who I figure understood dictionary to mean unique keys.
>
> And yes, other implementations also require unique keys.

Thanks for the clarification. I just sent a PR to document this; I hope it can get merged.

(I don't know Python 2 well, but I think it has a unicode type?)