webtorrent / node-bencode

bencode de/encoder for nodejs
MIT License

It sorts dictionary entries incorrectly. #142

Open. issuefiler opened this issue 1 year ago

issuefiler commented 1 year ago

Bug

https://github.com/webtorrent/node-bencode/blob/2fa2c7ea7d97791a0c7f0cb3dd0bb098c014738f/lib/encode.js#L80-L81

 // fix for issue #13 - sorted dicts 
 const keys = Object.keys(data).sort() 

This does not sort dictionary entries correctly: bencoded dictionary keys must be ordered as raw byte strings, not as JavaScript strings.


When you say “strings” in the context of Bencoding, you mean “binary strings,” or more specifically, “8-bit byte sequences.”

BEP 52 — The BitTorrent protocol specification version 2

Note that, in the context of bencoding, strings, including dictionary keys, are arbitrary byte sequences (uint8_t[]).

And Array.prototype.sort compares strings by 16-bit UTF-16 code units by default, not by bytes.

If compareFn is not supplied, all non-undefined array elements are sorted by converting them to strings and comparing strings in UTF-16 code units order.

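The plain .sort() therefore results in a different order (sorted_in_utf16) than the correct one (sorted_in_utf8). The difference shows up with code points above U+FFFF: UTF-16 stores them as surrogate pairs (0xD800 to 0xDFFF), which sort below U+E000 to U+FFFF, while their UTF-8 bytes sort above. Observe: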

const A = String.fromCodePoint(0xFF61); // UTF-8: EF BD A1, UTF-16: FF61
const B = String.fromCodePoint(0x10002); // UTF-8: F0 90 80 82, UTF-16: D800 DC02
const sorted_in_utf8 = [A, B].sort((a, b) => Buffer.compare(Buffer.from(a), Buffer.from(b))); // [A, B]
const sorted_in_utf16 = [A, B].sort(); // [B, A]
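
One possible direction for a fix, as a minimal sketch rather than a tested patch: encode each key to its UTF-8 bytes once and sort with Buffer.compare. The helper name sortKeysByBytes is hypothetical and not part of node-bencode, and the sketch assumes every key is a JavaScript string that the encoder writes out as UTF-8.

```javascript
// Hypothetical helper (not part of node-bencode): order dictionary keys by
// their UTF-8 byte sequences instead of by UTF-16 code units.
// Assumption: every key is a JavaScript string that will be encoded as UTF-8.
function sortKeysByBytes (keys) {
  return keys
    // Encode each key once so the comparator does not re-encode on every call.
    .map((key) => ({ key, bytes: Buffer.from(key, 'utf8') }))
    // Buffer.compare orders buffers byte by byte.
    .sort((a, b) => Buffer.compare(a.bytes, b.bytes))
    .map((entry) => entry.key)
}

const A = String.fromCodePoint(0xFF61)
const B = String.fromCodePoint(0x10002)
const data = { [B]: 1, [A]: 2 }

const keys = sortKeysByBytes(Object.keys(data))
console.log(keys.map((k) => k.codePointAt(0).toString(16)))
// [ 'ff61', '10002' ]
```

Encoding up front costs one extra Buffer per key, but it keeps the comparator cheap and makes the byte-order intent explicit.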
