Open BenWiederhake opened 3 months ago
Great find! Is this a GNU issue or just for our implementation?
I'm not entirely sure what you mean? The GNU behavior seems self-consistent, we differ from GNU behavior, and aren't self-consistent. So I'd say that this is a bug in uutils.
The Base64 alphabet has, as the name suggests, 64 letters. 22 of these letters look like hexadecimal digits (`0-9`, `a-f`, `A-F`). That means that a random string of 8 Base64 letters (which encodes 6 bytes = 48 bits) has a chance of (22/64)^8 ~= 2^-12.3 to be a valid hexadecimal string. This means that a hash with an output length of 24 bits or a multiple thereof (e.g. SHA384 or Blake2b-48) might produce a digest whose hex encoding and Base64 encoding both look like hexadecimal strings (of different lengths). This can cause all kinds of shenanigans with `cksum`, which has to detect/guess the encoding from the sums-file.

In particular, here is a case where it goes wrong:
There are probably more bugs like this.
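To make the ambiguity concrete, here is a minimal Python sketch. It is not the failing case from this report; the token `deadbeef` is a hypothetical value chosen only because it is valid under both encodings, and nothing here reflects how `cksum` actually parses a sums-file.

```python
import base64

# Hypothetical digest text as it might appear in a sums-file.
# Every character is both a hex digit and a Base64 letter.
token = "deadbeef"

# Interpreted as hexadecimal: 8 hex digits -> 4 bytes.
as_hex = bytes.fromhex(token)

# Interpreted as Base64: 8 letters, no padding needed -> 6 bytes.
as_b64 = base64.b64decode(token, validate=True)

print(len(as_hex), as_hex.hex())
print(len(as_b64), as_b64.hex())
```

Both decodes succeed, but they yield digests of different lengths and different contents, so a reader that only looks at the characters cannot tell which encoding was meant.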
Note that this is not specific to blake2b: with SHA384, it would probably require around 2^99 attempts to find a file that hashes to a digest that triggers this bug. For reference, the Bitcoin mining community computes about 2^60 hashes per second according to some sketchy website, which is good enough for this thought experiment. So it would require about 17734 years to find that file. Okay, never mind, this bug doesn't realistically affect SHA384. (But theoretically it does.)
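The arithmetic behind these estimates can be re-derived with a short Python sketch. The helper name `log2_prob_hex_like` is made up for illustration, the 2^60 hashes per second rate is simply the rough figure quoted above, and the model assumes every Base64 letter of the digest is uniformly random.

```python
import math
import string

# Standard Base64 alphabet (RFC 4648), without the '=' padding character.
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Base64 letters that are also valid hexadecimal digits.
hex_like = [c for c in B64_ALPHABET if c in string.hexdigits]
p_char = len(hex_like) / len(B64_ALPHABET)


def log2_prob_hex_like(digest_bits: int) -> float:
    """log2 of the chance that a random digest's Base64 text is all hex digits.

    Only meaningful when digest_bits is a multiple of 24, so that the
    Base64 encoding carries no '=' padding.
    """
    if digest_bits % 24 != 0:
        raise ValueError("digest length must be a multiple of 24 bits")
    n_chars = digest_bits // 6  # each Base64 letter carries 6 bits
    return n_chars * math.log2(p_char)


print(len(hex_like))            # number of hex-looking Base64 letters
print(log2_prob_hex_like(48))   # 48-bit digest, 8 Base64 letters
print(log2_prob_hex_like(384))  # SHA384, 64 Base64 letters

# Rough search time for SHA384, using the figures quoted in the post.
ATTEMPTS = 2**99
HASHES_PER_SECOND = 2**60  # assumption taken from the post, not measured
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years = ATTEMPTS / HASHES_PER_SECOND / SECONDS_PER_YEAR
print(round(years))
```

With these inputs the result lands in the same ballpark as the estimate above, on the order of 10^4 years.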
Found while reading #6500 (probably unrelated though).
CC @sylvestre, because you seem to be interested in this kind of bug.