Open abitrolly opened 3 years ago
Is this about why it wasn't included or is this about what it takes to include it?
Speaking to the former: What's specified as a web standard heavily depends on what browsers agree to implement. Specifically, for a W3C Recommendation, you would need more than two interoperable implementations. When we specified SRI, the blake hash functions were relatively new.
Regarding the latter: It seems at least Gecko and Blink have an implementation of blake2b. If you want this to happen, you'd need to draft a Pull Request to this repository and write an "Explainer document" (a one-pager, you'll find examples on the web) to discuss the details, especially the merits, the risks and the reason why doing this now is necessary.
This is merely the first step. The implementation doesn't drive itself either, but that's on another page...
In the meantime: I suggest people find and document their non-conforming use of hash functions somewhere that makes it easy to find, so that other hash function names are interoperable and non-colliding.
The current specification text in SRI and the grammar defined in CSP require the hash-digest construct to be separated by just one dash, so an underscore (_) is advised.
For the blake family of hash functions I would gently suggest the following prefixes, when used in an SRI-like setting:
blake2s_224- blake2s_256- blake2b_384- blake2b_512- blake3-
(Unless there is some other precedent for putting them into a string that shouldn't contain a dash. If so, please tell me and use that instead.)
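To make the single-dash constraint concrete, here is a minimal Python sketch of an SRI-like string using underscore-style names such as blake2b_512. The prefix table and the helper names (make_integrity, split_integrity, verify) are assumptions for illustration only; none of this is part of the SRI specification. BLAKE3 is left out because Python's hashlib does not ship it.

```python
import base64
import hashlib
import hmac

# Hypothetical prefixes (not in the SRI spec) mapped to a hashlib
# constructor and a digest size in bytes.
BLAKE2_PREFIXES = {
    "blake2s_224": (hashlib.blake2s, 28),
    "blake2s_256": (hashlib.blake2s, 32),
    "blake2b_384": (hashlib.blake2b, 48),
    "blake2b_512": (hashlib.blake2b, 64),
}


def make_integrity(prefix: str, data: bytes) -> str:
    """Build '<prefix>-<base64 digest>' for one of the hypothetical prefixes."""
    constructor, size = BLAKE2_PREFIXES[prefix]
    digest = constructor(data, digest_size=size).digest()
    return f"{prefix}-{base64.b64encode(digest).decode('ascii')}"


def split_integrity(value: str) -> tuple[str, str]:
    """Split at the first dash. Standard base64 never contains '-',
    so this is unambiguous as long as the prefix has no dash either."""
    prefix, _, b64_digest = value.partition("-")
    return prefix, b64_digest


def verify(value: str, data: bytes) -> bool:
    """Recompute the digest for the named prefix and compare."""
    prefix, _ = split_integrity(value)
    if prefix not in BLAKE2_PREFIXES:
        return False
    return hmac.compare_digest(make_integrity(prefix, data), value)
```

A name like blake2b-512 would break split_integrity, which is the reason for preferring the underscore.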
Thanks for the explanation. That's pretty thorough.
Regarding a PR for BLAKE inclusion and following the process, I am quite sidetracked already (https://github.com/yakshaveinc/tasks/issues/76), and would rather read the research from someone on a payroll than do the research myself. It was interesting to discover that such a spec exists while I am trying to fit content addressing into the handling of Python package metadata (https://github.com/pypa/warehouse/issues/8254#issuecomment-903547058). One of the reasons content addressing is effective is agreement on the hashing function, so that you can validate that you already have the needed file without an additional lookup to find out which hash function to use, and without doing lookups under every supported hash function.
By the way, PyPI (the Python Package Index) tends towards the sha256=0d87f879a3df4ad9389ab6d63c69eea078517d41541ddd5744cfcff3396e8543 format for integrity info that is embedded in URLs.
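As a rough illustration of how a tool could consume that kind of URL, here is a small Python sketch that reads a name=hexdigest pair from the URL fragment and checks downloaded bytes against it. The function name check_fragment and the allowlist are assumptions of mine, not PyPI's or pip's actual code.

```python
import hashlib
import hmac
from urllib.parse import urlsplit

# Assumed allowlist for this sketch; a real tool would pick its own policy.
ALLOWED_ALGORITHMS = {"sha256", "sha384", "sha512"}


def check_fragment(url: str, content: bytes) -> bool:
    """Verify content against a '#<name>=<hex digest>' URL fragment."""
    name, separator, expected_hex = urlsplit(url).fragment.partition("=")
    if not separator or name not in ALLOWED_ALGORITHMS:
        return False
    actual_hex = hashlib.new(name, content).hexdigest()
    return hmac.compare_digest(actual_hex, expected_hex.lower())
```

Note that the algorithm name has to be parsed out of the URL before anything can be checked, which is the extra step a shared, self-describing format would standardize.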
Although some projects say that hashes are not content identifiers, I still think they are. Here I would rather see the adoption of https://multiformats.io/multihash/ for the format, because checking integrity and detecting weak checksum algorithms is a task for an automated tool. Maybe there is no need for me to see sha256 or whatever in the URL to compare the checksum manually.
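For anyone unfamiliar with multihash, a minimal Python sketch of the idea: the digest is prefixed with a function code and a length, so a tool can identify the algorithm without a human-readable name. This only handles sha2-256 (code 0x12, length 32), where both values fit in a single varint byte; the helper names are mine, and a real implementation would decode varints and use the full multicodec table.

```python
import hashlib
import hmac

SHA2_256_CODE = 0x12    # multihash function code for sha2-256
SHA2_256_LENGTH = 32    # digest length in bytes


def encode_sha256_multihash(data: bytes) -> bytes:
    """Return <code><length><digest> for sha2-256."""
    digest = hashlib.sha256(data).digest()
    return bytes([SHA2_256_CODE, len(digest)]) + digest


def verify_multihash(multihash: bytes, data: bytes) -> bool:
    """Check data against a sha2-256 multihash; reject anything else."""
    if len(multihash) != 2 + SHA2_256_LENGTH:
        return False
    code, length = multihash[0], multihash[1]
    if code != SHA2_256_CODE or length != SHA2_256_LENGTH:
        return False
    return hmac.compare_digest(multihash[2:], hashlib.sha256(data).digest())
```

The rejection branch is where an automated tool would flag weak or unknown algorithms, instead of a person reading the name in the URL.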
Looks like I can close this issue. Any objections?
@mozfreddyb I would rename this to "include BLAKE in the spec". It would be unfair to skip IETF RFC https://datatracker.ietf.org/doc/rfc7693/ just because nobody has the time or a paid contract to properly follow the process. Maybe I will get back to it in a few months.
For the blake family of hash functions I would gently suggest the following prefixes, when used in an SRI-like setting:
blake2s_224- blake2s_256- blake2b_384- blake2b_512- blake3-
These are pretty readable on their own.
blake3- blake2s128- blake2b384-
I've created a Gitcoin Grant for this BLAKE task, https://gitcoin.co/grants/3451/add-blake2-into-w3c-sri-spec, but there seems to be little interest from anybody in sponsoring this activity. Maybe it is because Ethereum fees are high, or maybe people just don't see the reason when there is IPFS already.
See also #108 that proposes a better format for separating hash name, size and its value, which is relevant here.
https://www.blake2.net/