Open machawk1 opened 10 years ago
https://github.com/agnoster/base32-js ?
(Imho the base32 choice is highly regrettable. Save 8 bytes on each warc record at the expense of interoperability with everybody else in the world. But I guess we're stuck now.)
Thanks, @nlevitt. Would you happen to have a reference WARC with uncompressed HTML (e.g., explicitly viewable in the WARC) to verify correctness between this library and what Htrix produces? Step 0 for WARCreate is interoperability. What is the alternative/ideal hash algorithm to use, iyho?
Have not yet found a way to do this consistently in JavaScript. The same data from Htrix WARCs returns hex-like values from UNIX shasum, but the Htrix hashes contain characters outside the hex range (e.g., "M"). The WARC spec says to use a 32 bit hash but I don't know how to do this.
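For anyone landing here later, a rough sketch of what is probably going on: digests of the form `sha1:...` seen in WARCs appear to be the raw 20-byte SHA-1 digest encoded with the RFC 4648 Base32 alphabet (A-Z, 2-7) instead of hex, which would explain characters like "M". The snippet below assumes Node.js and its built-in `crypto` module; `base32Encode` and `payloadDigest` are hypothetical helper names written for illustration, not part of WARCreate or base32-js. Note that base32-js may default to a different (Crockford-style) alphabet, so its output should be checked against a known value before relying on it.

```javascript
const crypto = require('crypto');

// RFC 4648 Base32 alphabet (assumed to be what the WARC digests use).
const ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567';

// Encode a byte sequence as unpadded RFC 4648 Base32.
// A 20-byte SHA-1 digest is 160 bits, i.e. exactly 32 characters, so no padding is needed.
function base32Encode(bytes) {
  let bits = 0;   // number of bits currently buffered in `value`
  let value = 0;  // bit buffer
  let out = '';
  for (const byte of bytes) {
    value = (value << 8) | byte;
    bits += 8;
    while (bits >= 5) {
      out += ALPHABET[(value >>> (bits - 5)) & 31];
      bits -= 5;
      value &= (1 << bits) - 1; // drop the bits already emitted
    }
  }
  if (bits > 0) {
    // Left-align any leftover bits into a final 5-bit group.
    out += ALPHABET[(value << (5 - bits)) & 31];
  }
  return out;
}

// Hypothetical helper: build a "sha1:<base32>" labelled digest for a payload.
function payloadDigest(payload) {
  const sha1 = crypto.createHash('sha1').update(payload).digest(); // raw bytes, not hex
  return 'sha1:' + base32Encode(sha1);
}

console.log(payloadDigest('')); // sha1:3I42H3S6NNFQ2MSVX7XZKYAYSCX5QBYJ
```

The empty-payload value above is the Base32 form of the familiar hex digest `da39a3ee5e6b4b0d3255bfef95601890afd80709`, so `shasum` and this sketch agree on the underlying hash and differ only in encoding. In a browser extension, the same bytes could presumably be obtained from the Web Crypto API instead of Node's `crypto`.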