Open mickael-menu opened 3 years ago
Thorium seems to have a heuristic to fall back on if the LCP hashed passphrase is not properly encoded.
@danielweck Did you encounter some issues in the wild?
"demoreader" @ cantookstation adds a base64 layer to the lcp_hashed_passphrase hex encoding.
Okay that's expected according to the spec:
Note about the computation of the base64-encoded value of the hashed passphrase: from the hashed value of the passphrase, expressed as an hex-encoded string, calculate a byte array (32-bytes / 256-bits binary buffer); for instance, “4981AA…” becomes [49, 81, 170, …]. The expected value is the Base64 encoding of this byte array. Note that a base64 conversion is usually implicitly applied to byte arrays when converted to json structures.
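A minimal Node.js sketch of the conversion described in the note, using the truncated "4981AA" prefix from the spec's example (a real hash has 64 hex digits). The variable names are illustrative only. Note that the spec's sample array writes the first two bytes with their hex digits; printed in decimal, the three bytes are 73, 129 and 170.

```javascript
// Hex prefix taken from the spec's example; a real SHA-256 hash has 64 hex digits.
const hexPrefix = "4981AA";

// Each pair of hex digits becomes one byte of the binary buffer.
const prefixBytes = Buffer.from(hexPrefix, "hex");

console.log(Array.from(prefixBytes));                        // [ 73, 129, 170 ]
console.log(Array.from(prefixBytes, b => b.toString(16)));   // [ '49', '81', 'aa' ]

// The Base64 layer is applied to those bytes, not to the text "4981AA".
const prefixB64 = prefixBytes.toString("base64");
console.log(prefixB64);                                      // SYGq
```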
Ah yes, I have to update the console messages in Thorium :) https://readium.org/lcp-specs/notes/lcp-key-retrieval.html#the-lcp_hashed_passphrase-element
We've had this dual hex / base64+hex string handling code in Thorium for a while, because at the time the test OPDS feeds implemented different syntaxes. From memory, I'm not sure which ones used the correct base64+hex method.
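For illustration, dual handling of this kind could look roughly like the Node.js sketch below. This is not Thorium's actual code: the function name `normalizeHashedPassphrase` and the order of the checks are assumptions made for this example. It accepts the plain hex digest, the spec-compliant Base64 of the 32 raw bytes, and the mistaken Base64 of the hex string, and returns lowercase hex or null.

```javascript
// Hypothetical helper (not Thorium's implementation).
const HEX_SHA256 = /^[0-9a-f]{64}$/i;

function normalizeHashedPassphrase(value) {
  // Case 1: the raw hex digest, with no Base64 layer at all.
  // Checked first, because hex digits are also valid Base64 characters.
  if (HEX_SHA256.test(value)) {
    return value.toLowerCase();
  }

  // Note: Buffer.from(..., "base64") is lenient and skips invalid characters.
  const bytes = Buffer.from(value, "base64");

  // Case 2 (spec-compliant): Base64 of the 32 raw digest bytes.
  if (bytes.length === 32) {
    return bytes.toString("hex");
  }

  // Case 3 (mistake): Base64 of the 64-character hex *string*.
  const text = bytes.toString("latin1");
  if (HEX_SHA256.test(text)) {
    return text.toLowerCase();
  }

  return null; // unrecognized encoding
}

// Usage with the hash and Base64 values from this thread:
const sampleHex = "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759";
console.log(normalizeHashedPassphrase("7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k=") === sampleHex); // true
console.log(normalizeHashedPassphrase(sampleHex) === sampleHex); // true
```

A stricter variant would reject input that is not canonical Base64 instead of relying on the lenient decoder.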
const b64Str = Buffer.from(hexStr, "hex").toString("base64");
const hexStr = Buffer.from(b64Str, "base64").toString("hex");
(execute on the command line with node filename.js)
const crypto = require("crypto");
const pass = "LCP passphrase この世界の謎 \uD83D\uDE00"; // 0x1F600
const checkSum = crypto.createHash("sha256");
checkSum.update(pass);
// "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
const hexStr = checkSum.digest("hex");
// ENCODE:
// "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="
const b64Str = Buffer.from(hexStr, "hex").toString("base64");
console.log(`--------- ENCODE:
"${pass}"
=> "${hexStr}"
=> "${b64Str}"`);
// DECODE:
// "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
const hexStr_ = Buffer.from(b64Str, "base64").toString("hex");
console.log(`--------- DECODE:
"${b64Str}"
=> "${hexStr_}"`);
const b64Str = self.btoa((new Uint8Array(hexStr.match(/.{1,2}/g).map(b => parseInt(b, 16)))).reduce((s, b) => s + String.fromCharCode(b), ""));
const hexStr = Array.from(self.atob(b64Str)).map(c => c.charCodeAt(0).toString(16).padStart(2, '0')).join('');
(copy+paste into Web Inspector to execute, see console log)
(async () => {
const pass = "LCP passphrase この世界の謎 \uD83D\uDE00"; // 0x1F600
const checkSumArrayBuffer = await crypto.subtle.digest('SHA-256', (new TextEncoder("utf-8")).encode(pass));
// "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
const hexStr = Array.from(new Uint8Array(checkSumArrayBuffer)).map(b => b.toString(16).padStart(2, '0')).join('');
// ENCODE:
// "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="
// Method 1 (from hex string)
const b64Str1 = self.btoa((new Uint8Array(hexStr.match(/.{1,2}/g).map(b => parseInt(b, 16)))).reduce((s, b) => s + String.fromCharCode(b), ""));
// Method 2 (from hex buffer, and simplified fromCharCode apply)
const b64Str2 = self.btoa(String.fromCharCode.apply(null, new Uint8Array(checkSumArrayBuffer)));
console.log(`--------- ENCODE:
"${pass}"
=> "${hexStr}"
=> "${b64Str1}"
=> "${b64Str2}"`);
// DECODE:
// "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
const hexStr_ = Array.from(self.atob(b64Str1)).map(c => c.charCodeAt(0).toString(16).padStart(2, '0')).join('');
console.log(`--------- DECODE:
"${b64Str1}"
=> "${hexStr_}"`);
})();
Thanks Daniel! These JavaScript code samples could be really useful to build a static HTML page to help validate an implementation with a dynamic form to decode/encode an input.
I think that the key takeaway is that the base64 encoding layer applies to the byte buffer that the hex string represents, not to the characters of the hex string itself (a mistake easily made). So once this is clear, it really just boils down to using a suitable standard library or third-party APIs to convert the string / byte sequence. There are a few different ways of doing this on the Open Web Platform, if you know a simpler method please chime in :) (my current proposal is quite convoluted)
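To make the difference concrete, here is a small Node.js sketch that contrasts the two encodings, using the hex and Base64 values from the examples in this thread. The variable names `correctB64` and `mistakenB64` are illustrative only.

```javascript
// SHA-256 hash from the examples in this thread, as a hex string.
const hashHex = "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759";

// Correct: decode the hex string to its 32 bytes, then Base64-encode the bytes.
const correctB64 = Buffer.from(hashHex, "hex").toString("base64");

// Mistake: Base64-encode the 64 ASCII characters of the hex string itself.
const mistakenB64 = Buffer.from(hashHex, "utf8").toString("base64");

console.log(correctB64);         // 7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k=
console.log(correctB64.length);  // 44 (32 bytes)
console.log(mistakenB64.length); // 88 (64 bytes)
```

The length is a quick diagnostic: a Base64 value of 88 characters suggests that the hex string, rather than the byte buffer, was encoded.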
I need to update my NodeJS and WebJS examples with unicode characters and surrogate pairs (in the original non-hashed LCP passphrase), just to make sure I am converting the byte sequence correctly.
FYI, I updated both the NodeJS and WebJS examples with unicode (i.e. Japanese characters and a smile emoji 0x1F600 represented by its surrogate pair):
"LCP passphrase この世界の謎 \uD83D\uDE00"
Here is an alternative Kotlin example, using java.util.Base64 instead of android.util.Base64:
import java.util.Base64
val hexStr = Base64.getDecoder().decode(base64Str)
.map { String.format("%02x", it) }
.joinToString(separator = "")
Kotlin REPL:
import java.util.Base64
fun main() {
    val hexStr = Base64.getDecoder().decode("7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k=")
        .map { String.format("%02x", it) }
        .joinToString(separator = "")
    println(hexStr) // "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
}
Swift REPL(it):
https://replit.com/@danielweck/LCPPassBase64Hex
import Foundation
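// Data(base64Encoded:) returns an optional, so hexStr is a String?
let hexStr = Data(base64Encoded: "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k=")
    .map { [UInt8]($0) }?
    .map { String(format: "%02x", $0) }
    .joined()
// Unwrap before printing, otherwise the output is wrapped in Optional(...)
print(hexStr ?? "") // "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"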
PHP REPL(it):
https://replit.com/@danielweck/LCPPassBase64HexPHP
<?php
// hash(..., true) returns the raw 32-byte digest, so the result is Base64 of the bytes
$b64Str = base64_encode(hash('sha256', "LCP passphrase この世界の謎 😀", true));
echo $b64Str; // "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="
A somewhat related issue: https://github.com/readium/lcp-specs/issues/52
I'm opening this issue to gather code samples showing how to encode or decode the lcp_hashed_passphrase key as described in Readium LCP Automatic Key Retrieval.
Encoding
PHP
Courtesy of the Internet Archive.
Decoding
Kotlin
Vanilla JVM
Android
Swift