readium / lcp-specs

🔐 Releases, drafts and schema for Readium LCP
https://readium.org/lcp-specs/
BSD 3-Clause "New" or "Revised" License

Code samples for the LCP Automatic Key Retrieval #51

Open mickael-menu opened 3 years ago

mickael-menu commented 3 years ago

I'm opening this issue to gather code samples showing how to encode or decode the lcp_hashed_passphrase key as described in Readium LCP Automatic Key Retrieval.

Encoding

PHP

Courtesy of the Internet Archive.

$lcpHashedPassphrase = base64_encode(hash('sha256', $passphrase, true));

Decoding

Kotlin

Vanilla JVM

import java.util.Base64

// Base64 -> raw digest bytes -> lowercase hex string of the hashed passphrase
val passphrase = Base64.getDecoder().decode(lcpHashedPassphrase)
   .map { String.format("%02x", it) }
   .joinToString(separator = "")

Android

import android.util.Base64

// Base64 -> raw digest bytes -> lowercase hex string of the hashed passphrase
val passphrase = Base64.decode(lcpHashedPassphrase, Base64.DEFAULT)
    .map { String.format("%02x", it) }
    .joinToString(separator = "")

Swift

import Foundation

// Falls back to the input unchanged if it is not valid Base64.
let passphrase = Data(base64Encoded: lcpHashedPassphrase)
    .map { [UInt8]($0) }?
    .map { String(format: "%02x", $0) }
    .joined() ?? lcpHashedPassphrase

mickael-menu commented 3 years ago

Thorium seems to have a heuristic to fall back on when the LCP hashed passphrase is not properly encoded.

https://github.com/edrlab/thorium-reader/blob/b282b1187d47caed8d7f25bf92810aa3d80760db/src/main/converter/opds.ts#L92

@danielweck Did you encounter some issues in the wild?

danielweck commented 3 years ago

"demoreader" @ cantookstation adds a base64 layer to the lcp_hashed_passphrase hex encoding.

mickael-menu commented 3 years ago

Okay, that's expected according to the spec:

Note about the computation of the base64-encoded value of the hashed passphrase: from the hashed value of the passphrase, expressed as an hex-encoded string, calculate a byte array (32-bytes / 256-bits binary buffer); for instance, “4981AA…” becomes [49, 81, 170, …]. The expected value is the Base64 encoding of this byte array. Note that a base64 conversion is usually implicitly applied to byte arrays when converted to json structures.
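
The note above boils down to two steps: parse the hex string into raw bytes, then Base64-encode those bytes. Note that the byte values in the quoted example read like hex digits; in decimal, 0x49, 0x81 and 0xAA are 73, 129 and 170. A minimal Node.js sketch of the two steps, where `hexToBase64` is a hypothetical helper name (not part of the spec or of any Readium toolkit):

```javascript
// Hypothetical helper: hex-encoded digest -> Base64 of the raw bytes.
function hexToBase64(hexStr) {
  // Buffer.from(..., "hex") silently truncates at the first invalid
  // character, so validate the input explicitly first.
  if (!/^(?:[0-9a-fA-F]{2})+$/.test(hexStr)) {
    throw new Error("expected an even-length hex string");
  }
  const bytes = Buffer.from(hexStr, "hex"); // "4981AA" -> <49 81 aa>
  return bytes.toString("base64");
}

// The first three bytes of the spec's example, shown in decimal.
const firstBytes = Array.from(Buffer.from("4981AA", "hex"));
console.log(firstBytes); // [ 73, 129, 170 ]
console.log(hexToBase64("4981AA")); // "SYGq"
```

For a real SHA-256 digest the input is 64 hex characters, so the byte array is 32 bytes long and the Base64 result is 44 characters.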

danielweck commented 3 years ago

Ah yes, I have to update the console messages in Thorium :) https://readium.org/lcp-specs/notes/lcp-key-retrieval.html#the-lcp_hashed_passphrase-element

We've had this dual hex / base64+hex string handling code in Thorium for a while, because at the time the test OPDS feeds implemented different syntaxes. From memory, I'm not sure which ones used the correct base64+hex method.
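
For readers who need similar tolerance in their own client, here is a rough Node.js sketch of such a fallback. It is not Thorium's actual code: `normalizeLcpHashedPassphrase` is a hypothetical name, and the three accepted syntaxes (bare hex, Base64 of the raw bytes, Base64 of the hex string) are my assumption about what feeds might send.

```javascript
// Hypothetical normalizer, NOT Thorium's implementation: returns the
// hex-encoded hashed passphrase for any of three syntaxes, or null.
function normalizeLcpHashedPassphrase(value) {
  const isHex64 = (s) => /^[0-9a-fA-F]{64}$/.test(s);

  // Case 1: the feed sent the bare hex digest (not what the spec asks for).
  if (isHex64(value)) return value.toLowerCase();

  // Reject anything that is not plain Base64 before decoding, because
  // Buffer.from(..., "base64") silently skips invalid characters.
  if (!/^[A-Za-z0-9+\/]+={0,2}$/.test(value)) return null;
  const bytes = Buffer.from(value, "base64");

  // Case 2: spec-compliant value, Base64 of the 32 raw digest bytes.
  if (bytes.length === 32) return bytes.toString("hex");

  // Case 3: Base64 applied to the 64-character hex string by mistake.
  const text = bytes.toString("latin1");
  if (isHex64(text)) return text.toLowerCase();

  return null;
}

console.log(normalizeLcpHashedPassphrase("7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="));
```

The hex check runs first because a 64-character hex string is also syntactically valid Base64, so the order of the cases matters.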

danielweck commented 3 years ago

Encoding

NodeJS

const b64Str = Buffer.from(hexStr, "hex").toString("base64");

Decoding

NodeJS

const hexStr = Buffer.from(b64Str, "base64").toString("hex");

NodeJS example

(execute on the command line with node filename.js)


const crypto = require("crypto");
const pass = "LCP passphrase この世界の謎 \uD83D\uDE00"; // 0x1F600
const checkSum = crypto.createHash("sha256");
checkSum.update(pass);

// "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
const hexStr = checkSum.digest("hex");

// ENCODE:

// "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="
const b64Str = Buffer.from(hexStr, "hex").toString("base64");
console.log(`--------- ENCODE:
"${pass}"
=> "${hexStr}"
=> "${b64Str}"`);

// DECODE:

// "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
const hexStr_ = Buffer.from(b64Str, "base64").toString("hex");
console.log(`--------- DECODE:
"${b64Str}"
=> "${hexStr_}"`);

danielweck commented 3 years ago

Encoding

Javascript (Web Browser, not NodeJS)

const b64Str = self.btoa((new Uint8Array(hexStr.match(/.{1,2}/g).map(b => parseInt(b, 16)))).reduce((s, b) => s + String.fromCharCode(b), ""));

Decoding

Javascript (Web Browser, not NodeJS)

const hexStr = Array.from(self.atob(b64Str)).map(c => c.charCodeAt(0).toString(16).padStart(2, '0')).join('');

Javascript example (Web Browser, not NodeJS)

(copy+paste into Web Inspector to execute, see console log)

(async () => {
    const pass = "LCP passphrase この世界の謎 \uD83D\uDE00"; // 0x1F600
    const checkSumArrayBuffer = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(pass)); // TextEncoder always encodes as UTF-8

    // "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
    const hexStr = Array.from(new Uint8Array(checkSumArrayBuffer)).map(b => b.toString(16).padStart(2, '0')).join('');

    // ENCODE:

    // "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="

    // Method 1 (from hex string)
    const b64Str1 = self.btoa((new Uint8Array(hexStr.match(/.{1,2}/g).map(b => parseInt(b, 16)))).reduce((s, b) => s + String.fromCharCode(b), ""));

    // Method 2 (from the raw digest bytes, with a simplified fromCharCode apply)
    const b64Str2 = self.btoa(String.fromCharCode.apply(null, new Uint8Array(checkSumArrayBuffer)));

    console.log(`--------- ENCODE:
    "${pass}"
    => "${hexStr}"
    => "${b64Str1}"
    => "${b64Str2}"`);

    // DECODE:

    // "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
    const hexStr_ = Array.from(self.atob(b64Str1)).map(c => c.charCodeAt(0).toString(16).padStart(2, '0')).join('');
    console.log(`--------- DECODE:
    "${b64Str1}"
    => "${hexStr_}"`);
})();

mickael-menu commented 3 years ago

Thanks Daniel! These JavaScript code samples could be really useful to build a static HTML page to help validate an implementation with a dynamic form to decode/encode an input.

danielweck commented 3 years ago

I think that the key takeaway is that the base64 encoding layer applies to the raw bytes that the hex string represents, not to the characters of the hex string itself (a mistake easily made). Once this is clear, it really just boils down to using a suitable standard library or third-party API to convert between strings and byte sequences. There are a few different ways of doing this on the Open Web Platform; if you know a simpler method, please chime in :) (my current proposal is quite convoluted)
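
To make the difference concrete, here is a small Node.js sketch that encodes the same digest both ways. The variable names `correct` and `mistaken` are mine, and the hex value is the one from the examples earlier in this thread.

```javascript
const hexStr = "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759";

// Correct: decode the hex digits into 32 raw bytes, then Base64-encode them.
const correct = Buffer.from(hexStr, "hex").toString("base64");

// Mistake: Base64-encode the 64 ASCII characters of the hex string itself.
const mistaken = Buffer.from(hexStr, "utf8").toString("base64");

console.log(correct);         // "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="
console.log(correct.length);  // 44
console.log(mistaken.length); // 88
```

The length is a quick sanity check: a Base64 value of 88 characters almost certainly means the hex string was encoded instead of the bytes.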

I need to update my NodeJS and WebJS examples with Unicode characters and surrogate pairs (in the original non-hashed LCP passphrase), just to make sure I am converting the byte sequence correctly.

danielweck commented 3 years ago

FYI, I updated both the NodeJS and WebJS examples with Unicode characters (i.e. Japanese characters and a smile emoji, U+1F600, represented by its surrogate pair): "LCP passphrase この世界の謎 \uD83D\uDE00"
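
The reason this matters is that SHA-256 hashes bytes, not characters, so the string-to-bytes step has to be the same on both ends. The Node.js sketch below assumes UTF-8, which is what the examples in this thread use (Node's default for `update()` and the only encoding `TextEncoder` produces). The `digest` helper is a hypothetical name for illustration.

```javascript
const crypto = require("crypto");

const escaped = "\uD83D\uDE00"; // U+1F600 written as a UTF-16 surrogate pair
const literal = "😀";           // the same character typed directly

console.log(escaped === literal); // true
console.log(escaped.length);      // 2 (UTF-16 code units, not characters)

// In UTF-8 the surrogate pair becomes a single 4-byte sequence.
const utf8Hex = Buffer.from(escaped, "utf8").toString("hex");
console.log(utf8Hex); // "f09f9880"

// Hypothetical helper: hex SHA-256 of the UTF-8 bytes of a string.
const digest = (s) => crypto.createHash("sha256").update(s, "utf8").digest("hex");

// Both spellings of the passphrase hash to the same value.
const sameDigest =
  digest("LCP passphrase この世界の謎 \uD83D\uDE00") ===
  digest("LCP passphrase この世界の謎 😀");
console.log(sameDigest); // true
```

An implementation that hashed UTF-16 code units instead would produce a different digest for any passphrase containing non-ASCII characters.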

danielweck commented 3 years ago

Here is an alternative Kotlin example, using java.util.Base64 instead of android.util.Base64:

import java.util.Base64

    val hexStr = Base64.getDecoder().decode(base64Str)
        .map { String.format("%02x", it) }
        .joinToString(separator = "")

Kotlin REPL:

https://pl.kotl.in/G_pMqv90L

import java.util.Base64

fun main() {

    val hexStr = Base64.getDecoder().decode("7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k=")
        .map { String.format("%02x", it) }
        .joinToString(separator = "")

    println(hexStr) // "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
}

danielweck commented 3 years ago

Swift REPL(it):

https://replit.com/@danielweck/LCPPassBase64Hex

import Foundation

let hexStr = Data(base64Encoded: "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k=")
    .map { [UInt8]($0) }?
    .map { String(format: "%02x", $0) }
    .joined() ?? "" // unwrap the optional; empty string if the input is not valid Base64

print(hexStr) // "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"

danielweck commented 3 years ago

PHP REPL(it):

https://replit.com/@danielweck/LCPPassBase64HexPHP

<?php
// Raw (binary) SHA-256 digest of the passphrase, then Base64-encoded
$b64Str = base64_encode(hash('sha256', "LCP passphrase この世界の謎 😀", true));
echo $b64Str; // "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="

danielweck commented 3 years ago

A somewhat related issue: https://github.com/readium/lcp-specs/issues/52