digitalbazaar / forge

A native implementation of TLS in Javascript and tools to write crypto-based and network-heavy webapps
https://digitalbazaar.com/

computing the sha256 hash of a File Blob in Node #1006

Open wvhulle opened 2 years ago

wvhulle commented 2 years ago

Hi, I am trying to compute the sha256 hash of a File object (which is a Blob object) in SvelteKit, but the result does not match the hash computed on the Linux command line.

I use SvelteKit POST endpoints, but the following should be possible in other JavaScript frameworks as well.

export const actions = {
    upload: async ({ request }) => {
        const formData = await request.formData()
        const file = await formData.get(`csvFile`) as File;
        const contents = await file.text()

        const md = forge.md.sha256.create();
        md.update(contents);
        const sha256 = md.digest().toHex()
    }
}

I cannot use new FileReader() here because I am in a Node environment, and I cannot use fs because the file does not exist locally. How can I compute, server-side, exactly the same hash as the one computed client-side on the command line? Do I need to include the filename or something else to obtain the same hash as from the command line?

Thanks in advance!

See also https://developer.mozilla.org/en-US/docs/Web/API/File

davidlehn commented 2 years ago

By command line do you mean using sha256sum or similar?

It could be an encoding problem. You might need to use md.update(contents, 'utf8'). You could compare hex values that are being used for computations. Add console.log(forge.util.bytesToHex(contents)) and compare with hd file or xxd file or whatever hex tool you like.
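The byte comparison suggested above can be sketched without forge. The helper below is hypothetical (hexPreview is not part of forge or of this thread); it formats the first bytes of an upload in the two-byte groups that xxd prints, so the server-side input can be compared with the xxd output by eye. It assumes the upload is already available as a Uint8Array, for example new Uint8Array(await file.arrayBuffer()).

```typescript
// Hypothetical helper, not part of forge: render the first `count` bytes
// as lowercase hex in two-byte groups, similar to the data columns of `xxd`.
function hexPreview(bytes: Uint8Array, count: number = 16): string {
  const hex: string[] = Array.from(bytes.subarray(0, count), (b) =>
    b.toString(16).padStart(2, "0")
  );
  const groups: string[] = [];
  for (let i = 0; i < hex.length; i += 2) {
    // Join two bytes per group; a trailing odd byte forms a short group.
    groups.push(hex.slice(i, i + 2).join(""));
  }
  return groups.join(" ");
}

// Example: the bytes of "sep=\t" (literal backslash and t) followed by two tabs.
const sample = new Uint8Array([0x73, 0x65, 0x70, 0x3d, 0x5c, 0x74, 0x09, 0x09]);
console.log(hexPreview(sample));
```

If the preview matches xxd but the digest still differs, the bytes are probably being converted somewhere between reading the upload and calling md.update().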

Remember, runnable code with sample data in an issue makes it easier to help.

wvhulle commented 2 years ago

Thanks, indeed I used sha256sum to compute 27b147dfb06fee6e07ff26.... I have now used xxd on the file, and the first line should read 7365 703d 5c74 0909 0909 0909 0909 0909.... On the client side I use:

var reader = new FileReader();
reader.onload = function (event) {
    var binary = event.target.result;
    var md = forge.md.sha256.create();
    console.log(
        `forge.util.bytesToHex(binary) = ${forge.util.bytesToHex(binary).slice(0, 10)}... `
    );
    sha256 = md.update(binary).digest().toHex();
};
reader.readAsBinaryString(f);

Here forge.util.bytesToHex(binary).slice(0, 10) is correct (it matches xxd), and the checksum is correct too. On the server side I have done something similar, and the hex of the input is correct. Only the checksum is incorrect:

const file = await formData.get(`csvFile`) as File;
const ab = await file.arrayBuffer()
console.log(`forge.util.bytesToHex(ab) = ${forge.util.bytesToHex(ab).slice(0, 10)}... `)
const md = forge.md.sha256.create();
md.update(ab, 'utf8');
sha256 = md.digest().toHex()

Adding 'utf8' just gives a different incorrect checksum. I would like to give better example code, but since the server is a complete web server, that is a bit difficult.

wvhulle commented 2 years ago

In the end, I had to do an extra mapping on the server:

const blob = formData.get('csvFile') as File;
const ab = await blob.arrayBuffer();
// Map each byte to one character, producing the binary string passed to md.update().
const contents = [...new Uint8Array(ab)]
            .map((byte) => String.fromCharCode(byte))
            .join('');
const md = forge.md.sha256.create();
md.update(contents);
const sha256 = md.digest().toHex();
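The byte-to-character mapping in the snippet above can be checked independently of forge. The following is a minimal sketch that uses only Node's built-in crypto module; toBinaryString and sha256Hex are hypothetical helper names, not forge APIs. It builds the same one-character-per-byte string and computes a reference digest directly over the raw bytes, which is the value sha256sum should report for the same file contents.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: one character per byte (a "binary string"),
// the same mapping as the String.fromCharCode step in the thread.
function toBinaryString(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

// Hypothetical helper: reference SHA-256 over the raw bytes using Node's crypto.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Example input containing a non-ASCII byte (0xe9), where a text decoding
// step could alter the data before hashing.
const bytes = new Uint8Array([0x61, 0x62, 0x63, 0xe9]);
const viaBinaryString = createHash("sha256")
  .update(toBinaryString(bytes), "latin1") // latin1: each char code is one byte
  .digest("hex");
console.log(viaBinaryString === sha256Hex(bytes));
```

Under these assumptions the digest of the binary string agrees with the digest of the raw bytes, so the string returned by toBinaryString could be passed to md.update(contents) as in the thread, with sha256Hex serving as a cross-check.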