openpgpjs / web-stream-tools

Convenience functions for reading, transforming and working with WhatWG Streams
MIT License

Configure Reader stream chunk size of 64kB #6

Closed by drzraf 4 years ago

drzraf commented 4 years ago
// <input type="file" multiple>
document.querySelector("input[type=file]").addEventListener('change', encrypt, false);

// Must be async because the body uses await.
async function encrypt(e) {
    var file = e.target.files[0];
    var { message } = await openpgp.encrypt({
        message: openpgp.message.fromBinary(file.stream()),
        publicKeys: [KEYS],
        armor: false
    }),
        _reader = openpgp.stream.getReader(message.packets.write()),
        total = 0;
    console.time("counter");
    while (true) {
        const  { value, done } = await _reader.read();
        if (value) {
            total += value.length;
            console.log("Read %d bytes", value.length);
        }
        if (done) break;
    }
    console.timeEnd("counter");
    console.log("Total bytes read: %d", total);
}

Output when encrypting an 8 MB file:

Read 3 bytes
Read 268 bytes
Read 3 bytes
Read 524 bytes
Read 1 bytes
Read 65537 bytes          (128 times)
Read 185 bytes
counter: 418 ms
Total bytes read: 8389720

It seems that a chunk size of 64 kB + 1 byte is set by default. How can this chunk size be configured?

(By the way, I'm happy that openpgp.js could encrypt 512 MB in about 22 seconds using less than 11 MB of memory.)

twiss commented 4 years ago

Hey :wave: This chunk size comes from the browser's implementation of file.stream(), not from this library. See https://github.com/w3c/FileAPI/issues/144.

However, if you want a specific chunk size, you can use this library to do so. See this example in the readme to get a chunk size of 1024 bytes (just remove encryptor.process and write chunk as-is). You can either do this before encrypting (if you want it to encrypt n bytes at a time) or after encrypting (if you want all chunks to be n bytes).
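
To illustrate the fixed-size chunking idea without relying on the readme example, here is a minimal sketch built only on the standard `TransformStream` API. The names `createRechunker` and `rechunkStream` are hypothetical and are not part of web-stream-tools or openpgp.js; the sketch assumes every incoming chunk is a `Uint8Array`.

```javascript
// Hypothetical helper (not part of web-stream-tools): buffers incoming
// Uint8Array chunks and hands back chunks of exactly `size` bytes.
function createRechunker(size) {
  let buffer = new Uint8Array(0);
  return {
    // Append a chunk and return every complete `size`-byte chunk now available.
    push(chunk) {
      const joined = new Uint8Array(buffer.length + chunk.length);
      joined.set(buffer, 0);
      joined.set(chunk, buffer.length);
      const out = [];
      let offset = 0;
      while (joined.length - offset >= size) {
        out.push(joined.slice(offset, offset + size));
        offset += size;
      }
      buffer = joined.slice(offset); // keep the incomplete tail for later
      return out;
    },
    // Return the final partial chunk, or undefined if nothing is left.
    flush() {
      const rest = buffer;
      buffer = new Uint8Array(0);
      return rest.length ? rest : undefined;
    }
  };
}

// Wrap the helper in a standard TransformStream for use with pipeThrough().
function rechunkStream(size) {
  const rechunker = createRechunker(size);
  return new TransformStream({
    transform(chunk, controller) {
      for (const piece of rechunker.push(chunk)) controller.enqueue(piece);
    },
    flush(controller) {
      const rest = rechunker.flush();
      if (rest) controller.enqueue(rest);
    }
  });
}
```

Under these assumptions, `file.stream().pipeThrough(rechunkStream(1024))` would yield 1024-byte chunks (plus one shorter final chunk) before the data reaches the encryption step; piping the encrypted output through the same transform would rechunk it afterwards instead.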

Alternatively, if you just want to read the encrypted data 1MB at a time, for example, you can simply do:

    while (true) {
      const chunk = await _reader.readBytes(1000000);
      if (chunk === undefined) {
        break;
      }
      console.log(chunk);
    }
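
For readers curious what a "read n bytes at a time" call does conceptually, the following is a rough sketch of the idea, not the library's actual implementation. `readBytesSketch` and its `state` argument are hypothetical names; the sketch assumes a reader whose `read()` resolves to `{ value, done }` with `Uint8Array` values, as the standard stream readers do.

```javascript
// Hypothetical helper, not the web-stream-tools implementation: read up to
// `length` bytes by calling reader.read() in a loop. Bytes read beyond
// `length` are kept in `state.leftover` for the next call.
async function readBytesSketch(reader, length, state) {
  const parts = state.leftover.length ? [state.leftover] : [];
  let have = state.leftover.length;
  state.leftover = new Uint8Array(0);
  while (have < length) {
    const { value, done } = await reader.read();
    if (done) break;
    if (value && value.length) {
      parts.push(value);
      have += value.length;
    }
  }
  if (have === 0) return undefined; // stream ended and nothing is buffered
  const joined = new Uint8Array(have);
  let offset = 0;
  for (const part of parts) {
    joined.set(part, offset);
    offset += part.length;
  }
  state.leftover = joined.slice(length); // empty when have <= length
  return joined.slice(0, length);
}
```

A caller would create `const state = { leftover: new Uint8Array(0) }` once and then call `readBytesSketch(reader, 1000000, state)` repeatedly until it returns `undefined`, mirroring the loop above.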
drzraf commented 4 years ago

Great! readBytes fits! I just found it worrisome to see it implemented as a simple read() in a while loop, when I expected a higher-level, slice-like function (supposedly more efficient?). Anyway, I found that performance is not a problem.

Interesting fact: with openpgp.enums.compression.zlib set, the read() size drops from 64 kB to 513 bytes. Example with 256 MB of repeated 1s (compressed to 354 kB):

Read 3 bytes
Read 268 bytes
Read 3 bytes
Read 524 bytes
Read 1 bytes
Read 513 bytes       (492 times)
Read 1025 bytes
Read 513 bytes       (211 times)
Read 280 bytes
counter: 8942 ms
Total bytes read: 362743

(Using 4 x readBytes(100000) instead brings it down to 7 seconds.)

Thank you!