usnistgov / h5wasm

A WebAssembly HDF5 reader/writer library

Feature request: support network-based filesystem support for large files #4

Open bmaranville opened 2 years ago

bmaranville commented 2 years ago

For files too large to load into memory, it would be nice to be able to load parts of a file on-demand over the network (see discussion in #2). The new WASM Filesystem being designed for emscripten seems like it is anticipating this use case: see design documents at https://github.com/emscripten-core/emscripten/issues/15041

When this is implemented and generally available, look into adding this capability to h5wasm so that very large files can be retrieved piecewise and on-demand by URL.
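
To make "piecewise" concrete: fetching part of a file by URL generally means asking the server for a byte range with an HTTP `Range` header. The sketch below is only an illustration of that arithmetic, assuming the server honours range requests; `rangeHeaderForChunk` and the chunk size are hypothetical and not part of h5wasm or Emscripten.

```javascript
// Hypothetical helper (not part of h5wasm or Emscripten): build the HTTP Range
// header value that covers chunk number `index` of a remote file.
function rangeHeaderForChunk(index, chunkSize, fileSize) {
  const start = index * chunkSize;
  if (!Number.isInteger(index) || index < 0 || start >= fileSize) {
    throw new RangeError(`chunk ${index} is outside a file of ${fileSize} bytes`);
  }
  // Range end offsets are inclusive, and the last chunk may be short.
  const end = Math.min(start + chunkSize, fileSize) - 1;
  return `bytes=${start}-${end}`;
}

// Example: a 2500-byte file read in 1024-byte chunks.
console.log(rangeHeaderForChunk(0, 1024, 2500)); // bytes=0-1023
console.log(rangeHeaderForChunk(2, 1024, 2500)); // bytes=2048-2499
```

A value like this would be sent as the `Range` header of a request, and a server that supports it answers with status 206 (Partial Content) and only those bytes.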

bmaranville commented 2 years ago

It appears that this feature already exists in emscripten: https://emscripten.org/docs/porting/files/Synchronous-Virtual-XHR-Backed-File-System-Usage.html

Prerequisites:

Then worker code like this can function:

// Runs inside a module worker: createLazyFile relies on synchronous XHR,
// which Emscripten only supports in Web Workers.
import * as hdf5 from "../h5wasm/dist/esm/hdf5_hl.js";

let file;
const DEMO_FILEPATH = "https://ncnr.nist.gov/pub/ncnrdata/ngbsans/202009/nonims294/data/sans114140.nxs.ngb?gzip=false";

self.onmessage = async function (event) {
    const { action, payload } = event.data;
    if (action === "load") {
        await hdf5.ready;
        // Register the URL as a lazily loaded file (readable, not writable).
        hdf5.FS.createLazyFile('/', "current.h5", DEMO_FILEPATH, true, false);
        file = new hdf5.File("current.h5");
    }
    else if (action === "get") {
        await hdf5.ready;
        if (file) {
            // Reply with the NX_class attribute of the "entry" group.
            self.postMessage(file.get("entry").attrs["NX_class"].value);
        }
    }
};

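
The worker above only reacts to messages, so something on the page has to send them. A minimal sketch of that main-thread side follows. The file name `worker.js` and the `request` helper are hypothetical, introduced only for illustration. Since the "load" action in the worker posts nothing back, it is sent with a plain `postMessage`, while `request` is meant for actions that do reply, such as "get".

```javascript
// Hypothetical main-thread helper: post one message to the worker and
// resolve with the data of the next message the worker sends back.
function request(worker, action, payload) {
  return new Promise((resolve, reject) => {
    const onMessage = (event) => { cleanup(); resolve(event.data); };
    const onError = (err) => { cleanup(); reject(err); };
    function cleanup() {
      worker.removeEventListener("message", onMessage);
      worker.removeEventListener("error", onError);
    }
    worker.addEventListener("message", onMessage);
    worker.addEventListener("error", onError);
    worker.postMessage({ action, payload });
  });
}

// Usage in a page (browser only; "worker.js" is an assumed file name):
//   const worker = new Worker("worker.js", { type: "module" });
//   worker.postMessage({ action: "load" });
//   request(worker, "get").then((nxClass) => console.log(nxClass));
```

Note that the promise never settles if the worker stays silent, for example when "get" arrives before any file has been loaded, so a real page would probably want a timeout as well.
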
cavenel commented 1 year ago

I needed the exact same feature, and this solution works really well! But if I understand correctly, modules inside workers have never been implemented in Firefox (https://bugzilla.mozilla.org/show_bug.cgi?id=1360870). Is there a solution to convert the module code into normal JS code for Firefox? I am a bit stuck here...

bmaranville commented 1 year ago

I haven't been providing an IIFE "plain javascript" build because it's easier to support ESM outputs, but you can get a version of the library that will work if you do a little bit of post-processing on the distributed library. First, install esbuild, then compile the ESM outputs to IIFE:

npm i esbuild
npx esbuild --bundle dist/esm/hdf5_hl.js --outfile=h5wasm_iife.js --format=iife --global-name=h5wasm

The resulting file can then be loaded in (any!) worker with

importScripts("h5wasm_iife.js");

In fact, it's so easy I should probably include that build in a future release!
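
If a page wants to use the ESM worker where module workers exist and fall back to the IIFE build elsewhere, one commonly used trick is to check whether the `Worker` constructor reads the `type` option at all. The sketch below assumes that trick behaves as described in the browsers you care about. The file names `worker.esm.js` and `worker.classic.js` are hypothetical, and the constructor is passed in as a parameter so the logic can be exercised outside a browser.

```javascript
// Returns true if `WorkerCtor` looks at the `type` option, which suggests
// that it understands { type: "module" }.
function supportsModuleWorkers(WorkerCtor) {
  let typeWasRead = false;
  const probe = {
    get type() { typeWasRead = true; return "module"; },
  };
  try {
    // The URL is not meant to load; only the option lookup matters here.
    new WorkerCtor("blob://", probe).terminate();
  } catch (err) {
    // Ignore: a failed construction still tells us whether `type` was read.
  }
  return typeWasRead;
}

// Hypothetical file names: an ESM worker and a classic one that starts with
// importScripts("h5wasm_iife.js").
function pickWorker(WorkerCtor) {
  return supportsModuleWorkers(WorkerCtor)
    ? new WorkerCtor("worker.esm.js", { type: "module" })
    : new WorkerCtor("worker.classic.js");
}

// Usage in a page (browser only):
//   const worker = pickWorker(Worker);
```

Inside the classic worker, the bundle built with `--global-name=h5wasm` should expose the library's exports on the global `h5wasm` instead of through an `import`.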

cavenel commented 1 year ago

That was indeed super easy and works like a charm. ❤️ (I hope Firefox gets modules in workers soon though, it would make life simpler for many)

Thank you so much for the prompt answer!! (And I guess this issue could be closed as everything works as intended)