Open bmaranville opened 2 years ago
Note that for local files, the emscripten WORKERFS interface could be used to get random access to huge local files from a worker without copying the whole file into memory, which is another benefit of moving the provider to a worker.
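For illustration, mounting a user-selected file through WORKERFS inside a worker might look like the sketch below. It assumes the Emscripten module was built with WORKERFS support and exposes it as `FS.filesystems.WORKERFS`; the `mountLocalFile` helper, the `/work` mount point and the `EmscriptenFsLike` type are hypothetical names introduced here, not part of h5wasm or h5web.

```typescript
// Minimal structural types for the parts of the Emscripten FS API used here.
// (Assumption: the real module's FS object is compatible with this shape.)
interface NamedFile {
  name: string; // a browser File object satisfies this
}

interface EmscriptenFsLike {
  filesystems: { WORKERFS: unknown };
  mkdir(path: string): void;
  mount(type: unknown, opts: { files?: NamedFile[] }, mountpoint: string): void;
}

// Mount a single local file read-only under `mountpoint` and return the
// virtual path at which it becomes visible. WORKERFS reads from the
// underlying File on demand, so the file is not copied into memory.
// This must run inside a Web Worker, since WORKERFS is worker-only.
function mountLocalFile(
  FS: EmscriptenFsLike,
  file: NamedFile,
  mountpoint: string = "/work",
): string {
  FS.mkdir(mountpoint); // throws in real Emscripten if the directory exists
  FS.mount(FS.filesystems.WORKERFS, { files: [file] }, mountpoint);
  return `${mountpoint}/${file.name}`;
}
```

The returned path could then be opened read-only by the HDF5 library running in the same worker.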
Note that it could potentially be refactored to a service worker that uses the same API as a grove server, if that simplifies things.
This would be brilliant! However, it seems that synchronous XHR requests inside Service Workers are currently not supported in Chrome and Safari—only in Firefox.
Does the recent work on the H5wasmLocalFileProvider (#1604) perhaps provide a pathway for something similar to be implemented with HTTP Range request headers for remote URLs?
This would be of huge benefit to our use case, where multi-gigabyte files are stored remotely and loading the entire file is prohibitive in terms of both memory and network usage.
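To make the Range idea concrete, here is a rough sketch of how a byte read at a given offset could be translated into chunk-aligned `Range` header values, so that only the touched parts of a remote file are fetched. The function name, the chunking scheme and the parameters are all assumptions for illustration; this is not how h5wasm or Emscripten's lazy file support is actually implemented.

```typescript
// One HTTP request needed to satisfy part of a read.
interface ChunkRequest {
  index: number;       // zero-based chunk index within the file
  rangeHeader: string; // value for the HTTP "Range" request header
}

// Compute which fixed-size chunks cover the byte span
// [offset, offset + length) of a file of `fileSize` bytes, together
// with the Range header needed to fetch each of them.
// HTTP byte ranges are inclusive on both ends.
function chunkRequestsForRead(
  offset: number,
  length: number,
  chunkSize: number,
  fileSize: number,
): ChunkRequest[] {
  if (length <= 0 || offset < 0 || offset >= fileSize) {
    return [];
  }
  const lastByte = Math.min(offset + length, fileSize) - 1;
  const firstChunk = Math.floor(offset / chunkSize);
  const lastChunk = Math.floor(lastByte / chunkSize);

  const requests: ChunkRequest[] = [];
  for (let index = firstChunk; index <= lastChunk; index++) {
    const start = index * chunkSize;
    // The final chunk of the file may be shorter than chunkSize.
    const end = Math.min(start + chunkSize, fileSize) - 1;
    requests.push({ index, rangeHeader: `bytes=${start}-${end}` });
  }
  return requests;
}
```

For example, a 1000-byte read at offset 1500 with 1024-byte chunks touches chunks 1 and 2, so only about 2 KiB would be transferred regardless of the total file size.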
It's definitely going to help. @bmaranville also developed a lazyFileLRU demo to show feasibility. However, the amount of code required and its complexity have me a bit worried; it's not going to be trivial to make a production service out of this. I need to look into it more to better understand what's going on.
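As a rough idea of the caching part only, a least-recently-used cache for fetched chunks can be fairly small. The sketch below is a generic LRU built on `Map` insertion order; the class name and its methods are hypothetical, and it is not taken from the lazyFileLRU demo.

```typescript
// Keeps at most `maxChunks` chunks in memory and evicts the least
// recently used one when the limit is exceeded. Relies on Map
// preserving insertion order: the first key is always the oldest.
class LruChunkCache {
  private readonly chunks = new Map<number, Uint8Array>();
  private readonly maxChunks: number;

  constructor(maxChunks: number) {
    this.maxChunks = maxChunks;
  }

  // Return the cached chunk, marking it as most recently used.
  get(index: number): Uint8Array | undefined {
    const chunk = this.chunks.get(index);
    if (chunk !== undefined) {
      this.chunks.delete(index);
      this.chunks.set(index, chunk);
    }
    return chunk;
  }

  // Store a chunk, evicting the least recently used entry if needed.
  set(index: number, chunk: Uint8Array): void {
    this.chunks.delete(index);
    this.chunks.set(index, chunk);
    if (this.chunks.size > this.maxChunks) {
      const oldest = this.chunks.keys().next().value;
      if (oldest !== undefined) {
        this.chunks.delete(oldest);
      }
    }
  }

  // Check for presence without affecting recency.
  has(index: number): boolean {
    return this.chunks.has(index);
  }

  get size(): number {
    return this.chunks.size;
  }
}
```

With a chunk size of 1 MiB and `maxChunks` set to 64, memory use for cached file data would stay around 64 MiB however large the remote file is.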
Is your feature request related to a problem?
Reading very large files with the h5wasm provider is not possible, for several reasons:
Requested solution or feature
For web file servers with HDF5/NeXus files that support range requests, on-demand loading using emscripten's lazyFile functionality could enable access to very large NeXus files that would be infeasible to read as a whole.
Alternatives you've considered
HSDS and grove providers already allow this type of random access to parts of a NeXus file.
Additional context
Because sync file access is required, this might require refactoring the h5wasm provider to operate from a worker. Note that it could potentially be refactored to a service worker that uses the same API as a grove server, if that simplifies things.