garrettmflynn / webnwb

A JavaScript API for working with neurodata stored in the NWB Format
https://brainsatplay.com/webnwb/tutorial
GNU Affero General Public License v3.0

Inefficient Write Access #2

Open garrettmflynn opened 1 year ago

garrettmflynn commented 1 year ago

Description of the bug

As of v0.1.0, there remain several problems with WebNWB's file writing solution.

  1. Large (1MB+) files trigger a memory overload error in h5wasm.
  2. We cannot update existing properties through h5wasm, so hdf5-io is forced to recompile the entire object into an HDF5 file no matter how few changes were made to that file.
  3. Accessing file properties sends them through many (likely unnecessary) preprocessing steps before they reach the user. Writes are very slow as a result: a 1MB file may take several seconds to save, or may fail with the error described in (1).

Ideally, we can identify (or negotiate with the h5wasm maintainer) a file access mode in h5wasm that allows updating existing properties. This would avoid (2) and (3), since properties would then be accessed only when the user requests them. It may also resolve (1), since fewer h5wasm operations would bring file data into memory.
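To make the goal concrete, here is a minimal sketch of how the caller side could record which properties actually changed, so that a future save only needs to touch those paths. It uses no h5wasm or hdf5-io calls. `trackChanges`, the `/`-separated path format, and the sample object are all hypothetical and only stand in for a real NWBFile.

```javascript
// Hypothetical helper (not part of WebNWB, hdf5-io, or h5wasm):
// wraps an object in a Proxy and records the path of every property that is changed.
function trackChanges(root) {
  const dirty = new Set();
  const wrap = (target, prefix) => new Proxy(target, {
    get(obj, key, receiver) {
      const value = Reflect.get(obj, key, receiver);
      // Wrap nested containers so deep writes are recorded too; leave typed arrays alone.
      const isContainer = value !== null && typeof value === "object" && !ArrayBuffer.isView(value);
      return (typeof key === "string" && isContainer) ? wrap(value, `${prefix}${key}/`) : value;
    },
    set(obj, key, value) {
      // Only record a path when the value really changes.
      if (obj[key] !== value) dirty.add(`${prefix}${String(key)}`);
      obj[key] = value;
      return true;
    },
  });
  return { proxy: wrap(root, "/"), dirty };
}

// Sample object standing in for a loaded file (assumed shape, for illustration only).
const file = { session_description: "demo", acquisition: { series: { rate: 1000 } } };
const { proxy, dirty } = trackChanges(file);

proxy.acquisition.series.rate = 2000; // recorded
proxy.session_description = "demo";   // same value, not recorded

console.log([...dirty]); // paths a partial write would need to update
```

Whether the recorded paths can be written back in place still depends on what h5wasm supports, which is the open question above.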

Isolated tests in the hdf5-io repository will likely be the best way to fix (1) and (2). Notably, apart from the initial fetch of a remote Ferguson file, there are currently no tests that operate on existing files like these. Adding such tests could uncover additional complications.

However, (3) is definitely an issue with WebNWB itself and/or esconform, which enforces the NWB Schema and the specification property provided in NWB files.

In summary, we should:

  1. Ask the maintainer of h5wasm about updating existing file properties.
  2. Implement more write tests on existing NWB files in the hdf5-io repository.
  3. Minimize the number of preprocessing steps triggered when object properties are accessed on NWBFile instances in WebNWB.
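For the third item, one possible direction is to defer preprocessing until a property is first read and then cache the result. The sketch below is plain JavaScript under that assumption. `defineLazy`, `nwbLike`, and the counter are hypothetical names, and the callback only stands in for whatever schema checks WebNWB or esconform actually perform.

```javascript
// Hypothetical helper: defines a property whose (possibly costly) preprocessing
// runs only on first access; the result is then cached on the object.
function defineLazy(target, key, compute) {
  Object.defineProperty(target, key, {
    configurable: true,
    enumerable: true,
    get() {
      const value = compute();
      // Replace the getter with a plain data property so later reads skip the work.
      Object.defineProperty(target, key, {
        value,
        writable: true,
        configurable: true,
        enumerable: true,
      });
      return value;
    },
  });
}

let preprocessCalls = 0;
const nwbLike = {}; // stands in for an NWBFile instance

defineLazy(nwbLike, "timestamps", () => {
  preprocessCalls += 1; // placeholder for schema enforcement / type conversion
  return Float64Array.from([0, 0.001, 0.002]);
});

console.log(preprocessCalls); // 0: nothing is processed until requested
nwbLike.timestamps;
nwbLike.timestamps;
console.log(preprocessCalls); // 1: the second read uses the cached value
```

One caveat with this design: anything that enumerates and reads every property (for example serializing the whole object on save) still triggers every getter, so the write path would need to skip properties that were never touched.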

Steps To Reproduce

  1. Open the Files demo using yarn start
  2. Load a local NWB file larger than 1MB. We are using files downloaded from the Ferguson et al. 2015 dataset.
  3. Press the save button.

Additional Information

N/A for now