HDF-NI / hdf5.node

A node module for reading/writing the HDF5 file format.
MIT License

hdf5 javascript in a webbrowser #29

Closed iimog closed 2 years ago

iimog commented 8 years ago

Is it possible to adjust this code to run in a browser? I want to write a JavaScript module for my web application that allows handling of HDF5 data, so I cannot interact with files directly. Instead, the data would come from a web server in raw binary format, interpreted as a typed array (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays). I want to allow the user to interact with the data, modify it, and send it back (still as raw binary HDF5 data).

I see that this is not the scope of your node module but I still hope you have an idea on how to do that. Thanks in advance.
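To make the typed-array idea concrete, here is a minimal sketch, not something taken from this module: it wraps a server response in a Uint8Array and checks for the 8-byte signature that starts an HDF5 file. The names looksLikeHdf5 and fetchHdf5Bytes and the URL passed in are hypothetical, and the check assumes the file has no user block.

```javascript
// The 8-byte signature that begins an HDF5 file: \x89 H D F \r \n \x1a \n
const HDF5_SIGNATURE = [0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a];

// Returns true if `bytes` (a Uint8Array) starts with the HDF5 signature.
// Simplification: a file with a user block carries the signature at a
// later offset (512, 1024, ...), which this sketch does not check.
function looksLikeHdf5(bytes) {
  if (bytes.length < HDF5_SIGNATURE.length) return false;
  return HDF5_SIGNATURE.every((b, i) => bytes[i] === b);
}

// Hypothetical browser-side helper: download raw bytes and validate them.
// `url` is a placeholder for whatever endpoint serves the file.
async function fetchHdf5Bytes(url) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`request failed: ${response.status}`);
  const bytes = new Uint8Array(await response.arrayBuffer());
  if (!looksLikeHdf5(bytes)) throw new Error("not an HDF5 file");
  return bytes;
}
```

This only gets the bytes into the page; reading or modifying the HDF5 structure itself still needs a parser for the format.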

vincent-tr commented 8 years ago

Hi,

This code is based on a native hdf5 library, written in C. Thus, I think it can not run in a browser.

iimog commented 8 years ago

Hi, thanks for the very quick reply. I feared as much. So I will look for alternatives. Anyway, thanks again.

rimmartin commented 8 years ago

I've been sending hdf5 dataset binary data to the browser, as typed arrays and buffers, with the WebSocket API and with binaryjs [which I'm phasing out]. I'm putting this into a second project under this organization.

rimmartin commented 8 years ago

Also sending hdf5 images as buffers to browser canvas

rimmartin commented 8 years ago

What has slowed putting the hdf5.ws layer API back up is the new JavaScript language spec coming along. I've been experimenting with Babel transpiling and trying to stand up code with ChakraCore. I'm studying the JavaScript language spec to learn what future code will look like. The viewer/editor layer is koa based, and some of it now relies on Babel transpiling.

Also V8 could eventually catch up to the new specs and then nodejs projects will not need transpiling.

The Babel documentation hasn't caught up to its own version 6.0.0 release from 6 months ago. Also, I'm on a learning curve with pretranspiling ES2015 JavaScript for the browser client.

iimog commented 8 years ago

@rimmartin thanks for the additional information. I will follow your updates closely. In case I make some progress I will let you know. I will experiment with emscripten to try and compile the native C library to javascript.

iimog commented 8 years ago

I was able to compile the hdf5 library with emscripten (with some difficulties). I'm also able to compile and link example hdf5 programs in C to JavaScript (both for nodejs and browsers). However, hdf5 does not play nicely with the fake file system provided by emscripten. It just refuses to read or write any files (even preloaded ones), with error messages like this:

HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
  #000: ../../src/H5T.c line 4508 in H5T_path_find(): unable to initialize conversion function
    major: Datatype
    minor: Unable to initialize object
  #001: ../../src/H5Tconv.c line 7713 in H5T__conv_long_float(): disagreement about datatype size
    major: Datatype
    minor: Unable to initialize object
  #002: ../../src/H5T.c line 2290 in H5T_register(): unable to locate/allocate conversion path
    major: Datatype
    minor: Unable to initialize object
...

Currently I see no way of working around this as the hdf5 library depends heavily on files.

rimmartin commented 8 years ago

Yeah, it is heavily tied to the file system. Also hdf5 1.10 gets even more involved.

I'll be putting up the other projects in this organization in the next evenings

rimmartin commented 8 years ago

Now making a double dataset from the browser HDF5 interface. Preparing to read it back. https://github.com/HDF-NI/hdf5.ws is where this layer is being repo'ed.

Dependencies are just co, ws and hdf5 nodejs modules

iimog commented 8 years ago

@rimmartin thanks for sharing. I will definitely try it out. Thanks for all the help.

rimmartin commented 8 years ago

It's not complete at all; I'll get the read working tonight

What data types are you first working with?

iimog commented 8 years ago

I wanted to write a module to handle biom format version 2. For now I will probably stick to version 1 which is based on JSON. But I will keep an eye on your work to move to version 2 as soon as possible.

The biom specification contains the data types: H5T_IEEE_F64LE, H5T_STD_I32LE, H5T_STD_I64LE, H5T_STRING

rimmartin commented 8 years ago

Ok, I'll look at and work toward supporting that format

rimmartin commented 8 years ago

Testing with https://github.com/biocore/biom-format/blob/master/examples/rich_sparse_otu_table_hdf5.biom

(Screenshot: bom_view) Able to look at a dataset. biom uses deflate, so for writing back I'm adding a way to choose the compression level.

Is rich_sparse_otu_table_hdf5.biom typical of your data?

How large do your datasets get? I'll be adding chunking and subregion io.

iimog commented 8 years ago

Wow, that looks really promising! This biom file is typical in structure but not in size. The biom files I'm interested in are a few (~5) MB in size. They contain about 100000 entries in sample > matrix > data. Great work!

bmaranville commented 3 years ago

Working with the compile script provided by https://github.com/aertslab/webhdf5, I was able to load HDF5 files from the Emscripten FileSystem API. I was able to write files too. With a little adjustment I even got the zlib filter working. The error message you are showing comes from some of the byte conversions not working for 32-bit WebAssembly, I think. The library seems to work in spite of those ominous warnings.

I was able to use "H5Fopen" without difficulty (make sure to compile with "-s WASM_BIGINT" so you can pass hid_t values back and forth to javascript).
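As a rough illustration of why 64-bit handles are best carried as BigInt on the JavaScript side: a plain Number cannot represent every 64-bit integer exactly. The sketch below assumes hid_t is a 64-bit integer (as it is in recent HDF5 releases); the handle value and the helper name survivesNumberRoundTrip are made up for the example and do not come from a real HDF5 session.

```javascript
// Illustrative 64-bit value (2^56 + 1); not taken from a real HDF5 session.
const handle = (1n << 56n) + 1n;

// Returns true if a BigInt survives conversion to Number and back unchanged.
function survivesNumberRoundTrip(value) {
  return BigInt(Number(value)) === value;
}

// Values above 2^53 may be rounded when stored as a Number,
// so a large handle can come back as a different id.
console.log(survivesNumberRoundTrip(handle)); // false
console.log(survivesNumberRoundTrip(42n)); // true
```

Keeping such values as BigInt end to end avoids the rounding, at the cost of handling BigInt explicitly in calling code.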

bmaranville commented 2 years ago

A wasm/js library built on HDF5 1.12.1 is available now at https://github.com/usnistgov/h5wasm - it should be able to load all the datatypes you mentioned into native js types (Float64Array, Int32Array, BigInt64Array and Array[String])
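The type correspondence can be sketched without any HDF5 library at all. The example below is not h5wasm code; it is a standalone illustration, with the hypothetical helper decodeLE, of how raw little-endian bytes for the three numeric biom types map onto the typed arrays named above. Strings are left out because their layout depends on how the file stores them.

```javascript
// Per-type element size, target typed array, and little-endian reader.
const READERS = {
  H5T_IEEE_F64LE: { size: 8, ctor: Float64Array, read: (v, o) => v.getFloat64(o, true) },
  H5T_STD_I32LE: { size: 4, ctor: Int32Array, read: (v, o) => v.getInt32(o, true) },
  H5T_STD_I64LE: { size: 8, ctor: BigInt64Array, read: (v, o) => v.getBigInt64(o, true) },
};

// Hypothetical helper: decode a Uint8Array of little-endian values into the
// matching typed array. DataView is used so the result does not depend on
// the byte order of the machine running the code.
function decodeLE(bytes, h5type) {
  const reader = READERS[h5type];
  if (!reader) throw new Error(`unsupported type: ${h5type}`);
  if (bytes.byteLength % reader.size !== 0) {
    throw new Error(`byte length is not a multiple of ${reader.size}`);
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new reader.ctor(bytes.byteLength / reader.size);
  for (let i = 0; i < out.length; i++) {
    out[i] = reader.read(view, i * reader.size);
  }
  return out;
}
```

Note that 64-bit integers come back as BigInt values, so they cannot be mixed with ordinary numbers in arithmetic without an explicit conversion.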

iimog commented 2 years ago

Thanks @bmaranville, I had a look and this looks perfect. I hope to find time to switch my library to use it in the next couple of weeks. Anyway, this resolves my issue as far as this repository is concerned.