iimog closed this issue 3 years ago
Hi,
This code is based on the native HDF5 library, which is written in C, so I don't think it can run in a browser.
Hi, thanks for the very quick reply. I feared as much. So I will look for alternatives. Anyway, thanks again.
I've been sending HDF5 dataset binary data to the browser, as typed arrays and buffers, over the WebSocket API and with binaryjs (which I'm phasing out). I'm putting this into a second project under this organization.
I'm also sending HDF5 images as buffers to a browser canvas.
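To illustrate the idea of shipping a dataset to the browser as one binary message, here is a minimal sketch. The framing (rank, dimensions, then little-endian float64 values) and the function names are made up for this example; this is not the hdf5.ws wire protocol. The resulting ArrayBuffer is the kind of payload a WebSocket can send in binary mode.

```javascript
// Hypothetical framing, NOT the hdf5.ws protocol:
// [uint32 rank][uint32 dims...][padding to 8 bytes][float64 values...], all little-endian.
function packFloat64Dataset(dims, values) {
  const count = dims.reduce((a, b) => a * b, 1);
  if (count !== values.length) throw new RangeError("dims do not match value count");
  const headerBytes = 4 * (1 + dims.length);
  const dataOffset = Math.ceil(headerBytes / 8) * 8; // keep the float64 block 8-byte aligned
  const buffer = new ArrayBuffer(dataOffset + 8 * count);
  const view = new DataView(buffer);
  view.setUint32(0, dims.length, true);
  dims.forEach((d, i) => view.setUint32(4 + 4 * i, d, true));
  values.forEach((v, i) => view.setFloat64(dataOffset + 8 * i, v, true));
  return buffer;
}

// Inverse of packFloat64Dataset: recover the dimensions and a Float64Array.
function unpackFloat64Dataset(buffer) {
  const view = new DataView(buffer);
  const rank = view.getUint32(0, true);
  const dims = [];
  for (let i = 0; i < rank; i++) dims.push(view.getUint32(4 + 4 * i, true));
  const dataOffset = Math.ceil((4 * (1 + rank)) / 8) * 8;
  const count = dims.reduce((a, b) => a * b, 1);
  const values = new Float64Array(count);
  for (let i = 0; i < count; i++) values[i] = view.getFloat64(dataOffset + 8 * i, true);
  return { dims, values };
}
```

On the receiving side, a browser WebSocket with binaryType set to "arraybuffer" would hand the same bytes to unpackFloat64Dataset.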
What has slowed putting the hdf5.ws layer API back up is the new JavaScript language spec coming along. I've been experimenting with Babel transpiling and trying to stand up code with ChakraCore. I'm studying the JavaScript language spec to learn what future code will look like. The viewer/editor layer is koa-based, and some of it now relies on Babel transpiling.
Also, V8 could eventually catch up to the new specs, and then Node.js projects will not need transpiling.
The Babel documentation hasn't caught up to its own version 6.0.0 release from 6 months ago. I'm also on a learning curve for pre-transpiling ES2015 JavaScript for browser clients.
@rimmartin thanks for the additional information. I will follow your updates closely. In case I make some progress I will let you know. I will experiment with emscripten to try and compile the native C library to javascript.
I was able to compile the hdf5 library with emscripten (with some difficulties). I'm also able to compile and link example hdf5 programs written in C to JavaScript (both for Node.js and browsers). However, hdf5 does not play nicely with the fake file system provided by emscripten. It just refuses to read or write any files (even preloaded ones), with error messages like this:
HDF5-DIAG: Error detected in HDF5 (1.8.17) thread 0:
#000: ../../src/H5T.c line 4508 in H5T_path_find(): unable to initialize conversion function
major: Datatype
minor: Unable to initialize object
#001: ../../src/H5Tconv.c line 7713 in H5T__conv_long_float(): disagreement about datatype size
major: Datatype
minor: Unable to initialize object
#002: ../../src/H5T.c line 2290 in H5T_register(): unable to locate/allocate conversion path
major: Datatype
minor: Unable to initialize object
...
Currently I see no way of working around this as the hdf5 library depends heavily on files.
Yes, it is heavily tied to the file system. HDF5 1.10 also gets even more involved.
I'll be putting up the other projects in this organization over the next few evenings.
I'm now creating a double dataset from the browser HDF5 interface and preparing to read it back. This layer lives in the repository at https://github.com/HDF-NI/hdf5.ws.
The only dependencies are the co, ws and hdf5 Node.js modules.
@rimmartin thanks for sharing. I will definitely try it out. Thanks for all the help.
It's not complete at all; I'll get the read working tonight
What data types are you first working with?
I wanted to write a module to handle biom format version 2. For now I will probably stick to version 1 which is based on JSON. But I will keep an eye on your work to move to version 2 as soon as possible.
The biom specification contains the data types:
H5T_IEEE_F64LE, H5T_STD_I32LE, H5T_STD_I64LE, H5T_STRING
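For the three numeric types in that list, the raw element bytes map naturally onto JavaScript typed arrays. The sketch below decodes a buffer of little-endian elements by type name. It assumes the bytes are already the bare element values (not a complete HDF5 file), the READERS table and decodeElements are hypothetical names for this example, and H5T_STRING is left out because decoding strings needs length and padding information not covered here.

```javascript
// Hypothetical lookup table: HDF5 type name -> element size, target array type, reader.
const READERS = {
  H5T_IEEE_F64LE: { size: 8, ArrayType: Float64Array, read: (v, o) => v.getFloat64(o, true) },
  H5T_STD_I32LE: { size: 4, ArrayType: Int32Array, read: (v, o) => v.getInt32(o, true) },
  H5T_STD_I64LE: { size: 8, ArrayType: BigInt64Array, read: (v, o) => v.getBigInt64(o, true) },
};

// Decode a buffer of raw little-endian elements into the matching typed array.
// Reading through a DataView keeps the result correct regardless of host byte order.
function decodeElements(typeName, buffer) {
  const reader = READERS[typeName];
  if (!reader) throw new TypeError(`unsupported type: ${typeName}`);
  if (buffer.byteLength % reader.size !== 0) {
    throw new RangeError("buffer length is not a multiple of the element size");
  }
  const view = new DataView(buffer);
  const out = new reader.ArrayType(buffer.byteLength / reader.size);
  for (let i = 0; i < out.length; i++) {
    out[i] = reader.read(view, i * reader.size);
  }
  return out;
}
```

The 64-bit integers come back as BigInt values, since a plain JavaScript number cannot hold every int64 exactly.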
Ok, I'll look at and work toward supporting that format
Testing with https://github.com/biocore/biom-format/blob/master/examples/rich_sparse_otu_table_hdf5.biom
I'm able to look at a dataset. biom uses deflate, so for writing back I'm adding the option of choosing a compression level.
Is rich_sparse_otu_table_hdf5.biom typical of your data?
How large do your datasets get? I'll be adding chunking and subregion I/O.
Wow, that looks really promising! This biom file is typical in structure but not in size. The biom files I'm interested in are a few (~5) MB in size. They contain about 100000 entries in sample > matrix > data
Great work!
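Once the arrays under sample > matrix have been read, working with them in JavaScript is straightforward. The sketch below assumes the matrix is stored in a compressed sparse layout with three arrays named data, indices and indptr (my reading of the biom 2.x layout, so please check it against the specification); sparseToTriplets is a hypothetical helper, not part of any library.

```javascript
// Expand a compressed sparse matrix into [major, minor, value] triplets.
// Assumption: indptr[i]..indptr[i + 1] delimits the entries of major index i,
// indices[k] is the minor index of entry k, and data[k] is its value.
function sparseToTriplets(data, indices, indptr) {
  if (data.length !== indices.length) {
    throw new RangeError("data and indices must have the same length");
  }
  const triplets = [];
  for (let major = 0; major + 1 < indptr.length; major++) {
    for (let k = indptr[major]; k < indptr[major + 1]; k++) {
      triplets.push([major, indices[k], data[k]]);
    }
  }
  return triplets;
}
```

With about 100000 entries this is a single linear pass, so it should be cheap enough to run in the browser.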
Working with the compile script provided by https://github.com/aertslab/webhdf5 I was able to load HDF5 files from the Emscripten FileSystem API. I was able to write files too. With a little adjustment I even got the zlib filter working. I think the error message you are showing comes from some of the byte conversions not working for 32-bit WebAssembly. The library seems to work in spite of those ominous warnings.
I was able to use "H5Fopen" without difficulty (make sure to compile with "-s WASM_BIGINT" so you can pass hid_t values back and forth to JavaScript).
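To show why 64-bit handles are better carried as BigInt than as plain numbers, here is a small sketch. It assumes hid_t is a 64-bit integer (as in HDF5 1.10 and later); survivesAsNumber is a hypothetical helper for this illustration and has nothing to do with the Emscripten API itself.

```javascript
// Returns true if a 64-bit integer value survives a round trip through a
// plain JavaScript number (an IEEE 754 double with a 53-bit significand).
function survivesAsNumber(id) {
  return BigInt(Number(id)) === id;
}

// A small identifier is safe either way.
const smallId = 42n;

// An identifier above 2^53 loses its low bits when converted to a number.
const largeId = 2n ** 53n + 1n;

// A BigInt64Array stores the full 64-bit value without loss.
const stored = new BigInt64Array([largeId]);
```

Here survivesAsNumber(smallId) is true and survivesAsNumber(largeId) is false, while stored[0] still equals largeId.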
A wasm/js library built on HDF5 1.12.1 is available now at https://github.com/usnistgov/h5wasm - it should be able to load all the datatypes you mentioned into native js types (Float64Array, Int32Array, BigInt64Array and Array[String])
Thanks @bmaranville I had a look and this looks perfect. I hope to find time to switch my library to use it in the next couple of weeks. Anyway, this resolves my issue as far as this repository is concerned.
Is it possible to adjust this code to run in a browser? I want to write a JavaScript module for my web application that allows handling of hdf5 data. Thus I cannot directly interact with files. Instead the data would come from a web server in raw binary format, interpreted as a typed array (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays). I want to allow the user to interact with the data, modify it and send it back (still as raw binary HDF5 data).
I see that this is not the scope of your node module but I still hope you have an idea on how to do that. Thanks in advance.
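As a first step on the browser side, the bytes received from the server can be checked for the HDF5 format signature before anything tries to parse them. The sketch below tests only offset 0 (a file with a user block may carry the signature at a later offset, which is not handled here); startsWithHdf5Signature is a hypothetical helper name.

```javascript
// The 8-byte HDF5 format signature: 0x89 'H' 'D' 'F' '\r' '\n' 0x1a '\n'.
const HDF5_SIGNATURE = [0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a];

// Returns true if the ArrayBuffer begins with the HDF5 signature.
function startsWithHdf5Signature(buffer) {
  const bytes = new Uint8Array(buffer);
  if (bytes.length < HDF5_SIGNATURE.length) return false;
  return HDF5_SIGNATURE.every((b, i) => bytes[i] === b);
}
```

In a web application the buffer would typically come from the arrayBuffer() method of a fetch response, and the same buffer could later be sent back to the server unchanged or after modification.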