HDF-NI / hdf5.node

A node module for reading/writing the HDF5 file format.
MIT License

Support for Single Writer Multiple Reader (SWMR)? #100

Open trikunai opened 5 years ago

trikunai commented 5 years ago

Hi, I would like to know whether this package supports Single Writer Multiple Reader (SWMR) in the same way as h5py (http://docs.h5py.org/en/stable/swmr.html), whose documentation describes it as follows:

The SWMR features allow simple concurrent reading of a HDF5 file while it is being written from another process. Prior to this feature addition it was not possible to do this as the file data and meta-data would not be synchronised and attempts to read a file which was open for writing would fail or result in garbage data.

A file which is being written to in SWMR mode is guaranteed to always be in a valid (non-corrupt) state for reading. This has the added benefit of leaving a file in a valid state even if the writing application crashes before closing the file properly.

This feature has been implemented to work with independent writer and reader processes. No synchronisation is required between processes and it is up to the user to implement either a file polling mechanism, inotify or any other IPC mechanism to notify when data has been written.
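
The file polling option mentioned above could look roughly like the following. This is a minimal sketch using only the Python standard library; `file_state` and `wait_for_change` are hypothetical helper names, not part of h5py or hdf5.node. It assumes that a change in the file's modification time or size is a good enough hint that the writer has flushed something, which may not hold on every file system.

```python
import os
import time
from typing import Optional, Tuple

# (modification time in nanoseconds, size in bytes)
FileState = Tuple[int, int]


def file_state(path: str) -> FileState:
    """Return a cheap fingerprint of the file taken from its metadata."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def wait_for_change(
    path: str,
    last_state: FileState,
    timeout: float = 5.0,
    interval: float = 0.1,
) -> Optional[FileState]:
    """Poll until the fingerprint differs from last_state.

    Returns the new fingerprint, or None if nothing changed before the
    timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = file_state(path)
        if current != last_state:
            return current
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

A reader process would call `wait_for_change` in a loop and refresh its view of the dataset each time a new state is returned; the polling only signals that something happened, it does not read any data itself.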

The SWMR functionality requires use of the latest HDF5 file format: v110. In practice this implies using at least HDF5 1.10 (this can be checked via h5py.version.info) and setting the libver bounding to “latest” when opening or creating the file.
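
For reference, the h5py side of this could be sketched as follows. It assumes h5py built against HDF5 1.10 or later; the function names, the dataset name `data` and the chunk size are arbitrary choices for illustration. In real use the writer and the reader would run in separate processes, and the reader would call `refresh()` repeatedly rather than once.

```python
import h5py


def write_swmr(path, blocks):
    """Append each block of floats to a resizable dataset in SWMR mode."""
    with h5py.File(path, "w", libver="latest") as f:
        dset = f.create_dataset(
            "data", shape=(0,), maxshape=(None,), chunks=(1024,), dtype="f8"
        )
        # From this point on, readers may open the file with swmr=True.
        f.swmr_mode = True
        for block in blocks:
            n = dset.shape[0]
            dset.resize((n + len(block),))
            dset[n:] = block
            dset.flush()  # make the new data visible to readers


def read_swmr(path):
    """Open the file as an SWMR reader and return what is visible now."""
    with h5py.File(path, "r", libver="latest", swmr=True) as f:
        dset = f["data"]
        dset.refresh()  # pick up the latest flushed shape and data
        return dset[:]
```

Whether hdf5.node can offer an equivalent is exactly the question of this issue; the sketch only shows the behaviour being asked for.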

If it is not included, is it planned to be developed?

Thanks

chadbr commented 5 years ago

Good timing! I was about to ask the same...

rimmartin commented 5 years ago

Hi,

File.prototype.enableSingleWriteMultiRead(), documented at http://hdf-ni.github.io/hdf5.node/ref/file.html, is meant to turn it on. I don't have a good test case for it; we can test and see if it works.

KirmTwinty commented 3 years ago

Apparently it doesn't work; it is not implemented.

Since I needed it for a professional project, I have made a fork of the v12 branch in my repository, implementing an SWMR Reader only:

v12+SWMR branch @KirmTwinty

You will need to change your code a bit, though, because I had to create a new Dataset object that behaves like a Group object. Check out the "About this fork" section in the readme.md.

rimmartin commented 3 years ago

Hi @KirmTwinty, thanks for the example.

I'm working it into the master branch now that the v12 branch is merged. I will probably work in the SWMR write as well and make it work from Node.js.

KirmTwinty commented 3 years ago

Great news, thanks! I actually applied the Group and File reading principles to Dataset.

The good thing is that the library will then rely mainly on the bare H5 library (which could be done for any reading and writing mode). Could the various calls to H5LT then be removed?

Do not hesitate to ask if I can help.

rimmartin commented 3 years ago

Hi @KirmTwinty, Thank you

H5LT was the starting point; I thought it would get the furthest at the time. Internally, h5_lt.hpp has been relying more and more on the bare H5 library. The tables, packet tables and image APIs will still need the lite library (I don't have a strong reason to rewrite those).

Mostly I want to keep the JavaScript side steady for users, with potential additions. SWMR is probably the most useful new addition for users out there. The idea is not to replicate every API function of HDF5 but to map the hierarchical nature of the data and attributes onto the nature of JS, so that users can get to their data and attributes the way JS is already designed, without having to learn a replica of the C-style API. I have to think about it some more.

KirmTwinty commented 3 years ago

I totally understand! Thank you for coming with this.

My intent was to suggest a potential performance improvement, not to criticize :)

I see your point about trying to keep the JS philosophy while interfacing with a C API; that is not such an easy task.