This repository contains thin high-level bindings for the [[https://www.hdfgroup.org/HDF5/][HDF5 data format]] for the Nim programming language. It also provides a wrapper of the full HDF5 C library, importable using:
import nimhdf5/hdf5_wrapper
The raw wrapper dynamically links the libhdf5.so (the main library) and libhdf5_hl.so (a library containing high-level convenience functions) libraries at runtime. All public functions of the two libraries are callable by their corresponding C names and C arguments. That means the Nim datatypes need to be manually cast to the corresponding compatible types, if only the wrapper is used.
The high-level bindings, while covering most general HDF5 features, are still in a rough state, due to limited testing by actual users. Most features should work fine (at least using linux), but some known bugs are still there. See [[file:examples/h5_high_level_example.nim]] as an overview (soon to be cleaned up, split and put into a tutorial form) on the available features and their usage. Also take a look at the tests for simple examples of specific features. For a more indepth example of usage of this library, take a look at:
The wrapper was built making heavy use of [[https://www.github.com/nim-lang/c2nim][c2nim]] and the main goal was to have a usable interface for the HDF5 data format. More advanced features (e.g. single writer / multiple reader) were of lower priority. Via the wrapper all features should in principle work, but many have not been tested.
** Compatibility
The wrapper is currently tested using Nim version =0.18.0= and the current devel branch (=0.18.1=).
The wrapper is built from HDF5 version =1.10.1=.
Linking against the HDF5 =1.8= library is reasonably supported as well, but requires to use an additional compiler flag for now:
-d:H5_LEGACY
With the recent release of version =1.10.3= / =1.10.4=, support for this version currently is available under the
-d:H5_FUTURE
flag. Soon this will become the default version! The =H5Oget/visit_...2= procedures are wrapped under a name without a =2= suffix. These however add a =fields= argument. Therefore an overload is available, which maps the =fields= to =H5O_INFO_ALL=.
*** Troubleshooting If you compile a nim program using =nimhdf5= without any of those flags and try to run it on a system with a HDF5 shared library of version =1.10.3= or newer, you will be greeted by:
could not import: H5Oget_info
In that case, add the =-d:H5_FUTURE= flag to the compilation command (or probably add it to your =nim.cfg= or =config.nims= of the project).
On the other hand if you try to run such a compiled binary on a system with a HDF5 library of version =1.8=, you will probably see:
could not import: H5P_LST_FILE_CREATE_g
add the =-d:H5_LEGACY= flag.
In case neither of these work, please open an issue!
*** Version checks in the wrapper Currently no checks are done, which compare the library this wrapper is built upon with the library linking against using some of the provided HDF5 macros (e.g. H5check, H5get_libversion etc.). The main reason is explained below. However, as far as the high-level functionality is concerned at the moment, the only differences arise in a few constant definitions, whose names slightly changed from =1.8= to =1.10=. This is what the compiler flags sets accordingly.
The HDF5 headers contain macros for many variables, such as
where
and
i.e. it makes use of C's comma operator. However, c2nim currently [[https://nim-lang.org/docs/c2nim.html#limitations][has no support for it]]. Instead of porting them in some reasonable way, these macros were converted to simple replacements with the values, dropping the calls to H5check() and H5open().
The call to H5check() is currently not used at all. Compiling a Nim program with this wrapper (based on version =1.10.1=) would normally fail to check against the linked library, if that version is different.
As H5open() is important, the calls are replaced by a single call to initialize the library at the beginning upon the first call of the library via [[file:src/nimhdf5/wrapper/H5niminitialize.nim]].
As HDF5 is a very macro heavy library, other important macros may not have been correctly wrapped to Nim, e.g. determination of correct sizes of data types. This may cause some weird side-effects (to be fair, I haven't noticed any!).
Additionally, Windows support is unknown at this time. The library name is correctly set for Windows, however an additional header file =H5FDwindows.h= might have to be wraped.
** Installation
Installation can either be done via nimble:
nimble install nimhdf5
or manually by cloning this git repository:
git clone https://github.com/vindaar/nimhdf5
in a folder of your choice and call nimble install afterwards:
cd nimhdf5 nimble install
Or simply make use of nimble's Github interfacing capabilities:
nimble install https://github.com/vindaar/nimhdf5
** Files
The folder [[file:c_headers/][c_headers]] contains the modified HDF5 headers in the state they were in for a successful c2nim conversion. In some cases the C header file had to be modified, in others modification to the resulting .nim file was still necessary.
The folder [[file:examples/][examples]] contains the basic HDF5 C examples (see here: [[https://support.hdfgroup.org/HDF5/examples/intro.html#c]]) converted to Nim utilizing the wrapper.
[[file:examples/h5_high_level_example.nim][h5_high_level_example.nim]] serves as a replacement for a tutorial for now (tutorial will be added soon!), showcasing (almost) all available features and their usage.
** Known bugs and quirks
The high level bindings come with several quirks which are good to know.
** Implemented HDF5 features
*** Single Writer Multiple Reader
This wrapper fully supports the Single Writer Multiple Reader feature of the HDF5 library, but it is still in an experimental state, as I've never really needed it.
It allows to access a single HDF5 file from multiple threads or processes, where one of these is a writer process and all others are readers. When using this feature the user does not have to worry about locks etc. between the different processes.
**** Usage Open an HDF5 file in write mode and hand the =swmr= flag:
import nimhdf5 var h5f = H5open("/tmp/test.h5", "rw", swmr = true)
and in all reader threads / processes, simply do the same, but do not hand a write:
import nimhdf5 var h5f = H5open("/tmp/test.h5", "r", swmr = true)
This should be all that is required.
I'm not sure if the writer process should make sure to flush the file regularly or not. Feel free to tell me if you know. :)
***** Alternative writer
An alternative to the above for the writer process is to first open the file in write mode without ~swmr = true~ and then later put it into =swmr= mode via:
import nimhdf5 var h5f = H5open("/tmp/test.h5", "rw")
h5f.activateSWMR()
*** Threadsafe HDF5 library & file locks
This wrapper can also be used with a HDF5 library that was compiled with the =--enable-threadsafe= compilation flag.
Once the library has been compiled with it, in principle the user can try to open a single file in write mode from multiple processes or threads. For safe handling in these contexts, it may be up to the user to lock access to the file / writing to individual datasets via some locking mechanism. In principle the threadsafe option of the library adds its own mutex logic, so in theory it should work without them.
See these notes about the threadsafe library: https://support.hdfgroup.org/HDF5/doc/TechNotes/ThreadSafeLibrary.html
One issue a user might encounter is that the second opening of a file yields an error saying that the resource is temporarily unavailable. In HDF5 version starting from =1.10=, file locking was added as a feature.
This behavior is controlled via an environment variable:
export HDF5_USE_FILE_LOCKING=FALSE
If set to false, file locking is disabled. With it multiple processes may open the same file.
When doing this, keep in mind that each thread / process will receive their own =FileID=. Some HDF5 functions may either give the user information based on the specific file ID and others based on the actual file. In the cases where one can choose, it is supported via the =okLocal= (=ObjectKind= enum) or =fkLocal= (=FlushKind= enum).
Relevant part of the documentation: https://docs.hdfgroup.org/hdf5/develop/_h5public_8h.html#title31
** Blosc support
To use blosc as a filter you need to import:
import nimhdf5/blosc
Before =v0.3.12= this was done automatically if the =nblosc= library is installed.