A pure Java HDF5 library
http://jhdf.io
MIT License

jHDF - Pure Java HDF5 library

This project is a pure Java implementation for accessing HDF5 files. It is written from the file format specification and does not use any HDF Group code; it is not a wrapper around the C libraries. The file format specification is available from the HDF Group here. More information on the format is available on Wikipedia. I presented a webinar about jHDF for the HDF Group, which is available on YouTube; the example code used and the slides can be found here.

The intention is to make a clean Java API to access HDF5 data. Currently, reading is very well-supported and writing supports limited use cases. For progress see the change log. Java 8, 11, 17 and 21 are officially supported.

Here is an example of reading a dataset with jHDF (see ReadDataset.java):

import io.jhdf.HdfFile;
import io.jhdf.api.Dataset;
import java.nio.file.Paths;

try (HdfFile hdfFile = new HdfFile(Paths.get("/path/to/file.hdf5"))) {
    Dataset dataset = hdfFile.getDatasetByPath("/path/to/dataset");
    // data will be a Java array with the dimensions of the HDF5 dataset
    Object data = dataset.getData();
}
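
Because getData() returns a plain Object, calling code often needs to find out what kind of array it received before casting. The following is a minimal sketch of one way to do that using only the standard library. The class ArrayShape and its shape method are hypothetical names for illustration, not part of jHDF, and the sketch assumes the array is rectangular (every sub-array at a given depth has the same length), which is what the comment in the example above implies.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of jHDF: inspects an Object that may be a
// (possibly multi-dimensional) Java array and reports its dimensions.
public class ArrayShape {

    /**
     * Returns the length of each dimension of the given array, outermost first.
     * A non-array value (for example a boxed scalar) yields an empty result.
     * Assumes a rectangular array: only the first element of each level is inspected.
     */
    public static int[] shape(Object data) {
        List<Integer> dims = new ArrayList<>();
        Object current = data;
        while (current != null && current.getClass().isArray()) {
            int length = Array.getLength(current);
            dims.add(length);
            if (length == 0) {
                break; // nothing to descend into
            }
            // For primitive arrays this returns a boxed value, which ends the loop
            current = Array.get(current, 0);
        }
        return dims.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // Stand-in for the Object returned by dataset.getData()
        Object data = new double[2][3];
        System.out.println(Arrays.toString(shape(data))); // prints "[2, 3]"
    }
}
```

Once the shape is known, the value can be cast to the matching array type, for example (double[][]) data for a two-dimensional dataset of doubles.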

For an example of traversing the tree inside a HDF5 file see PrintTree.java.

An example of writing a file jhdf.hdf5 containing a group named group, with two datasets, ints and doubles:

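import io.jhdf.HdfFile;
import io.jhdf.WritableHdfFile;
import io.jhdf.api.WritableGroup;
import java.nio.file.Paths;

try (WritableHdfFile hdfFile = HdfFile.write(Paths.get("jhdf.hdf5"))) {
    WritableGroup group = hdfFile.putGroup("group");
    group.putDataset("ints", new int[] {1, 2, 3, 4});
    group.putDataset("doubles", new double[] {1.0, 2.0, 3.0, 4.0});
}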

See WriteHdf5.java for a more complete example. Note: writing files is still an early feature, with many more functions to be added.

For more examples see package io.jhdf.examples.

Why should I use jHDF?

Why should I not use jHDF?

Why did I start jHDF?

Mostly it's a challenge: HDF5 is a fairly complex file format with lots of flexibility, and writing a library to access it is interesting. Also, as HDF5 is a widely used file format for storing scientific, engineering, and commercial data, it seems like a good idea to be able to access HDF5 files with more than one library. In particular, JVM languages are among the most widely used, so having a native HDF5 implementation seems useful.

Developing jHDF

To see other available Gradle tasks run ./gradlew tasks

If you have read this far, please consider starring this repo. If you are using jHDF in a commercial product, please consider making a donation. Thanks!