aldanor / hdf5-rust

HDF5 for Rust
https://docs.rs/hdf5
Apache License 2.0

Question: how can I select a field from h5 dataset #266

Open WindSoilder opened 8 months ago

WindSoilder commented 8 months ago

Background code

I adapted one of the crate's examples to generate an HDF5 file that contains a 1-D array of structs:

use hdf5::{File, H5Type, Result};
use ndarray::{arr1, s};

#[derive(H5Type, Clone, PartialEq, Debug)] // register with HDF5
#[repr(u8)]
pub enum Color {
    R = 1,
    G = 2,
    B = 3,
}

#[derive(H5Type, Clone, PartialEq, Debug)] // register with HDF5
#[repr(C)]
pub struct Pixel {
    x: i64,
    y: i64,
    color: Color,
}

impl Pixel {
    pub fn new(x: i64, y: i64, color: Color) -> Self {
        Self { x, y, color }
    }
}

fn write_hdf5() -> Result<()> {
    use Color::*;
    let file = File::create("pixels.h5")?; // open for writing
    let group = file.create_group("dir")?; // create a group
    let builder = group.new_dataset_builder();
    let _ds = builder // underscore prefix: the dataset handle is not used afterwards
        .with_data(&arr1(&[
            Pixel::new(1, 2, R), Pixel::new(2, 3, B),
            Pixel::new(3, 4, G), Pixel::new(4, 5, R),
            Pixel::new(5, 6, B), Pixel::new(6, 7, G),
        ]))
        // finalize and write the dataset
        .create("pixels")?;
    Ok(())
}

fn read_hdf5() -> Result<()> {
    use Color::*;
    let file = File::open("pixels.h5")?; // open for reading
    let ds = file.dataset("dir/pixels")?; // open the dataset

    // How can I read all `x` of Pixels?
    assert_eq!(
        ds.read_slice::<Pixel, _, _>(s![2..])?,
        arr1(
            &[Pixel::new(3, 4, G), Pixel::new(4, 5, R), Pixel::new(5, 6, B), Pixel::new(6, 7, G),]
        )
    );
    Ok(())
}

fn main() -> Result<()> {
    write_hdf5()?;
    read_hdf5()?;
    Ok(())
}

Question

In Python, with h5py, I can read all x values like this:

import h5py
f = h5py.File("pixels.h5")
f["dir/pixels"]['x']    # read all x values from pixels

What is the hdf5-rust way to achieve the same behavior? Sorry, I have had no luck finding a proper solution.

mulimoen commented 8 months ago

This crate does not support reading just a single compound field (PRs welcome), as one can do in h5py. You will have to read the whole compound type/struct and extract the field after reading.
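
A minimal sketch of the "read everything, then extract" workaround. It covers only the extraction step and uses plain `std`, so the `Pixel` and `Color` types below are stand-ins without `#[derive(H5Type)]`, and `extract_field` is a hypothetical helper name, not part of hdf5-rust. In a real program the slice would come from a full read of the dataset, for example something like `ds.read_raw::<Pixel>()?` (assumed here to yield a `Vec<Pixel>`).

```rust
// Stand-ins for the types in the question; the real ones also derive `H5Type`.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
pub enum Color {
    R = 1,
    G = 2,
    B = 3,
}

#[derive(Clone, Debug, PartialEq)]
#[repr(C)]
pub struct Pixel {
    pub x: i64,
    pub y: i64,
    pub color: Color,
}

/// Hypothetical helper: project one member out of already-loaded records.
/// `field` selects the member to keep, e.g. `|p| p.x`.
pub fn extract_field<T, F>(pixels: &[Pixel], field: F) -> Vec<T>
where
    F: Fn(&Pixel) -> T,
{
    // The whole struct has already been read; only the chosen member is copied out.
    pixels.iter().map(field).collect()
}
```

Calling `extract_field(&pixels, |p| p.x)` then plays the role of `f["dir/pixels"]['x']` in h5py, with the difference that every field is read from disk before the projection happens.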

If you are in charge of data creation, I would recommend storing each field as its own dataset if that makes sense for your application. SoA (struct of arrays) may be beneficial over AoS (array of structs).
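
To illustrate the SoA suggestion, here is a rough sketch that splits an array of structs into one column per field before writing. It is plain `std` only. `PixelColumns` and `to_columns` are hypothetical names, and `color` is simplified to its `u8` discriminant for brevity. Each resulting `Vec` could then be written as a separate dataset (for instance `dir/x`, `dir/y`, `dir/color`) using the same builder calls as in `write_hdf5`, so a reader can open only the column it needs.

```rust
// Simplified stand-in for the question's `Pixel`; `color` holds the enum's u8 value.
#[derive(Clone, Debug, PartialEq)]
pub struct Pixel {
    pub x: i64,
    pub y: i64,
    pub color: u8,
}

/// Struct-of-arrays layout: one contiguous column per field.
#[derive(Debug, Default, PartialEq)]
pub struct PixelColumns {
    pub x: Vec<i64>,
    pub y: Vec<i64>,
    pub color: Vec<u8>,
}

/// Hypothetical helper: convert array-of-structs data into columns.
pub fn to_columns(pixels: &[Pixel]) -> PixelColumns {
    let mut cols = PixelColumns::default();
    for p in pixels {
        // Element i of every column belongs to the same original pixel.
        cols.x.push(p.x);
        cols.y.push(p.y);
        cols.color.push(p.color);
    }
    cols
}
```

The trade-off is that reading one complete pixel now touches three datasets, so this layout pays off mainly when access is column-oriented.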

WindSoilder commented 8 months ago

Thanks! That is important to know. For now I can work around it.