pdidev / pdi

The PDI Data Interface
https://pdi.dev
BSD 3-Clause "New" or "Revised" License

Read an array from an HDF5 file in one step #472

Open EmilyBourne opened 1 week ago

EmilyBourne commented 1 week ago

Currently, in order to read an array from an HDF5 file, we need to do the following:

metadata:
  grid_x_extents: { type: array, subtype: int64, size: 1 }
  grid_x:
    type: array
    subtype: double
    size: [ '$grid_x_extents[0]' ]
plugins:
  decl_hdf5:
    - file: 'my_file.h5'
      on_event: [read_grid_x_extents]
      read:
        grid_x_extents: {size_of: grid_x}
    - file: 'my_file.h5'
      on_event: [read_grid_x]
      read:
        grid_x: ~
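
    // Step 1: read the extent of grid_x so that the buffer can be sized.
    std::vector<size_t> n_points(1);
    PDI_multi_expose(
            "read_grid_x_extents",
            "grid_x_extents",
            n_points.data(),
            PDI_INOUT,
            NULL);

    // Step 2: allocate the buffer, then read the array itself.
    std::vector<Coord1D> mesh(n_points[0]);
    PDI_multi_expose(
            "read_grid_x",
            "grid_x",
            mesh.data(),
            PDI_INOUT,
            NULL);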

The step with n_points seems excessive. Would it be possible to read the array in one step, e.g.:

metadata:
  grid_x_extents: { type: array, subtype: int64, size: 1 }
  grid_x:
    type: array
    subtype: double
    size: [ '$grid_x_extents[0]' ]
plugins:
  decl_hdf5:
    - file: 'my_file.h5'
      on_event: [read_grid_x]
      read:
        grid_x: ~

    // Proposed usage: no separate extent read before reading the array.
    Coord1D* mesh;
    PDI_multi_expose(
            "read_grid_x",
            "grid_x",
            mesh,
            PDI_INOUT,
            NULL);

This would be even better if a C++ argument could be passed in place of the pointer.
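
To make the request concrete, here is a rough sketch of what passing a C++ container instead of a pointer could look like from the user side. Everything in it is an assumption for illustration: `read_array` and `ExposeFn` are hypothetical names, not part of PDI, and the `"read_<name>_extents"` / `"<name>_extents"` naming simply follows the YAML above. The callback stands in for a call such as `PDI_multi_expose(event, name, buffer, PDI_INOUT, NULL)`, so the sketch compiles without PDI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback: (event name, data name, buffer).
// In real code it would forward to
// PDI_multi_expose(event, name, buffer, PDI_INOUT, NULL).
using ExposeFn = std::function<void(const std::string&, const std::string&, void*)>;

// Hypothetical helper, not part of PDI: hides the two-step read behind one call.
template <class Container>
void read_array(const ExposeFn& expose, const std::string& name, Container& out)
{
    // Step 1: read the extent, assuming the "<name>_extents" naming used above.
    std::int64_t extent = 0;
    expose("read_" + name + "_extents", name + "_extents", &extent);
    assert(extent >= 0);

    // Step 2: size the container (with its own allocator), then read the data.
    out.resize(static_cast<std::size_t>(extent));
    expose("read_" + name, name, out.data());
}
```

A call would then reduce to `std::vector<double> mesh; read_array(expose, "grid_x", mesh);`. Because the container is resized in place, the memory comes from whatever allocator the container was declared with.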

jbigot commented 1 week ago

You would like PDI to allocate the memory for you? This opens the question of which allocator to use, but it is definitely something that would make sense to offer our users. Thanks for the idea.