Closed jchelly closed 5 years ago
If I may chip in, that would also enable to use the parallel-hdf5 library for i/o, which could considerably speed up the use of VR on SWIFT snapshots for post-processing.
I think the option is to write a wrapper function for writing to HDF5 and that wrapper function uses the C API but is easy enough to alter and enable the C++ API.
@pelahi - exactly. Looking at io.cxx there are a lot of hdf5 calls but they are almost all just writing an array to a new dataset, so if we have a function to do that then that the output side is mostly taken care of. Parallel input as @MatthieuSchaller suggests would be more work. Maybe that should be another feature request.
I wrote a bit of code to do HDF5 output from HBT that might be useful for this. I'll take a look at adapting it.
There's also the hdf5 input/output classes we use in shark (https://github.com/ICRAR/shark/tree/master/include/hdf5 and https://github.com/ICRAR/shark/tree/master/src/hdf5). They are built on top of the HDF5 C++ bindings, but they don't expose that detail to the caller, who only issues simple calls (e.g., https://github.com/ICRAR/shark/blob/master/src/galaxy_writer.cpp#L96-L161 and others in the same file).
@rtobar that would not directly solve the issue as far as I understand it as the bindings will still require a C++-enabled build of HDF5, which directly conflicts with the use of parallel-hdf5 in other parts of the code.
Yes, sorry for the confusion. When I said that the shark classes don't "expose that detail" I was actually VERY implicitly implying (my bad!) that their internal usage of the hdf5 lib could be rewritten to use the C API instead, while maintaining the ease of use for callers. So yes, not fully "just plug and play", but I thought it would be useful to get some ideas.
I've uploaded a branch with an incomplete attempt at the type of changes I had in mind: https://github.com/jchelly/VELOCIraptor-STF/tree/swift_hdf5_nocxx . This adds a H5OutputFile class with write_dataset and write_attribute methods implemented using the HDF5 C API and I've converted some of the calls in io.cxx.
VR updated to use HDF C API.
Is your feature request related to a problem? Please describe.
When building the SWIFT interface we need to link to the same HDF5 installation that SWIFT is using. We need HDF5 with MPI support (for SWIFT) and the C++ interface (for Velociraptor). This combination is officially unsupported so it's not tested by the HDF5 developers and we usually wont be able to use pre-existing installations at HPC centres.
Describe the solution you'd like
We could make it possible to build the SWIFT interface without HDF5 C++ support. There's no need to modify the HDF5 snapshot reading routines because they could just be disabled if we're running within SWIFT. We would only need to modify the code that writes out the group information. Probably that could be done by making a small class or function for writing datasets via the C interface then using that instead of the C++ interface. At the moment there's a lot of duplication of HDF5 calls in io.c so it might be an improvement to readability too.
Describe alternatives you've considered
The SWIFT interface does build with HDF5 disabled entirely but then we're stuck with binary output.