delmic / odemis

Open Delmic Microscope Software
GNU General Public License v2.0

hdf5 file_format #319

Open thomasaarholt opened 7 years ago

thomasaarholt commented 7 years ago

Currently, nothing clearly distinguishes a .h5 file exported by odemis from any other hdf5 file. To import such a file into other software, it would be very helpful to include a marker indicating that "this file is in a format created by odemis", so that the file can be imported appropriately. (A "hacky" way to do this would be to check whether the name of the first group in the file is "Acquisition0", or something similar.)

Over at HyperSpy, we save our hdf5 files with file_format = HyperSpy and file_format_version = 2.2 metadata. This enables any file reader to quickly recognise the file type and treat the rest of the data accordingly.

The code is simple enough - I'll be happy to make a PR.

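import h5py

version = "2.2"  # file_format_version, as quoted above

def file_writer(filename, signal, *args, **kwds):
    with h5py.File(filename, mode='w') as f:
        # Tag the root of the file so that any reader can identify the format
        f.attrs['file_format'] = "HyperSpy"
        f.attrs['file_format_version'] = version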

pieleric commented 7 years ago

What do you suggest implementing? Extending the Odemis HDF5 loader to also support the HyperSpy format, or extending the HDF5 writer to also write HyperSpy-compatible metadata?

In the first case, any patch is welcome. In the second case, I'd rather avoid having two HDF5 sub-formats that the user would need to choose between when saving the file. However, if it's possible to save both the current metadata and the HyperSpy metadata in the same file, there would be no problem accepting such an extension.

thomasaarholt commented 7 years ago

Sorry, I was clearly distracted whilst writing. I've edited the post to clarify the following:

I would like to read the odemis hdf5 file using HyperSpy. However, to distinguish the odemis file from any other generic somefile.h5, it needs some metadata telling the file reader what the file is. Instead of checking for some peculiarity of the format, it's much nicer to have something like file_format = odemis in the metadata at the root of the file.
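
On the reader side, such a check might look roughly like the sketch below. It assumes h5py. The function names identify_format and identify_file and the "odemis (guessed)" label are made up for illustration and are not part of HyperSpy or Odemis. The fallback on "Acquisition0" is the hacky heuristic from the first post.

```python
def identify_format(root):
    """Guess which software wrote an open HDF5 file.

    `root` is an open h5py.File (any object exposing `.attrs` and
    supporting `name in root` for top-level groups works too).
    Returns the format name as a str, or None if it cannot be determined.
    """
    fmt = root.attrs.get("file_format")
    if fmt is not None:
        # Fixed-length string attributes can come back as bytes
        if isinstance(fmt, bytes):
            fmt = fmt.decode("utf-8")
        return str(fmt)

    # No marker at the root: fall back to the group-name heuristic
    if "Acquisition0" in root:
        return "odemis (guessed)"
    return None


def identify_file(path):
    """Open `path` read-only and identify its format."""
    import h5py  # third-party; only needed when actually opening a file

    with h5py.File(path, "r") as f:
        return identify_format(f)
```

With the root attribute in place, the reader no longer depends on the internal layout of the file, so the heuristic branch would only be needed for files written before the attribute existed.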

thomasaarholt commented 7 years ago

To answer your final sentence: adding the metadata (referred to as an "attribute") should not cause any issue. It is simply one more entry alongside the existing root metadata, which (as seen by opening my odemis .h5 in HDFView) is currently:

group_size = 4
Number of attributes = 0

pieleric commented 7 years ago

I understand now :-) Yes, that sounds fine. We currently have something a little similar in the "SVIData" group, but it's per acquisition. So you could add another attribute at the root to make it easier.

More precisely, we are using the file format proposed by SVI, with some extensions to support extra metadata. So you could set an attribute file_format = SVI or file_format = SVI/Odemis.

To add it, you can create a new function in odemis.dataio.hdf5, similar to _add_svi_info(), and call it somewhere at the beginning of _saveAsHDF5().

thomasaarholt commented 7 years ago

Great! Will do.

And well done for using hdf5!
