frejanordsiek / hdf5storage

Python package to read and write a wide range of Python types to/from HDF5 formatted files. Can read/write data to the HDF5 based Matlab v7.3 MAT files.
BSD 2-Clause "Simplified" License

Moving many utility functions into a class that is passed to all Marshallers #106

Closed by frejanordsiek 3 years ago

frejanordsiek commented 3 years ago

Passing the h5py.File and hdf5storage.Options objects between the utility functions in utilities and the Marshallers is unwieldy. It also duplicates a lot of work when writing many object arrays: the references group has to be looked up each time, and randomly generated names have to be checked for collisions constantly. Both objects will be wrapped into a class that is passed around instead.

frejanordsiek commented 3 years ago

Changed in commit 8839b6e

The read_data, write_data, read_object_array, write_object_array, and next_unused_name_in_group functions have been wrapped into the class LowLevelFile, which File creates and then uses (it contains the file handle and the options). It passes itself to the Marshallers as needed. This object takes the place of the f argument (the first argument) in the methods of all Marshallers, and the options argument is removed from all of them. The options are available as f.options and the file handle as f.f. The wrapped functions, now methods, also have their f and options arguments removed.
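To illustrate the calling convention described above, here is a minimal sketch rather than the actual hdf5storage code. It assumes a plain dict in place of the h5py.File handle; `Options`, `ToyMarshaller`, and `LowLevelFileSketch` are hypothetical stand-ins, and only the attribute names `f.options` and `f.f` come from the description.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Options:
    # Hypothetical stand-in for hdf5storage.Options.
    store_python_metadata: bool = True


class ToyMarshaller:
    # Hypothetical marshaller: it takes the wrapper as f and has no
    # separate options argument.
    def write(self, f: "LowLevelFileSketch", name: str, data: Any) -> None:
        # Both the options and the raw handle are reached through f.
        if f.options.store_python_metadata:
            data = {"value": data, "python_type": type(data).__name__}
        f.f[name] = data

    def read(self, f: "LowLevelFileSketch", name: str) -> Any:
        stored = f.f[name]
        return stored["value"] if f.options.store_python_metadata else stored


class LowLevelFileSketch:
    # Hypothetical wrapper holding the handle (a dict here) and the options.
    def __init__(self, handle: dict, options: Options) -> None:
        self.f = handle
        self.options = options
        self._marshaller = ToyMarshaller()

    def write_data(self, name: str, data: Any) -> None:
        # The wrapper passes itself to the marshaller.
        self._marshaller.write(self, name, data)

    def read_data(self, name: str) -> Any:
        return self._marshaller.read(self, name)
```

With this shape, a caller only constructs the wrapper once, for example `LowLevelFileSketch({}, Options())`, and every marshaller call receives that single object.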

next_unused_name_in_group was renamed to next_unused_ref_group_name in the process since it is now specialized only for the references Group.

One advantage of wrapping this in a class is that it needs to check for the presence of the references Group and the canonical empty Dataset only once. If it created the Group itself, it can generate names by incrementing a counter rather than trying random names and checking that they aren't already present, which is a lot slower.
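The naming strategy can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: nested dicts stand in for HDF5 Groups, the group name `"refs"`, the class name `RefGroupNamerSketch`, and the name format are all hypothetical, and the counter path assumes nothing else adds names to a Group this object created.

```python
import random
import string


class RefGroupNamerSketch:
    # Hypothetical helper; a dict of dicts stands in for the HDF5 file.
    def __init__(self, root: dict, refs_name: str = "refs") -> None:
        # The presence check happens once, at construction time.
        self.created_here = refs_name not in root
        self.group = root.setdefault(refs_name, {})
        self._counter = 0

    def next_unused_ref_group_name(self, length: int = 8) -> str:
        if self.created_here:
            # The group started empty and only this object names entries,
            # so a counter cannot collide and no lookup is needed.
            name = format(self._counter, "x")
            self._counter += 1
            return name
        # Pre-existing group: fall back to random names with a
        # collision check against the current contents.
        while True:
            name = "".join(random.choices(string.ascii_lowercase, k=length))
            if name not in self.group:
                return name
```

The fallback loop is what the counter avoids: each random candidate costs a membership check against the Group, whereas the counter path does no lookups at all.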