clbarnes / jeiss-convert

Convert Jeiss .dat files to HDF5
MIT License

store dat header and footer blocks as datasets instead of attributes #9

Open trautmane opened 1 year ago

trautmane commented 1 year ago

I get the following error when trying to convert a "padded" v9 dat:

  File "/Users/trautmane/opt/miniconda3/envs/janelia_emrp/lib/python3.9/site-packages/jeiss_convert/hdf5.py", line 66, in dat_to_hdf5
    g.attrs.update(meta)
  File "/Users/trautmane/opt/miniconda3/envs/janelia_emrp/lib/python3.9/_collections_abc.py", line 941, in update
    self[key] = other[key]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/Users/trautmane/opt/miniconda3/envs/janelia_emrp/lib/python3.9/site-packages/h5py/_hl/attrs.py", line 103, in __setitem__
    self.create(name, data=value)
  File "/Users/trautmane/opt/miniconda3/envs/janelia_emrp/lib/python3.9/site-packages/h5py/_hl/attrs.py", line 196, in create
    attr = h5a.create(self._id, self._e(tempname), htype, space)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5a.pyx", line 50, in h5py.h5a.create
RuntimeError: Unable to create attribute (object header message is too large)

I believe the error occurs because jeiss_convert tries to store the footer as an HDF5 attribute, and there is a 64K limit on attribute values. In my prototype HDF5 code, I stored both the footer and header as datasets instead of attributes to work around this issue. I suggest that jeiss_convert do the same.
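As a rough illustration of the workaround suggested above, the following sketch stores a raw byte block as a 1-D uint8 dataset and reads it back. It is not the jeiss_convert implementation: the helper names `store_block` and `load_block`, the group name `dat`, the dataset name `footer`, and the 100,000-byte placeholder block are all assumptions made for the example. The file is kept in memory so nothing is written to disk.

```python
import h5py
import numpy as np


def store_block(group, name, raw):
    """Store raw bytes as a 1-D uint8 dataset (hypothetical helper)."""
    arr = np.frombuffer(raw, dtype=np.uint8)
    return group.create_dataset(name, data=arr)


def load_block(group, name):
    """Read a dataset written by store_block back into bytes."""
    return group[name][()].tobytes()


# Placeholder footer, larger than 64K, standing in for a real padded footer.
footer = bytes(100_000)

# driver="core" with backing_store=False keeps the file in memory only.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    g = f.create_group("dat")  # hypothetical group name
    store_block(g, "footer", footer)
    restored = load_block(g, "footer")
```

Datasets are not subject to the object header size limit that attributes hit, so the size of the block no longer matters; the cost is that the header and footer now appear next to the image data when iterating over the group.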

The "padded" dat files are a special case we discovered at Janelia last year when we happened to get a dataset with tiles that were 8250 pixels in height. In an August 17, 2022 email, Shan explained:

The YResolution is calculated based on the Y dimension (in µm) and pixel size (in nm). However, it is rounded to multiples of 4 during acquisition due to some issues to synchronize the high speed NI cards. The dat to tif script will exclude those additional lines.

If the padded lines get included in the footer block (which makes sense to me), the footer is too big to store as an HDF5 attribute.

To help with testing, I have uploaded the following two v9 dat files (from different datasets) to HHMI's OneDrive:

Merlin-6284_22-07-15_000050_0-0-0.dat : 96MB, 5000 x 5000, height divisible by 4, works with jeiss_convert

Merlin-6262_22-06-15_155134_0-0-0.dat : 288MB, 9125 x 8250, height NOT divisible by 4, breaks jeiss_convert

I'm not sure if the links are permanent or if they expire at some point - so you may want to download the files sooner rather than later. Let me know if you need me to generate new links.

I'm also guessing that the padding occurs in v8 as well, but I'm not sure about that.

clbarnes commented 1 year ago

Good point. I was tossing up whether to include the header/footer as attributes or datasets. I settled on attributes only because it made it marginally more convenient to iterate through channels; but if their names are prefixed with _ then they're easy enough to exclude, and this iteration behaviour would be broken by the CSV-derived additional_metadata anyway.

That padding is not something I'd come across, and good to know about. It should be added to the jeiss-specs README. I'm not entirely clear on where the rounding comes in - do the YResolution and XResolution attributes still correctly describe the true image size, with no padding visible if you read ChannelNum * YResolution * XResolution bytes? But the length of the reserved memory block between the end of the header and the start of the recipe is ChannelNum * round_up_to_multiple_of_4(YResolution) * round_up_to_multiple_of_4(XResolution)?

If we do include the padding in the footer, it would also be nice to include the offset into that footer at which the recipe starts (which presumably can be calculated with ChannelNum, YResolution, XResolution, and FileLength), just in case anyone has a need/method for decoding it.
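The arithmetic under discussion can be sketched as follows, under several assumptions that the thread does not confirm: only the Y dimension is rounded up (as in Shan's email; whether X is also rounded is the open question above), the padded lines follow the true image data, and the number of bytes per sample is supplied by the caller. The function names are hypothetical, and the channel count and sample width in the usage lines are placeholders rather than values read from the example files.

```python
def round_up_to_multiple(n, multiple=4):
    """Round n up to the nearest multiple (ceiling division, integers only)."""
    return -(-n // multiple) * multiple


def padding_bytes(channel_num, y_res, x_res, bytes_per_sample, multiple=4):
    """Bytes of padding after the true image data, assuming only Y is padded.

    If the footer is taken to start where the true image data ends, this is
    also the offset into the footer at which the recipe would start.
    """
    true_size = channel_num * y_res * x_res * bytes_per_sample
    padded_y = round_up_to_multiple(y_res, multiple)
    padded_size = channel_num * padded_y * x_res * bytes_per_sample
    return padded_size - true_size


# Placeholder values: 2 channels and 2 bytes per sample are assumptions.
print(round_up_to_multiple(8250))          # 8252
print(padding_bytes(2, 8250, 9125, 2))     # 73000
print(padding_bytes(2, 5000, 5000, 2))     # 0
```

Under these placeholder values, the two extra lines of the 9125 x 8250 tile would add 73000 bytes, which is over the 64K attribute limit, while the 5000 x 5000 tile would have no padding at all.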

clbarnes commented 1 year ago

Fixed in https://github.com/clbarnes/jeiss-convert/commit/6f912da2f62bfa3d3c364052d381cd8bc1fc05af