`recorder convert` throws `IndexError` for a specific (incomplete?) dataset

talonchandler commented 1 year ago

@edyoshikun and @dsundarraman reported that:

```
recorder convert --input /hpc/projects/comp_micro/rawdata/falcon/zebrafish/2023_02_09_casper_Mpeg_GFP_wounded_unwounded/swing0p08_tails_3 --output ~/sandbox/test-output.zarr
```

fails with this trace:

```
recOrder: Computational Toolkit for Label-Free Imaging

Initializing Data...
Finished initializing data
Found Dataset test4.zarr w/ dimensions (P, T, C, Z, Y, X): (13, 30, 6, 111, 2048, 2048)
Creating new zarr store at /home/talon.chandler/sandbox/test4.zarr
Running Conversion...
Setting up zarr
Converting Images...
Status: |                                                                                         |0/259740 (Time Remaining: ?), ?it/s]
Traceback (most recent call last):
  File "/home/talon.chandler/.conda/envs/recorder-test/bin/recorder", line 8, in <module>
    sys.exit(cli())
  File "/home/talon.chandler/.conda/envs/recorder-test/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/talon.chandler/.conda/envs/recorder-test/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/talon.chandler/.conda/envs/recorder-test/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/talon.chandler/.conda/envs/recorder-test/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/talon.chandler/.conda/envs/recorder-test/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/talon.chandler/recOrder/recOrder/scripts/cli.py", line 146, in convert
    converter.run_conversion()
  File "/home/talon.chandler/recOrder/recOrder/io/zarr_converter.py", line 461, in run_conversion
    plane_meta = self._generate_plane_metadata(tf, page)
  File "/home/talon.chandler/recOrder/recOrder/io/zarr_converter.py", line 257, in _generate_plane_metadata
    for tag in tiff_file.pages[page].tags.values():
  File "/home/talon.chandler/.conda/envs/recorder-test/lib/python3.9/site-packages/tifffile/tifffile.py", line 6442, in __getitem__
    return getitem(key)
  File "/home/talon.chandler/.conda/envs/recorder-test/lib/python3.9/site-packages/tifffile/tifffile.py", line 6414, in _getitem
    self._seek(key)
  File "/home/talon.chandler/.conda/envs/recorder-test/lib/python3.9/site-packages/tifffile/tifffile.py", line 6317, in _seek
    raise IndexError('index out of range')
IndexError: index out of range
```

@edyoshikun you said that you've done some initial debugging on this issue? And that the acquisition may be incomplete (crash before completion?)?

@ziw-liu I'm tagging you in case this issue affects you and iohub.

edyoshikun commented 1 year ago

From the brief debugging I did, the bug points to `page = self.reader.reader.coord_map[coord_reorder][1]` at line 458 of `zarr_converter.py`. It starts the page count at 511 and then processes the pages sequentially (i.e., 1, 2, 3).

edyoshikun commented 1 year ago

One thing to mention is that the acquisition for this dataset failed to finish.

ziw-liu commented 1 year ago

This is caused by the multi-page reader in `waveorder.io` building its index map from Micro-Manager (MM) metadata. It expects every page recorded in the MM header field (generated at the start of the acquisition) to be present, but since the acquisition did not finish, the index map is larger than the actual dataset.

I'll try to fix this in iohub.
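To illustrate the mismatch described above, here is a minimal sketch (not the actual `waveorder` or `iohub` code) that splits a planned index map into entries that exist on disk and entries that point past the last written page. The function name `split_index_map`, the `(P, T, C, Z)` key layout, and the `(file name, page index)` value layout are assumptions for illustration; the only detail taken from this thread is that the page index sits at position `[1]` of each map entry. The toy page counts are hard-coded here; with `tifffile`, a real count could presumably be taken from `len(tif.pages)` on an open `TiffFile`.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int, int, int]  # hypothetical (P, T, C, Z) key
Entry = Tuple[str, int]            # hypothetical (file name, page index) value


def split_index_map(
    coord_map: Dict[Coord, Entry],
    pages_on_disk: Dict[str, int],
) -> Tuple[Dict[Coord, Entry], List[Coord]]:
    """Separate index-map entries that exist on disk from those that do not.

    `pages_on_disk` maps each file name to the number of pages actually
    written to that file. Files absent from it are treated as having 0 pages.
    """
    present: Dict[Coord, Entry] = {}
    missing: List[Coord] = []
    for coord, (file_name, page) in coord_map.items():
        if 0 <= page < pages_on_disk.get(file_name, 0):
            present[coord] = (file_name, page)
        else:
            # The metadata planned this plane, but it was never written.
            missing.append(coord)
    return present, missing


# Toy example: the header planned 4 planes, but only 3 pages were written.
planned = {
    (0, 0, 0, 0): ("a.ome.tif", 0),
    (0, 0, 0, 1): ("a.ome.tif", 1),
    (0, 0, 0, 2): ("a.ome.tif", 2),
    (0, 0, 0, 3): ("a.ome.tif", 3),
}
present, missing = split_index_map(planned, {"a.ome.tif": 3})
print(len(present), missing)
```

A converter could use the `missing` list to skip or zero-fill unwritten planes instead of raising `IndexError` partway through the conversion; which behavior is preferable is a design decision this sketch does not settle.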

ziw-liu commented 1 year ago

The bug appears to be deeper in the stack: the OME-TIFF reader was constructing index maps using a magic-number offset that is inconsistent with how MM actually writes the data.

Edit: Both page indexing and byte offsets have some problems, and the latter seems to originate from the former.

talonchandler commented 1 year ago

Ready to close when we release iohub.

mehta-lab / recOrder

`recorder convert` throws `IndexError` for a specific (incomplete?) dataset #320