EricThomson / tiffit

Do simple things with multi-page tiff files

dimensionality error #2

Open vkonan opened 1 month ago

vkonan commented 1 month ago

Hi Eric,

I tried using this to manage a tiff file that was not being read properly into an mmap by mesmerize-core. However, it didn't seem to fix the issue.

The original apparent problem was that the tiffs I generated seemed to contain an extra time dimension (see image), even though this file is NOT a volumetric tiff. Here is the tiff file for your reference.

I'm not sure why this means that the tiff (which is produced by ScanImage) is not memmapable. As a result, mesmerize-core throws this error:

/opt/miniconda3/envs/mescore/lib/python3.11/site-packages/ipydatagrid/datagrid.py:512: UserWarning: Index name of 'index' is not round-trippable.
  schema = pd.io.json.build_table_schema(dataframe)
/opt/miniconda3/envs/mescore/lib/python3.11/site-packages/mesmerize_core/movie_readers.py:22: FutureWarning: You are trying to use the following experimental feature, this may change in the future without warning:
tiff_lazyarray
This feature is new and might change in the future

  movie = tiff_lazyarray(path, **kwargs)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[38], line 9
      6 ...
      8 # viz = df.mcorr.viz(input_movie_kwargs={"reader": tifffile.imread})
----> 9 viz = df.mcorr.viz()
     11 # if you want to get the input movie
     12 # viz = df.iloc[0].caiman.get_input_movie(reader=tifffile.imread)
     13 viz.show(sidecar=True)

File /opt/miniconda3/envs/mescore/lib/python3.11/site-packages/mesmerize_viz/_mcorr.py:446, in MCorrDataFrameVizExtension.viz(self, data_options, start_index, reset_timepoint_on_change, input_movie_kwargs, image_widget_kwargs, data_grid_kwargs)
    396 def viz(
    397         self,
    398         data_options: List[str] = None,
   (...)
    403         data_grid_kwargs: dict = None,
    404 ):
    405     """
    406     Visualize motion correction output.
    407 
   (...)
    443         widget that contains the DataGrid, params text box and ImageWidget
    444     """
--> 446     container = McorrVizContainer(
    447         dataframe=self._dataframe,
    448         data_options=data_options,
    449         start_index=start_index,
    450         reset_timepoint_on_change=reset_timepoint_on_change,
    451         input_movie_kwargs=input_movie_kwargs,
    452         image_widget_kwargs=image_widget_kwargs,
    453         data_grid_kwargs=data_grid_kwargs
    454     )
    456     return container

File /opt/miniconda3/envs/mescore/lib/python3.11/site-packages/mesmerize_viz/_mcorr.py:218, in McorrVizContainer.__init__(self, dataframe, data_options, start_index, reset_timepoint_on_change, input_movie_kwargs, image_widget_kwargs, data_grid_kwargs)
    215 # set the initial widget state with the start index
    216 data_arrays = self._get_row_data(index=start_index)
--> 218 self._image_widget = ImageWidget(
    219     data=data_arrays,
    220     names=self._data_options,
    221     **self.image_widget_kwargs
    222 )
    224 # mean window slider
    225 self._slider_mean_window = IntSlider(
    226     min=1,
    227     step=2,
   (...)
    231     description_tooltip="set a mean rolling window"
    232 )

File /opt/miniconda3/envs/mescore/lib/python3.11/site-packages/fastplotlib/widgets/image.py:333, in ImageWidget.__init__(self, data, dims_order, slider_dims, window_funcs, frame_apply, grid_shape, names, grid_plot_kwargs, histogram_widget, **kwargs)
    330 # verify that all image arrays have same number of dimensions
    331 # sliders get messy otherwise
    332 if not len(set(_ndim)) == 1:
--> 333     raise ValueError(
    334         f"Number of dimensions of all data arrays must match, your ndims are: {_ndim}"
    335     )
    337 self._data: List[np.ndarray] = data
    338 self._ndim = self.data[0].ndim  # all ndim must be same

ValueError: Number of dimensions of all data arrays must match, your ndims are: [4, 3, 3, 3]

EricThomson commented 1 month ago

Hey good to hear from you @vkonan

Hmm, this is strange. Is mesmerize adding an extra dimension during a processing step, or does ScanImage have two channels interleaved and mesmerize is doing its best to figure out what to do with them?

I built tiffit basically to handle ImageJ weirdness, but I would definitely want it to work with ScanImage outputs. That said, if the problem is interleaved dual channels, then I would first de-interleave as a preprocessing step and then handle the result via tiffit or whatever (if that were still needed).
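As a rough illustration of the de-interleaving step mentioned above, here is a minimal sketch. It assumes the frames are stored channel-fastest (channel 0, channel 1, channel 0, ...), which may not match how a given ScanImage file is laid out, and the function name `deinterleave` is made up for this example.

```python
def deinterleave(frames, n_channels=2):
    """Split an interleaved frame sequence into one sequence per channel.

    Assumes channel-fastest ordering: c0, c1, c0, c1, ...
    Works on lists, and on NumPy arrays shaped (n_frames, rows, cols),
    because it only slices along the first axis.
    """
    if len(frames) % n_channels != 0:
        raise ValueError(
            f"{len(frames)} frames cannot be split evenly into {n_channels} channels"
        )
    return [frames[ch::n_channels] for ch in range(n_channels)]


# Stand-in labels instead of real images, just to show the ordering.
interleaved = ["ch0_t0", "ch1_t0", "ch0_t1", "ch1_t1", "ch0_t2", "ch1_t2"]
channel_0, channel_1 = deinterleave(interleaved)
```

Each returned channel would then be saved as its own tiff before running it through tiffit or mesmerize.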

But I'm not sure what is going on, I've never used scanimage 😬

EricThomson commented 1 month ago

I'm thinking it might be useful to add a diagnostic() function to print out all the metadata from a given tiff in a stack, rather than just the selected bits that I decided were most useful. It would be a lot of information, but sometimes when things are fubar, this is what you want. I'm pretty busy right now but I could get to it in the next couple of weeks.

This would be really useful for debugging annoying tiff files. 😄 It comes up every year or so.

So it would be something like tiffit diagnose filename.tif output.txt, so that it saves to a txt file. Anyway, I'm open to ideas, @kushalkolar. (Note: I don't want to make anything too complicated. I want to keep it in the spirit of tiffit, which is lightweight command-line tooling.)
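To make the idea concrete, here is one possible sketch of such a dump. It is not tiffit's implementation: the names `iter_pages` and `diagnose` are hypothetical, and it uses only the standard library to walk the file's IFD chain (one IFD per page) as laid out in the TIFF 6.0 and BigTIFF formats. It decodes only the most common field types, does not follow SubIFDs, and never touches pixel data.

```python
import struct

# A few well-known tag codes from the TIFF 6.0 spec; other tags print by number.
TAG_NAMES = {
    256: "ImageWidth", 257: "ImageLength", 258: "BitsPerSample",
    259: "Compression", 262: "PhotometricInterpretation",
    270: "ImageDescription", 277: "SamplesPerPixel",
    278: "RowsPerStrip", 339: "SampleFormat",
}
# TIFF field type -> (struct code, item size in bytes). Only these are decoded.
FIELD_TYPES = {1: ("B", 1), 2: ("s", 1), 3: ("H", 2), 4: ("I", 4), 16: ("Q", 8)}


def _decode(f, order, off_fmt, ftype, count, raw):
    """Decode one tag value, following the offset when it does not fit inline."""
    if ftype not in FIELD_TYPES:
        return f"<undecoded field type {ftype}, count {count}>"
    code, size = FIELD_TYPES[ftype]
    nbytes = size * count
    if nbytes > len(raw):  # too big for the entry: raw holds a file offset
        (where,) = struct.unpack(order + off_fmt, raw)
        f.seek(where)
        raw = f.read(nbytes)
    if ftype == 2:  # ASCII, NUL-terminated
        return raw[:nbytes].rstrip(b"\0").decode("latin-1")
    values = struct.unpack(f"{order}{count}{code}", raw[:nbytes])
    return values[0] if count == 1 else values


def iter_pages(path):
    """Yield a {tag_code: value} dict per page (IFD); pixel data is never read."""
    with open(path, "rb") as f:
        header = f.read(16)
        order = {b"II": "<", b"MM": ">"}.get(header[:2])
        if order is None:
            raise ValueError("not a TIFF file: bad byte-order mark")
        (magic,) = struct.unpack(order + "H", header[2:4])
        if magic == 42:  # classic TIFF: 32-bit offsets, 12-byte entries
            count_fmt, entry_fmt, off_fmt = "H", "HHI4s", "I"
            (offset,) = struct.unpack(order + "I", header[4:8])
        elif magic == 43:  # BigTIFF: 64-bit offsets, 20-byte entries
            count_fmt, entry_fmt, off_fmt = "Q", "HHQ8s", "Q"
            (offset,) = struct.unpack(order + "Q", header[8:16])
        else:
            raise ValueError(f"not a TIFF file: magic number {magic}")
        entry_size = struct.calcsize(order + entry_fmt)
        count_size = struct.calcsize(order + count_fmt)
        off_size = struct.calcsize(order + off_fmt)
        seen = set()  # guards against a corrupt file whose IFD chain loops
        while offset and offset not in seen:
            seen.add(offset)
            f.seek(offset)
            (n,) = struct.unpack(order + count_fmt, f.read(count_size))
            block = f.read(n * entry_size + off_size)
            tags = {}
            for i in range(n):
                code, ftype, count, raw = struct.unpack_from(
                    order + entry_fmt, block, i * entry_size
                )
                tags[code] = _decode(f, order, off_fmt, ftype, count, raw)
            (offset,) = struct.unpack_from(order + off_fmt, block, n * entry_size)
            yield tags


def diagnose(path, out_path):
    """Write every decoded tag of every page to a text file; return the page count."""
    n_pages = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for n_pages, tags in enumerate(iter_pages(path), start=1):
            out.write(f"page {n_pages - 1}\n")
            for code, value in sorted(tags.items()):
                out.write(f"  {TAG_NAMES.get(code, code)}: {value}\n")
    return n_pages
```

Calling diagnose("filename.tif", "output.txt") would write one block of tags per page. A real tool would more likely lean on an existing TIFF library for full tag decoding; the point here is only that the metadata can be listed page by page without loading the images.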

vkonan commented 1 month ago

Thank you for the input! Yes, the diagnostic() function would be really helpful since I have no idea what the guts of a tiff look like.

To answer your question about the file shape: ScanImage puts out an interleaved file, but the file I fed to mesmerize is a deinterleaved .tiff. I read this file in MATLAB using tiffreadVolume and it shows up as a 256 x 256 x 700 array.

So it might be that the deinterleaving step leaves some sticky metadata untouched and screws up how mesmerize reads in the tiff. Only the diagnostic() function might unravel this mystery.

Looking forward to your update (whenever it comes). Thank you so much for making this useful tool!

kushalkolar commented 1 month ago

I'm confused about what you tried to do. All that you need to do is convert your original tiff file to a new tiff file with tiffit and use this new file in mesmerize.

EricThomson commented 1 month ago

Ah I assumed that had been done. Yes, try that first. 😄

vkonan commented 1 month ago

@kushalkolar Yep, that's exactly what I did. I ran the tiff file through tiffit, then fed it to mesmerize. However, the conversion is not resolving the issue. I don't even know what the issue is. The only thing I can tell is "off" is the extra dimension, which shouldn't be there, as reading the tiff in MATLAB shows.

Just some more info...

I wanted to see if I could find some metadata on this file in MATLAB. This is the original tiff file:

t = 

                  TIFF File: '/Users/vkonanur/Library/CloudStorage/Dropbox/01_POSTDOC/DATA/CaImAn/ZAKdata/20240924/sess_1/stitched_channel_1.tiff'
                       Mode: 'r'
    Current Image Directory: 1
           Number Of Strips: 4
                SubFileType: Tiff.SubFileType.Default
                Photometric: Tiff.Photometric.MinIsBlack
                ImageLength: 256
                 ImageWidth: 256
               RowsPerStrip: 64
              BitsPerSample: 32
                Compression: Tiff.Compression.AdobeDeflate
               SampleFormat: Tiff.SampleFormat.Int
            SamplesPerPixel: 1
        PlanarConfiguration: Tiff.PlanarConfiguration.Chunky
                Orientation: Tiff.Orientation.TopLeft

Here is the tiffit'ed file:

t = 

                  TIFF File: '/Users/vkonanur/Library/CloudStorage/Dropbox/01_POSTDOC/DATA/CaImAn/ZAKdata/20240924/sess_1/stitched_channel_1_tiffed.tiff'
                       Mode: 'r'
    Current Image Directory: 1
           Number Of Strips: 1
                SubFileType: Tiff.SubFileType.Default
                Photometric: Tiff.Photometric.MinIsBlack
                ImageLength: 256
                 ImageWidth: 256
               RowsPerStrip: 256
              BitsPerSample: 32
                Compression: Tiff.Compression.None
               SampleFormat: Tiff.SampleFormat.Int
            SamplesPerPixel: 1
        PlanarConfiguration: Tiff.PlanarConfiguration.Chunky
           ImageDescription: {"shape": [700, 256, 256]}
                Orientation: Tiff.Orientation.TopLeft

EricThomson commented 1 month ago

Ok, I assumed as much. It is really bizarre: pre-tiffit, ImageJ doesn't even know what to do with that thing. Post-tiffit, it's like "Ok, this is a movie and I can view it," and it seems ok. I'm very curious why it is giving problems.

vkonan commented 1 month ago

I wanted to check my sanity after kushal's comment... I ran through the py script again and I stupidly made the mistake of reading the wrong iloc from the dataframe. The post-tiffit file does in fact work and is memmapped by mesmerize. I'm so sorry for wasting your time!

at least you have another case where tiffit works! 😄

EricThomson commented 1 month ago

Ok, that's good to know. I was getting confused. But this is good motivation to add a more detailed diagnostic function that just gives all the metadata, for all pages in the tiff file. I'll work on that when I get time. It's typically just too much information, but I can provide it, and it will be useful for debugging purposes.

EricThomson commented 1 month ago

I'll leave this open and close it when the PR is made. This will serve as motivation. 🚀

kushalkolar commented 1 month ago

@EricThomson If you can figure out how to get the dims of a tiff file a priori without reading it all into RAM, it would be very useful! I tried this, and that's what the mescore LazyTiff reader tries to do when the tiff file is not memmapable (why can't microscope companies just output memmapable tiffs or other sensible formats 😩 ... anyways).

https://github.com/nel-lab/mesmerize-core/blob/5d7c9b3ebe121eb641addccb6acdc8119d14a1d8/mesmerize_core/arrays/_tiff.py#L26-L40

Parsing the levels, series, and pages was not trivial; tiff just gives too much flexibility to store the data in all sorts of ways. But if someone could figure it out, it would be useful to have a lazy reader for non-memmapable files 😄
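One cheap partial answer is to trust what the writer recorded in the first page's ImageDescription tag. The sketch below is an illustration under assumptions, not what mescore or tiffit does: `dims_from_description` is a hypothetical helper that takes the description string as input (getting that string out of the file is a separate step). It recognizes two layouts: the JSON form seen on the tiffit'ed file earlier in this thread ({"shape": [700, 256, 256]}) and ImageJ's key=value lines. It does not check the metadata against the actual pages, so stale metadata left behind by a preprocessing step would go unnoticed.

```python
import json


def dims_from_description(description):
    """Return axis sizes parsed from a first-page ImageDescription string.

    Handles two layouts:
      * JSON with a "shape" key, e.g. {"shape": [700, 256, 256]}
      * ImageJ-style key=value lines, e.g. images=1400 / channels=2 / frames=700
    Returns None when neither layout is recognized.
    """
    text = description.strip()
    if text.startswith("{"):
        try:
            shape = json.loads(text).get("shape")
        except json.JSONDecodeError:
            return None
        return {"shape": tuple(shape)} if shape else None
    if text.startswith("ImageJ="):
        pairs = dict(line.split("=", 1) for line in text.splitlines() if "=" in line)
        keys = ("images", "channels", "slices", "frames")
        return {key: int(pairs[key]) for key in keys if key in pairs}
    return None
```

For the ImageJ layout this also gives the number of images per dimension (channels, slices, frames), when the writer recorded them.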

EricThomson commented 1 month ago

@kushalkolar tiffit info filename returns the number of images, the dimensions of the first frame (rows x cols), byte depth (e.g., uint16), whether it's in ImageJ format (bool), and whether it's in BigTIFF format (bool), all without loading the data into memory.

Those were the most useful things to reveal, or so I thought when I first made it, as most people minimally want to know how big things are. Plus, once tiffit convert has been run, ImageJ and BigTIFF should be False and True, respectively.

kushalkolar commented 1 month ago

What about the number of images per dimension? That might have been the issue I ran into, but I can't remember; I gave up on it years ago.