[Bug]: Incorrect bruker metadata parsing for single plane imaging

catalystneuro / roiextractors

Python-based module for extracting from, converting between, and handling optical imaging data from several file formats. Inspired by SpikeInterface.

https://roiextractors.readthedocs.io/en/latest/index.html

BSD 3-Clause "New" or "Revised" License

11 stars 7 forks source link

[Bug]: Incorrect bruker metadata parsing for single plane imaging #341

Open mhturner opened 2 months ago

mhturner commented 2 months ago

What happened?

Using the Bruker->NWB converter in neuroconv (as here), I noticed that my single-plane Bruker imaging data was being flagged as multiplane.

Tracked this down to brukertiffimagingextractor._determine_imaging_is_volumetric(), which assumes that the metadata flag 'zDevice' == 1 means this is a volumetric scan, which is a sensible thing to assume. However my single plane image data as well as my multiplane image data both have this value as 1.

In my experience, a reliable way to tell what the scan type is from the .xml metadata is the Sequence type. This can be things like: 'TSeries Timed Element' - XYT image data 'TSeries ZSeries Element' - XYZT image data 'ZSeries' - XYZ data (not a time series) 'Single' - XY image

and probably other things too that I don't acquire.

Steps to Reproduce

from roiextractors.extractors.tiffimagingextractors.brukertiffimagingextractor import _determine_imaging_is_volumetric

print(_determine_imaging_is_volumetric(bruker_file_dir))

Traceback

True

Operating System

macOS

Python Executable

Conda

Python Version

3.10

Package Versions

No response

Code of Conduct

Yes

Duplicated Issue Check

CodyCBakerPhD commented 2 months ago

Thanks for finding this and including all the details!

attn @weiglszonja @alessandratrapani @pauladkisson

weiglszonja commented 2 months ago

Thank you for opening this issue @mhturner! I'm thinking that since it is not obvious to decide whether it is a volumetric type, how about we just don't assume and separate the classes to single and multiplane, and then the user decides what is the more appropriate extractor/interface for the data. I think currently there is a routing in the converter that returns the multiplane if _determine_imaging_is_volumetric is True. I'm thinking a refactor similar to ScanImage. What do you guys think? @pauladkisson @alessandratrapani @CodyCBakerPhD

In the meantime, @mhturner can you try using neuroconv.datainterfaces.BrukerTiffSinglePlaneImagingInterface to see if it works for your data? Let me know if you need help with it.

h-mayorquin commented 2 months ago

Thanks for submiting this, this is indeed a bug. This is my fault btw, it is me who introduced this assumption in a recent refactor I think.

@mhturner would you be so kind to share the corresponding xml of your data?

The volumetric data that I have seen has a specification that looks like this:

      <PVStateShard>
        <PVStateValue key="framePeriod" value="0.048474237" />
        <PVStateValue key="laserPower">
          <IndexedValue index="0" value="650.390625" description="920 Axon" />
          <IndexedValue index="1" value="0" description="1064 Axon" />
          <IndexedValue index="2" value="0" description="Monaco" />
        </PVStateValue>
        <PVStateValue key="positionCurrent">
          <SubindexedValues index="XAxis">
            <SubindexedValue subindex="0" value="14.927" />
          </SubindexedValues>
          <SubindexedValues index="YAxis">
            <SubindexedValue subindex="0" value="56.215" />
          </SubindexedValues>
          <SubindexedValues index="ZAxis">
            <SubindexedValue subindex="0" value="166.575" description="Z Focus" />
            <SubindexedValue subindex="1" value="130" description="Optotune ETL 10-30" />
          </SubindexedValues>

The volumetric example that we have in the gin data is also like this.

If @mhturner were to confirm that this type of specification is present on its xml file then we could use this to change the function for volumetric data instead.

mhturner commented 2 months ago

Here is a corresponding part of a metadata file. This is from an XYT scan. Let me know if you would like me to send you this file (can't attach .xml here).

h-mayorquin commented 2 months ago

Yes, can you send me the file? one option is my email, it is just my user name here in github but with a dot instead of an underscore. Gmail.

h-mayorquin commented 2 months ago

Thanks a lot @mhturner. I double checked my files and while the xml do have the information that I listed here I think your solution has the virtue of being way way simpler. So as long as we don't have the documentation of Praire view's xmls I think we will have to run with this heuristics and let's start with yours.

I am slightly against the idea of refactoring this to put this on the user side. I think that we should automize as much as we can. On the other hand, I think that we if we keep getting this type of errors then @weiglszonja solution should become more favorly. I suggest the following.

We refactor the code so the classes are clearly separated as you guys propose.
We keep the current extractor as a router that tries to guess with some heuristics.
If the error arrises then we can tell the user to use an explicit extractor/interface.
If we keep getting too many errors, then we get rid of the router class.

h-mayorquin commented 2 months ago

Hey @mhturner, any chance you have an xml corresponding to a a TSeries ZSeries Element with two channels?

mhturner commented 2 months ago

Sure - here is a dropbox link with such a file: https://www.dropbox.com/scl/fi/ka2j71jakd9xc0vje4ys1/TSeries-20231114-001.xml?rlkey=rm7885ktloo9qiri7l4nc6rr3&dl=0

h-mayorquin commented 2 months ago

Thanks a bunch for the second file @mhturner . I implemented your suggestion and now is on main. I will keep this open until you confirm that the fix works for you : )