catalystneuro / roiextractors

Python-based module for extracting from, converting between, and handling optical imaging data from several file formats. Inspired by SpikeInterface.
https://roiextractors.readthedocs.io/en/latest/index.html
BSD 3-Clause "New" or "Revised" License

Discuss Miniscope input folder expected structure #356

Open h-mayorquin opened 1 month ago

h-mayorquin commented 1 month ago

Currently in the gin example file we have this structure:

    C6-J588_Disc5/ (main folder)
    ├── 15_03_28/ (subfolder corresponding to the recording time)
    │   ├── Miniscope/ (subfolder containing the microscope video stream)
    │   │   ├── 0.avi (microscope video)
    │   │   ├── metaData.json (metadata for the microscope device)
    │   │   └── timeStamps.csv (timing of this video stream)
    │   ├── BehavCam_2/ (subfolder containing the behavioral video stream)
    │   │   ├── 0.avi (behavioral video)
    │   │   ├── metaData.json (metadata for the behavioral camera)
    │   │   └── timeStamps.csv (timing of this video stream)
    │   └── metaData.json (metadata for the recording, such as the start time)
    ├── 15_06_28/
    │   ├── Miniscope/
    │   ├── BehavCam_2/
    │   └── metaData.json
    └── 15_12_28/

And in the test we pass to the extractor the folder that is at the top level:

https://github.com/catalystneuro/roiextractors/blob/11c759c4d550d2b177c8b5655202ef8532016d99/tests/test_miniscopeimagingextractor.py#L18-L19
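
For illustration, here is a minimal sketch of how the `Miniscope` subfolders in the layout above could be discovered from the top-level folder. The helper name `find_miniscope_folders` is hypothetical and is not the code that roiextractors uses; it only assumes the `<time>/Miniscope` nesting shown in the tree.

```python
from pathlib import Path


def find_miniscope_folders(parent_folder):
    """Return every `<time>/Miniscope` directory directly under parent_folder.

    Hypothetical helper: it assumes the layout from the gin example, where each
    time-named subfolder may contain a `Miniscope` directory.
    """
    parent_folder = Path(parent_folder)
    # Sorting the paths orders them by the time-named subfolder.
    return sorted(path for path in parent_folder.glob("*/Miniscope") if path.is_dir())
```

Subfolders without a `Miniscope` directory (such as an empty `15_12_28/`) are simply skipped.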

However, a parent folder that contains multiple (and only) Miniscope datasets is not what I have in my current working project. There I have:

    ├── 10_11_24
    │   ├── metaData.json
    │   ├── Miniscope
    │   │   ├── 0.avi
    │   │   ├── 1.avi
    │   │   ├── 2.avi
    │   │   ├── 3.avi
    │   │   ├── 4.avi
    │   │   ├── 5.avi
    │   │   ├── 6.avi
    │   │   ├── 7.avi
    │   │   ├── 8.avi
    │   │   ├── headOrientation.csv
    │   │   ├── metaData.json
    │   │   ├── minian.mp4
    │   │   └── timeStamps.csv
    │   └── notes.csv
    ├── Ca_EEG2-1_FC_FreezingOutput.csv
    ├── Ca_EEG2-1_FC.raw
    ├── Ca_EEG2-1_FC.txt
    └── Ca_EEG2-1_FC.wmv

It appears to me that the natural thing to pass to the extractor is the folder where the videos are. Why are we passing the parent folder? I looked for documentation of where this output structure comes from, without luck, here:

https://github.com/Aharoni-Lab/Miniscope-DAQ-QT-Software
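
As a rough sketch of what "passing the folder where the videos are" could look like, the numbered `.avi` files can be collected in numeric order directly from the `Miniscope` folder. The function name `list_video_files` is hypothetical, and the filter on integer file stems is an assumption based only on the listing above (`0.avi` to `8.avi`).

```python
from pathlib import Path


def list_video_files(miniscope_folder):
    """Return the numbered .avi files in miniscope_folder, ordered by integer stem.

    Hypothetical helper: files whose stem is not an integer are ignored, and
    non-.avi files such as `minian.mp4` never match the glob pattern.
    """
    miniscope_folder = Path(miniscope_folder)
    numbered = [path for path in miniscope_folder.glob("*.avi") if path.stem.isdigit()]
    # Sort numerically so that 10.avi comes after 9.avi rather than after 1.avi.
    return sorted(numbered, key=lambda path: int(path.stem))
```

Sorting by the integer stem matters once a recording has ten or more files, since a plain lexicographic sort would put `10.avi` before `2.avi`.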

@weiglszonja @pauladkisson do you know? What do you think about this?

weiglszonja commented 1 month ago

The reason for passing the parent folder was that the example we have on gin consists of multiple subfolders with avi files that belong to the same recording session.

On the other hand, if we provided one of the subfolders like Path(OPHYS_DATA_PATH / "imaging_datasets" / "Miniscope" / "C6-J588_Disc5" / "15_03_28" ) I would expect the extractor to still work, and would consider it a bug if it doesn't.

@h-mayorquin are you suggesting this is the case with your example?

FYI @alessandratrapani, I believe you also have Miniscope data in your current conversion.

h-mayorquin commented 1 month ago

> On the other hand, if we provided one of the subfolders like Path(OPHYS_DATA_PATH / "imaging_datasets" / "Miniscope" / "C6-J588_Disc5" / "15_03_28" ) I would expect the extractor to still work, and would consider it a bug if it doesn't.

Yes, that does not work for the data of Cai.

> The reason for passing the parent folder was that the example we have on gin consists of multiple subfolders with avi files that belong to the same recording session.

Yeah, I disagree with this. I think that it would be better if the extractor was more specific. The user can add two interfaces, one for each collection of raw data, if they so desire. Otherwise, we don't have the option of leaving out some of the data.

pauladkisson commented 1 month ago

> I think that it would be better if the extractor was more specific.

In particular, these extractors should be scoped per-session, but it looks like this one takes a folder path with multiple sessions (dates), so it probably needs to be changed.

h-mayorquin commented 1 month ago

> In particular, these extractors should be scoped per-session

What do you mean? That it should load all the data corresponding to that session?

Interfaces in neuroconv should not work like that, and an extractor making that assumption would make things difficult.

Is that a decision that you guys made or are you talking about some specific knowledge about this format?

weiglszonja commented 1 month ago

> In particular, these extractors should be scoped per-session, but it looks like this one takes a folder path with multiple sessions (dates), so it probably needs to be changed.

@pauladkisson the example from the Tye lab that we have on gin is still a single session; it is just saved to multiple folders. Apologies if that wasn't clear.

pauladkisson commented 1 month ago

> the example from the Tye lab that we have on gin is still a single session; it is just saved to multiple folders. Apologies if that wasn't clear.

My bad, for some reason 15_03_28 registered in my mind as a date rather than a time. If all of those subfolders are part of the same session then my comment makes much less sense.
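
To make the date-versus-time ambiguity concrete, here is a small sketch that reads a folder name such as `15_03_28` as a time of day. The `HH_MM_SS` format is an assumption inferred from this thread ("subfolder corresponding to the recording time"), not from a published specification, and `parse_time_folder` is a hypothetical name.

```python
from datetime import datetime


def parse_time_folder(folder_name):
    """Interpret a folder name such as '15_03_28' as a time of day.

    Assumption: the name follows HH_MM_SS. A name that does not fit this
    pattern raises ValueError.
    """
    return datetime.strptime(folder_name, "%H_%M_%S").time()
```

The same string also parses as a `YY_MM_DD` date, which is probably why it is easy to misread. In the example metadata, the date seems to live one level higher in the path (`C:/mData/2021_10_07/C6-J588_Disc5/15_06_28/Miniscope`).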

h-mayorquin commented 1 month ago

OK, I managed to find the specification on the Miniscope page and, surprise, it does not correspond to either the structure of the gin data (coming from the Tye Lab) or that of my current project:

http://miniscope.org/index.php/Data_Acquisition_Software

So what gives? I will ask the experimenter in our next meeting whether he changed the file structure, to try to get to the bottom of this.

h-mayorquin commented 1 month ago

OK, there is also a versioning issue. I will push a fix.

h-mayorquin commented 3 weeks ago

I discussed this with @weiglszonja this morning. There are other problems with the current implementation:

Here is the Miniscope JSON metadata from the example data:

    {
        "compression": "FFV1",
        "deviceDirectory": "C:/mData/2021_10_07/C6-J588_Disc5/15_06_28/Miniscope",
        "deviceID": 2,
        "deviceName": "Miniscope",
        "deviceType": "Miniscope_V3",
        "frameRate": "15FPS",
        "framesPerFile": 1000,
        "gain": "High",
        "led0": 47
    }

The metadata reports the device type as Miniscope_V3, but the folder structure is that of version 4 according to the extension: https://github.com/catalystneuro/ndx-miniscope

So, Szonja and I agree that it is probably better to create a new extractor that takes the timestamp folder as input and is specific to v4. We can then think about how to integrate these two modalities in the future.
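
As a sketch of how the metadata above could be inspected before choosing an extractor, the snippet below pulls out `deviceType` and converts the `"15FPS"` string to a number. The function name `summarize_miniscope_metadata` and the returned keys are hypothetical. Passing numeric `frameRate` values through is only a defensive guess, since the thread shows the string form only.

```python
import json
import re


def summarize_miniscope_metadata(metadata_text):
    """Extract the device type and a numeric frame rate from metaData.json text.

    Hypothetical helper: it only relies on the `deviceType` and `frameRate`
    keys that appear in the example metadata.
    """
    metadata = json.loads(metadata_text)
    frame_rate = metadata.get("frameRate")
    if isinstance(frame_rate, str):
        # "15FPS" -> 15.0; a string without a leading number yields None.
        match = re.match(r"\s*(\d+(?:\.\d+)?)", frame_rate)
        frame_rate = float(match.group(1)) if match else None
    elif isinstance(frame_rate, (int, float)):
        frame_rate = float(frame_rate)
    return {"device_type": metadata.get("deviceType"), "frame_rate": frame_rate}
```

Note that, as this issue shows, `deviceType` alone may not be a reliable indicator of the folder layout, so a check like this would at most be one input to that decision.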

h-mayorquin commented 3 weeks ago

Also relevant: https://github.com/Aharoni-Lab/Miniscope-DAQ-QT-Software/issues/60

h-mayorquin commented 2 weeks ago

Here is yet another reported folder structure: https://github.com/catalystneuro/ndx-miniscope/issues/19#issuecomment-2419285413