Details regarding processing multiple tiff files

PrasannaKumaran commented 1 year ago

Hey. I have a question about processing multiple TIFF files and couldn't find the answer in the documentation. I have multiple TIFF files in a folder. I am using the suite2p GUI, and I set the data path to the folder containing the .tiff files. I want to know whether suite2p stacks the .tiff files and processes them as a single file. If it processes them separately, how do I store the generated .npy files for each TIFF file separately? Thank you. I would appreciate it if someone could help me with this.

jmdelahanty commented 1 year ago

I believe that suite2p takes whatever input data you have and converts it into a binary that's read/analyzed through their algorithms. I don't believe it "stacks" so much as converts it into a new format and then processes the whole binary in chunks. It should make just one big .npy file that you can index.

PrasannaKumaran commented 1 year ago

What I did was put each of the TIFF files in its own folder, and also put them all together in one folder. The ROIs detected for each individual file made sense. When the combined folder was processed, many more ROIs were detected. That's why I was wondering whether it was stacking the TIFF files and processing them as one file. Did I do something wrong here? Also, thanks for the response!

jmdelahanty commented 1 year ago

No problem! I hope we solve it together!

Do you have multi-page tiffs or single page tiffs?

I'm not sure why it would matter since it's just reading in the tiff information into a binary. The only thing I can think of honestly is that the settings between your runs were somehow slightly different. It would be odd if reading in the data from different formats/directories would be a cause for the data to be very different and give you super different ROIs.

It would probably be helpful to see what you mean by "so many more ROIs" also.

PrasannaKumaran commented 1 year ago

Okay, let me explain the situation again. [Screenshot: movie15] corresponds to movie 1 and [screenshot: movie22] corresponds to movie 2. Each movie was stored in a separate folder, and I got these results by choosing each folder individually in the GUI.

As you can see, movie 1 has 39 cells and 19 not cells and movie 2 has 23 cells and 17 not cells.

What I did next was store both movies in the same folder and choose that folder for processing in the GUI. The result, which I'll call movie 3, is shown below. [Screenshot: combinedFolder]

This contains 29 cells and 78 not cells. Also, in the line plot below, movie 1 and movie 2 have 1200 frames each, while movie 3 has 2400 frames. My question is whether, when both movies are in the same folder, the frames are stacked (1200 + 1200 = 2400) to produce the resulting plot. I would like to know how to keep all the TIFF files in the same folder, process them separately, and obtain separate .bin/.mat files for each processed file in the GUI, if that's possible. Otherwise, can it be done only through Python code?

jmdelahanty commented 1 year ago

I think I see what you mean here now, thanks for describing it!

As far as I know, suite2p just finds any/all tif or whatever file extension you're using here and then runs over all those. I'm not sure if you can keep things like this and process them as you're describing within the GUI. You might have to have a post-processing step that moves all your tifs into one directory or just script your processing of suite2p.

Here's an example of how to do something like that from the README and also the Jupyter Notebook example.

I think you would basically want to do something like the third code block in the notebook. So the relevant parts would be something like (I haven't run this so I don't know if it'll actually work...):

import os
import glob

from suite2p import run_s2p, default_ops

# Set your options for running; otherwise I think you just give the path to an `.ops` file
# for a particular recording, or have a base set of ops that you populate the db with
# later.
ops = default_ops()  # populates ops with the default options

# Make a list of db's and loop over them
db = []
db.append({'data_path': ['C:/Users/carse/github/tiffs']})
db.append({'data_path': ['C:/Users/carse/github/tiffs2']})

# Could also do something like this instead probably so you don't have to handwrite paths...
# Either run the script from inside the folder where this data is, or give a full "basepath"
# inside the glob. Pathlib is nice too, but you'd have to make sure the path ends up as a string...
batch = glob.glob("*.tif")

# This would add arbitrary numbers of tiffs that could be run separately I guess...
# You'll also want to add a new save location I think for these so you have all the tiffs
# in one place, but each processed output should have its own directory so they're
# separate. I think suite2p names the files the same per run...
for file in batch:
    filename = os.path.split(file)[1]
    save_path = f'{filename}_output'
    # Caveat: data_path is normally a list of folders, so passing a file path here is untested
    db.append({'data_path': [str(file)], 'save_folder': save_path})

# Now the db is constructed, so you run suite2p one after another (if I'm understanding the code/example correctly)
for dbi in db:
    opsEnd = run_s2p(ops=ops, db=dbi)

Double disclaimer, not sure if this will work.
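If the TIFFs need to stay together in one folder, one possible variant is to build one db per file, pointing `data_path` at the folder and naming the single file to process. This is only a sketch under assumptions: it relies on the `tiff_list` and `save_path0` db keys behaving as I remember them from the suite2p example notebook (worth checking against your installed version), and the helper names `build_db_per_tiff` and `run_each` are made up for illustration.

```python
from pathlib import Path


def build_db_per_tiff(folder, pattern="*.tif*"):
    """Return one suite2p-style db dict per TIFF found directly in `folder`.

    Assumptions (not verified here): suite2p reads `tiff_list` as the file
    names to process inside data_path[0], and writes its results under
    `save_path0`.
    """
    folder = Path(folder)
    dbs = []
    for tiff in sorted(folder.glob(pattern)):
        if not tiff.is_file():
            continue
        dbs.append({
            "data_path": [str(folder)],    # folder that holds all the TIFFs
            "tiff_list": [tiff.name],      # process only this one file
            "save_path0": str(folder / f"{tiff.stem}_output"),  # separate output root per file
        })
    return dbs


def run_each(dbs):
    """Run suite2p once per db, one after another (requires suite2p to be installed)."""
    from suite2p import run_s2p, default_ops

    ops = default_ops()
    return [run_s2p(ops=ops, db=db) for db in dbs]


# Example usage (hypothetical path):
# dbs = build_db_per_tiff("C:/Users/carse/github/tiffs")
# results = run_each(dbs)
```

Giving every file its own `save_path0` rather than only a different `save_folder` is a cautious choice, so that nothing written by one run can be overwritten by the next.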

PrasannaKumaran commented 1 year ago

Hey, thanks a lot for the help. Appreciate it. I'll go through the codeblock you suggested and let you know if it works!

Update: It works. I added the folders to the list, and now the processed files are generated inside the corresponding folders and I am able to load them individually in the GUI. Using the 3rd cell block of the Jupyter notebook alone is sufficient, I guess (at least in my case). So yes, I don't think using the GUI would help in this case. Thanks a lot for your time and for helping me out! @jmdelahanty 👍
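For anyone landing here with many TIFFs in a single folder, the one-folder-per-movie layout described in this update could be set up with a small script. This is a rough sketch, not something from the suite2p docs: `split_into_folders` and `build_db_per_folder` are hypothetical helper names, and the script moves files on disk, so try it on a copy of the data first.

```python
import shutil
from pathlib import Path


def split_into_folders(folder, pattern="*.tif*"):
    """Move each TIFF in `folder` into its own subfolder named after the file stem.

    Returns the list of subfolders, sorted by file name.
    """
    folder = Path(folder)
    new_dirs = []
    # sorted() materialises the listing before any file is moved
    for tiff in sorted(folder.glob(pattern)):
        if not tiff.is_file():
            continue
        target = folder / tiff.stem
        target.mkdir(exist_ok=True)
        shutil.move(str(tiff), str(target / tiff.name))
        new_dirs.append(target)
    return new_dirs


def build_db_per_folder(dirs):
    """Build one db dict per folder, in the same form as the db list in the snippet above."""
    return [{"data_path": [str(d)]} for d in dirs]


# Example usage (hypothetical path; running needs suite2p installed):
# from suite2p import run_s2p, default_ops
# dirs = split_into_folders("C:/Users/carse/github/tiffs")
# for dbi in build_db_per_folder(dirs):
#     run_s2p(ops=default_ops(), db=dbi)
```

With this layout each run writes its output inside the movie's own folder, which matches the behaviour reported in the update.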

jmdelahanty commented 1 year ago

Nice! If you feel it's all worked out, it can be helpful to close the Issue so it's not on the dev team's radar.

MouseLand / suite2p

Details regarding processing multiple tiff files #986