deisseroth-lab / two-photon

Common scripts, libraries, and utilities for 2p experiments

Ripper doesn't wait for voltage files / BOT files #9

Open tbenst opened 3 years ago

tbenst commented 3 years ago

@drinnenb or @chrisroat have you seen this error before?

python ~/code/two-photon/two-photon/process.py --input_dir /scratch/b115/ --output_dir /scratch/b115/process-output/ --recording 2021-03-16_h2b6s/fish1:TSeries_64cell_8concurrent_2power_8rep-207 --preprocess
2021-03-18 10:30:45.064 metadata:22 INFO Extracting metadata from xml files:
/scratch/b115/2021-03-16_h2b6s/fish1/TSeries_64cell_8concurrent_2power_8rep-207/TSeries_64cell_8concurrent_2power_8rep-207.xml
/scratch/b115/2021-03-16_h2b6s/fish1/TSeries_64cell_8concurrent_2power_8rep-207/TSeries_64cell_8concurrent_2power_8rep-207_Cycle00001_VoltageRecording_001.xml
2021-03-18 10:30:47.161 metadata:102 INFO The following metadata is written to: /scratch/b115/process-output/2021-03-16_h2b6s/fish1/TSeries_64cell_8concurrent_2power_8rep-207/output/metadata.json
{'channels': {0: {'enabled': True, 'name': 'frame starts'}, 1: {'enabled': True, 'name': 'secondary'}, 2: {'enabled': True, 'name': 'winfluo'}, 3: {'enabled': True, 'name': 'Blue'}, 4: {'enabled': True, 'name': 'VR timestamps'}, 5: {'enabled': True, 'name': 'green'}, 6: {'enabled': True, 'name': 'LED'}, 7: {'enabled': True, 'name': 'respir'}}, 'laser': {'power': None, 'wavelength': None}, 'layout': {'frames_per_sequence': 77378, 'sequences': 1}, 'optical_zoom': 1.0, 'period': 0.033216582, 'size': {'channels': 2, 'frames': 77378, 'x_px': 512, 'y_px': 512, 'z_planes': 1}}
2021-03-18 10:30:47.265 process:92 INFO Found stim channel "respir", enabled=True
Traceback (most recent call last):
  File "/home/tyler/code/two-photon/two-photon/process.py", line 353, in <module>
    main()
  File "/home/tyler/code/two-photon/two-photon/process.py", line 114, in main
    preprocess(basename_input, dirname_output, fname_csv, fname_uncorrected_hdf5, fname_hdf5, mdata,
  File "/home/tyler/code/two-photon/two-photon/process.py", line 160, in preprocess
    df_voltage = pd.read_csv(fname_csv, index_col='Time(ms)', skipinitialspace=True)
  File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 458, in _read
    data = parser.read(nrows)
  File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1196, in read
    ret = self._engine.read(nrows)
  File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 2231, in read
    index, names = self._make_index(data, alldata, names)
  File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1677, in _make_index
    index = self._agg_index(index)
  File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1770, in _agg_index
    arr, _ = self._infer_types(arr, col_na_values | col_na_fvalues)
  File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1871, in _infer_types
    mask = algorithms.isin(values, list(na_values))
  File "/home/tyler/opt/anaconda3/lib/python3.8/site-packages/pandas/core/algorithms.py", line 443, in isin
    if np.isnan(values).any():
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

drinnenb commented 3 years ago

No, sorry!

tbenst commented 3 years ago

The problem is that the ripping of VoltageRecording and BOT files happens after the tiff ripping, so our script is killing the ripper prematurely. A temporary fix is to dramatically increase the value set at https://github.com/deisseroth-lab/two-photon/blob/32cc9d5cf2e705949dfb82581c7f0c3ae5f9e50f/two-photon/rip.py#L19 to 36000.

chrisroat commented 3 years ago

Gotcha. When I put the script together, I don't think I knew about that aspect of the ripping.

The ripper doesn't exit cleanly, so we have to monitor its inputs/outputs to guess when it's done and then kill it. We consider it done once:

What needs to be checked to know the VoltageRecording and BOT are done?

C

tbenst commented 3 years ago

Not your fault at all! I don't think I gave you example files that had voltage recordings, which was an oversight. I'll post some info a bit later.

chrisroat commented 3 years ago

Thanks!

tbenst commented 3 years ago

@chrisroat here's a path on Oak for an experiment that should output VoltageRecording files as well as BOT files (brightness over time; a csv file for region-of-interest fluorescent traces): /oak/stanford/groups/deissero/users/tyler/share/chris/2021-03-30_wt-chrmine_6dpf_h2b6s/fish1/TSeries-28cell-1concurrent-2power-10trial-052

chrisroat commented 3 years ago

Thanks Tyler.

How does one know from looking at the file list that such a file should be generated?

I need to understand generically how this ripper is determining what files to generate. I take it that BOT here is specific to your setup, and not a Prairie View specific thing. What is the name of the csv file generated when you successfully rip this dataset? Is it related to the *VoltageOutput_001.xml file?

Thanks, Chris

tbenst commented 3 years ago

VoltageOutput is a separate functionality from VoltageRecording: the former generates TTLs, the latter digitizes and records voltage.

There are a few relevant files that tell us a voltage recording will be created, namely:

TSeries-28cell-1concurrent-2power-10trial-052_Cycle00001_VoltageRecording_001
TSeries-28cell-1concurrent-2power-10trial-052_Cycle00001_VoltageRecording_001_VRFilelist.txt
TSeries-28cell-1concurrent-2power-10trial-052_Cycle00001_VoltageRecording_001.xml

After ripping, there should be a corresponding TSeries-28cell-1concurrent-2power-10trial-052_Cycle00001_VoltageRecording_001.csv file (one per Cycle).

In another recording with 8 trials, there are 8 sets of these files: /oak/stanford/groups/deissero/users/tyler/b115/2021-03-16_rschrmine_h2b6s/fish3. When ripped, this recording also generates 8 botData files named like TSeries_cross-stim_p125_100x100-310_Cycle00001-botData.csv, with the number following Cycle incrementing.

BOT is a Prairie View functionality that we typically use for online monitoring of experiments, so it's not as critical that we get this one right.

Edit: note that the Cycle number does not always start at 1 (so the first file is not always Cycle00001), but it does always increment by 1.
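A minimal sketch of how the expected VoltageRecording CSVs could be listed from the input files described above. The function name `expected_voltage_csvs` and the glob pattern are assumptions for illustration, not code from the repo. It globs for the per-cycle XML files rather than building names from a counter, because the first cycle number is not guaranteed to be 1.

```python
from pathlib import Path


def expected_voltage_csvs(recording_dir):
    """Return the CSV paths the ripper is expected to write, one per cycle.

    Hypothetical helper: the pattern below is inferred from the file names
    quoted in this thread, not taken from the ripper's documentation.
    """
    recording_dir = Path(recording_dir)
    # Discover cycles from the XML files that are present instead of
    # assuming numbering starts at Cycle00001.
    xml_files = sorted(recording_dir.glob("*_Cycle*_VoltageRecording_*.xml"))
    # Each <name>.xml should produce a matching <name>.csv after ripping.
    return [xml.with_suffix(".csv") for xml in xml_files]
```

The `_VRFilelist.txt` file and the extensionless data file are skipped because the pattern only matches names ending in `.xml`.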

chrisroat commented 3 years ago

Yeah, we figured out the VoltageRecording. I think it will be easy to wait for its output csv to appear. Thanks for the tip on having multiple cycles.
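One way the wait could look, as a rough sketch rather than what the repo actually does. `wait_for_files`, `timeout_s`, and `poll_s` are hypothetical names and defaults chosen for illustration.

```python
import time
from pathlib import Path


def wait_for_files(paths, timeout_s=600.0, poll_s=5.0):
    """Block until every path exists; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout_s
    pending = {Path(p) for p in paths}
    while True:
        # Drop the files that have appeared since the last poll.
        pending = {p for p in pending if not p.exists()}
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A file existing does not show that the ripper has finished writing it, so this check alone is probably not enough to decide when to kill the ripper.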

I was curious if VoltageOutput means there is going to be a file to wait for?

For the botData, it's not clear how to tell a priori that it will be there and to wait for it. None of the previous datasets I looked at have it. Perhaps I can dig through the xml files and find something.

tbenst commented 3 years ago

No, VoltageOutput does not mean there will be a file to wait for AFAICT. I think all the info is already in the xml file so no ripping needed. The BOT is a weird one. I'm not sure where it's stored--perhaps it comes out of the binary blob that stores the tiffs

Your intuition is spot on, though. For BOT, in the xml files I found <PVBOTs botData="TSeries-lrhab_raphe_stim-40trial-038_Cycle00026-botData.csv">, for example inside of /data/dlab/b115/2020-10-28_elavl3-chrmine-Kv2.1_h2b6s_8dpf/fish1/TSeries-lrhab_raphe_stim-40trial-038/TSeries-lrhab_raphe_stim-40trial-038.xml
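A small sketch of reading those entries with the standard library, to learn up front which botData files to wait for. `expected_bot_csvs` is a hypothetical helper, and the only thing assumed about the file's structure is what is quoted above: a `PVBOTs` element carrying a `botData` attribute. Where that element sits in the tree is not known here, so the whole tree is searched.

```python
import xml.etree.ElementTree as ET


def expected_bot_csvs(xml_path):
    """Return the botData file names declared in a recording's XML file."""
    root = ET.parse(xml_path).getroot()
    # iter() walks the entire tree, so the nesting depth of PVBOTs
    # does not need to be known in advance.
    return [
        element.attrib["botData"]
        for element in root.iter("PVBOTs")
        if "botData" in element.attrib
    ]
```

An empty list would mean the recording declares no BOT output, so there is nothing extra to wait for.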

jmdelahanty commented 2 years ago

Hello everyone! I was curious about your use of the BOT files. Currently, our recordings don't use the BOT function. What do you use it for? Is there something that we're missing out on if we don't have that data?

tbenst commented 2 years ago

BOT = brightness over time. It's a way of drawing ROIs in the Bruker software so you can monitor an experiment.

jmdelahanty commented 2 years ago

That makes sense! So basically you select a neuron/group of neurons that you're particularly interested in as an ROI or something and go from there?

And also, since I have you here, do you do any behavior while you're recording from the mice? How does your lab record behavior data/stimulate the brain at once? My post-doc mentor wants to simultaneously record from the brain and stimulate at the same time during the behavior session all in one long recording. Do you make a new t-series for each trial or something similar? The Bruker documentation hasn't been super helpful in getting this going.

If I could meet with one of you at some point over zoom it would be immensely helpful and such a privilege to learn from you!

tbenst commented 2 years ago

> So basically you select a neuron/group of neurons that you're particularly interested in as an ROI or something and go from there?

Yes

> And also, since I have you here, do you do any behavior while you're recording from the mice? How does your lab record behavior data/stimulate the brain at once?

I'm part of Team Fish :), but yes concurrent behavior & stim is a common paradigm, usually taking advantage of the analog inputs for synchronization TTL signals, and behavior-specific software that saves its own files.

> If I could meet with one of you at some point over zoom it would be immensely helpful and such a privilege to learn from you!

Sure, I'm happy to meet and talk through / share whatever I can that's helpful. let's lock down a time over email? tbenst at stanford edu

jmdelahanty commented 2 years ago

A potential way to solve this, at least for voltage files, is to poll the file size with os.stat(path).st_size, something like this:

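import logging
import os
import time

# `path` is the voltage recording CSV that the ripper is writing.
csv_size = os.stat(path).st_size

time.sleep(10)

new_csv_size = os.stat(path).st_size

# If the size has not changed after 10 seconds, assume the conversion is done.
if csv_size == new_csv_size:
    logging.info("CSV Conversion Complete")
else:
    logging.info("CSV Still converting...")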

I'm in the process of adding this to the container now to see what it does. But this way at least you won't have to worry about keeping the ripper running for a long time. In our file naming scheme, it will always do the csv conversion first because the filelist.txt has a digit in it (the date) for the voltage recording. The imaging filelist.txt is just called Cycle whatever.
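The single comparison above could be turned into a loop along these lines. This is a sketch under assumptions, not the code that ended up in rip.py: `wait_until_size_stable` and its parameters are hypothetical, and an unchanged size is only a heuristic for "done" (a stalled write would look the same).

```python
import os
import time


def wait_until_size_stable(path, interval_s=10.0, stable_checks=2, timeout_s=3600.0):
    """Return True once the file size is unchanged for `stable_checks` polls in a row.

    Returns False if that does not happen before `timeout_s` elapses.
    """
    deadline = time.monotonic() + timeout_s
    last_size = -1
    unchanged = 0
    while time.monotonic() < deadline:
        try:
            size = os.stat(path).st_size
        except FileNotFoundError:
            size = -1  # the ripper has not created the file yet
        if size >= 0 and size == last_size:
            unchanged += 1
            if unchanged >= stable_checks:
                return True
        else:
            unchanged = 0  # still growing (or missing), so start counting again
        last_size = size
        time.sleep(interval_s)
    return False
```

Requiring more than one unchanged poll makes it less likely that a brief pause in writing is mistaken for completion, at the cost of waiting a little longer.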

Update to this:

It works properly if you include it as part of the rip.py script. I've modified what you've created for the repo for our lab's structure, but you can see how I implemented what you made here.