AllenNeuralDynamics / aind-fip-dff

This capsule takes an input raw NWB file of fiber data and creates processed DFF traces
MIT License
0 stars 0 forks source link

FIP timeseries is incorrectly aligned #5

Closed alexpiet closed 3 months ago

alexpiet commented 3 months ago

Looking at the NWB files generated from the pipeline there are two issues with time alignment

Image

Image

alexpiet commented 3 months ago

Done enough debugging to convince myself the data looks fine after NWB-Packaging-FiberPhotometry-Base-Capsule, so the issue must be in the aind-fip-dff capsule

I found the origin of the problem. In in the capsule NWB-Packaging-FiberPhotometry-Base-Capsule, the file code/util/aligned_fiber_to_nwb.py, the function clean_timestamps_Harp()

Here is a script that demonstrates the problem. You need to change raw_data_dir to point to a raw data asset.

from pipeline.aligned_fiber_to_nwb import * 
# I made a package out of the utils in this capsule. The fact that these functions arent in libraries makes debugging much harder

# Change this path
raw_data_dir = '/Users/alex.piet/data/raw_data/behavior_717531_2024-08-07_09-01-02'

fiber_photometry_system, filenames = detect_fiber_photometry_system(raw_data_dir)
behavior_system = detect_behavior_recording_system(raw_data_dir)
df_fip_ses_cleaned = load_Homebrew_fip_data(filenames, fibers_per_file=2)

# everything looks good here!
df_fip_ses_cleaned.sort_values(by='time_fip').head(10)

timestamps_Harp_cleaned = clean_timestamps_Harp(raw_data_dir)
df_fip_ses_aligned = alignment_fip_time_to_harp(df_fip_ses_cleaned, timestamps_Harp_cleaned)           

# problem here! 
df_fip_ses_aligned.sort_values(by='time_fip').head(10)
df_fip_ses_aligned.sort_values(by='time').head(10)

output here. Note that when sorting by time_fip, we see timestamps from two G channels, then two Iso channels, then two R channels - this is the correct interleaving we expect. However, after clean_timestamps_Harp, the time column timestamps incorrectly have the G timestamps start before. Screenshot 2024-08-12 at 12 38 13 PM

I dont know what the clean_timestamps_Harp function does, or who wrote it.

hagikent commented 3 months ago

this might be related to the handling of the "10 fist blank frames" that we used to have in the past

alexpiet commented 3 months ago

Note that debugging this code is very annoying because of this issues outlined here: https://github.com/AllenNeuralDynamics/aind-physio-arch/issues/45#issuecomment-2276796729

rachelstephlee commented 3 months ago

Todo for rachel:

Write a testing function that will eventually be integrated into the QC code.

alexpiet commented 3 months ago

@srecanatesi gives this fix, editing line 460 of "aligned_fiber_to_nwb.py" which reads:

        timestamps[channel] = df_fip_sel[df_fip_sel['channel'] == channel]['time_fip'].values

to

        timestamps_channel = df_fip_sel[df_fip_sel['channel'] == channel]['time_fip'].values
        conversion_factor = np.round(statistics.mode(np.round(np.diff(timestamps_channel),2))/0.05)
        power = 10**round(np.log10(abs(conversion_factor)))
        timestamps[channel] = timestamps_channel / power

This appears to resolve the issue. Remaining tasks:

alexpiet commented 3 months ago
alexpiet commented 3 months ago

Image

hagikent commented 3 months ago

If easily debugable, I wouldn't discourage to do so. However, practically, we can completely drop the "falling" from nwb or dataframe.

Rationale: -rising is corresponding to the starting time of Green channel camera frame exposure -falling is corresponding to the ending time of Green channel camera frame exposure -Green channel one data point has only one (Bonsai-software) timestamp and thus only one HARP-aligned timestamp. -thus, the only one additional info we have by keeping falling is the exposure time (diff between rising and falling)

alexpiet commented 3 months ago

Making a list of sessions where the alignment is to ISO, and not G1

BAD: behavior_717531_2024-08-07_09-01-02_processed_2024-08-08_13-34-22 (but still has 30 second offset issue) behavior_717531_2024-08-14_09-20-16_processed_2024-08-15_08-24-47(but still has 30 second offset issue)

GOOD: behavior_711042_2024-08-23_13-27-08_processed_2024-08-24_07-39-12 behavior_717531_2024-08-16_09-11-56_processed_2024-08-17_22-12-16 behavior_717531_2024-08-23_08-58-49_processed_2024-08-24_10-37-56 behavior_732163_2024-08-23_13-36-21_processed_2024-08-24_06-57-41

I can't replicate the alignment to ISO in any sessions without the 30 second offset issue. The asset where I noticed this issue had been reprocessed off pipeline while we were fixing the 30 second offset issue. Lets close this issue, and we can reopen if we notice alignment to Iso.