FIP timeseries is incorrectly aligned

alexpiet commented 3 months ago

Looking at the NWB files generated from the pipeline there are two issues with time alignment

The R channel timestamps start ~30s after the G channel. They should be synchronized

The G,R,Iso timestamps are not synchronized (likely a result of the first issue)

alexpiet commented 3 months ago

~~Done enough debugging to convince myself the data looks fine after NWB-Packaging-FiberPhotometry-Base-Capsule, so the issue must be in the aind-fip-dff capsule~~

I found the origin of the problem. In in the capsule NWB-Packaging-FiberPhotometry-Base-Capsule, the file code/util/aligned_fiber_to_nwb.py, the function clean_timestamps_Harp()

Here is a script that demonstrates the problem. You need to change raw_data_dir to point to a raw data asset.

from pipeline.aligned_fiber_to_nwb import * 
# I made a package out of the utils in this capsule. The fact that these functions arent in libraries makes debugging much harder

# Change this path
raw_data_dir = '/Users/alex.piet/data/raw_data/behavior_717531_2024-08-07_09-01-02'

fiber_photometry_system, filenames = detect_fiber_photometry_system(raw_data_dir)
behavior_system = detect_behavior_recording_system(raw_data_dir)
df_fip_ses_cleaned = load_Homebrew_fip_data(filenames, fibers_per_file=2)

# everything looks good here!
df_fip_ses_cleaned.sort_values(by='time_fip').head(10)

timestamps_Harp_cleaned = clean_timestamps_Harp(raw_data_dir)
df_fip_ses_aligned = alignment_fip_time_to_harp(df_fip_ses_cleaned, timestamps_Harp_cleaned)           

# problem here! 
df_fip_ses_aligned.sort_values(by='time_fip').head(10)
df_fip_ses_aligned.sort_values(by='time').head(10)

output here. Note that when sorting by time_fip, we see timestamps from two G channels, then two Iso channels, then two R channels - this is the correct interleaving we expect. However, after clean_timestamps_Harp, the time column timestamps incorrectly have the G timestamps start before. Screenshot 2024-08-12 at 12 38 13 PM

I dont know what the clean_timestamps_Harp function does, or who wrote it.

hagikent commented 3 months ago

this might be related to the handling of the "10 fist blank frames" that we used to have in the past

alexpiet commented 3 months ago

Note that debugging this code is very annoying because of this issues outlined here: https://github.com/AllenNeuralDynamics/aind-physio-arch/issues/45#issuecomment-2276796729

rachelstephlee commented 3 months ago

Todo for rachel:

Write a testing function that will eventually be integrated into the QC code.

alexpiet commented 3 months ago

@srecanatesi gives this fix, editing line 460 of "aligned_fiber_to_nwb.py" which reads:

        timestamps[channel] = df_fip_sel[df_fip_sel['channel'] == channel]['time_fip'].values

to

        timestamps_channel = df_fip_sel[df_fip_sel['channel'] == channel]['time_fip'].values
        conversion_factor = np.round(statistics.mode(np.round(np.diff(timestamps_channel),2))/0.05)
        power = 10**round(np.log10(abs(conversion_factor)))
        timestamps[channel] = timestamps_channel / power

This appears to resolve the issue. Remaining tasks:

[x] add this line to the pipeline
[ ] Add a QC check to look for this issue in the future
[ ] Test on more sessions

alexpiet commented 3 months ago

PR to update the pipeline. https://github.com/AllenNeuralDynamics/NWB-Packaging-FiberPhotometry-Base-Capsule/pull/14
I'm struggling to test it on CO, because I'm not familiar with CO tools.

alexpiet commented 3 months ago

G, Iso, R timestamps are interleaved, hooray!
But, the FIP rising and falling timestamps are oddly aligned.

hagikent commented 3 months ago

If easily debugable, I wouldn't discourage to do so. However, practically, we can completely drop the "falling" from nwb or dataframe.

Rationale: -rising is corresponding to the starting time of Green channel camera frame exposure -falling is corresponding to the ending time of Green channel camera frame exposure -Green channel one data point has only one (Bonsai-software) timestamp and thus only one HARP-aligned timestamp. -thus, the only one additional info we have by keeping falling is the exposure time (diff between rising and falling)

alexpiet commented 3 months ago

Making a list of sessions where the alignment is to ISO, and not G1

BAD: behavior_717531_2024-08-07_09-01-02_processed_2024-08-08_13-34-22 (but still has 30 second offset issue) behavior_717531_2024-08-14_09-20-16_processed_2024-08-15_08-24-47(but still has 30 second offset issue)

GOOD: behavior_711042_2024-08-23_13-27-08_processed_2024-08-24_07-39-12 behavior_717531_2024-08-16_09-11-56_processed_2024-08-17_22-12-16 behavior_717531_2024-08-23_08-58-49_processed_2024-08-24_10-37-56 behavior_732163_2024-08-23_13-36-21_processed_2024-08-24_06-57-41

I can't replicate the alignment to ISO in any sessions without the 30 second offset issue. The asset where I noticed this issue had been reprocessed off pipeline while we were fixing the 30 second offset issue. Lets close this issue, and we can reopen if we notice alignment to Iso.

AllenNeuralDynamics / aind-fip-dff

FIP timeseries is incorrectly aligned #5