Closed alexpiet closed 3 months ago
Done enough debugging to convince myself the data looks fine after NWB-Packaging-FiberPhotometry-Base-Capsule
, so the issue must be in the aind-fip-dff
capsule
I found the origin of the problem. In in the capsule NWB-Packaging-FiberPhotometry-Base-Capsule
, the file code/util/aligned_fiber_to_nwb.py
, the function clean_timestamps_Harp()
Here is a script that demonstrates the problem. You need to change raw_data_dir
to point to a raw data asset.
from pipeline.aligned_fiber_to_nwb import *
# I made a package out of the utils in this capsule. The fact that these functions arent in libraries makes debugging much harder
# Change this path
raw_data_dir = '/Users/alex.piet/data/raw_data/behavior_717531_2024-08-07_09-01-02'
fiber_photometry_system, filenames = detect_fiber_photometry_system(raw_data_dir)
behavior_system = detect_behavior_recording_system(raw_data_dir)
df_fip_ses_cleaned = load_Homebrew_fip_data(filenames, fibers_per_file=2)
# everything looks good here!
df_fip_ses_cleaned.sort_values(by='time_fip').head(10)
timestamps_Harp_cleaned = clean_timestamps_Harp(raw_data_dir)
df_fip_ses_aligned = alignment_fip_time_to_harp(df_fip_ses_cleaned, timestamps_Harp_cleaned)
# problem here!
df_fip_ses_aligned.sort_values(by='time_fip').head(10)
df_fip_ses_aligned.sort_values(by='time').head(10)
output here. Note that when sorting by time_fip
, we see timestamps from two G channels, then two Iso channels, then two R channels - this is the correct interleaving we expect. However, after clean_timestamps_Harp
, the time
column timestamps incorrectly have the G
timestamps start before.
I dont know what the clean_timestamps_Harp
function does, or who wrote it.
this might be related to the handling of the "10 fist blank frames" that we used to have in the past
Note that debugging this code is very annoying because of this issues outlined here: https://github.com/AllenNeuralDynamics/aind-physio-arch/issues/45#issuecomment-2276796729
Todo for rachel:
Write a testing function that will eventually be integrated into the QC code.
@srecanatesi gives this fix, editing line 460 of "aligned_fiber_to_nwb.py" which reads:
timestamps[channel] = df_fip_sel[df_fip_sel['channel'] == channel]['time_fip'].values
to
timestamps_channel = df_fip_sel[df_fip_sel['channel'] == channel]['time_fip'].values
conversion_factor = np.round(statistics.mode(np.round(np.diff(timestamps_channel),2))/0.05)
power = 10**round(np.log10(abs(conversion_factor)))
timestamps[channel] = timestamps_channel / power
This appears to resolve the issue. Remaining tasks:
If easily debugable, I wouldn't discourage to do so. However, practically, we can completely drop the "falling" from nwb or dataframe.
Rationale: -rising is corresponding to the starting time of Green channel camera frame exposure -falling is corresponding to the ending time of Green channel camera frame exposure -Green channel one data point has only one (Bonsai-software) timestamp and thus only one HARP-aligned timestamp. -thus, the only one additional info we have by keeping falling is the exposure time (diff between rising and falling)
Making a list of sessions where the alignment is to ISO, and not G1
BAD: behavior_717531_2024-08-07_09-01-02_processed_2024-08-08_13-34-22 (but still has 30 second offset issue) behavior_717531_2024-08-14_09-20-16_processed_2024-08-15_08-24-47(but still has 30 second offset issue)
GOOD: behavior_711042_2024-08-23_13-27-08_processed_2024-08-24_07-39-12 behavior_717531_2024-08-16_09-11-56_processed_2024-08-17_22-12-16 behavior_717531_2024-08-23_08-58-49_processed_2024-08-24_10-37-56 behavior_732163_2024-08-23_13-36-21_processed_2024-08-24_06-57-41
I can't replicate the alignment to ISO in any sessions without the 30 second offset issue. The asset where I noticed this issue had been reprocessed off pipeline while we were fixing the 30 second offset issue. Lets close this issue, and we can reopen if we notice alignment to Iso.
Looking at the NWB files generated from the pipeline there are two issues with time alignment