juliencarponcy / trialexp

MIT License

xarray workflow #13

Closed teristam closed 1 year ago

teristam commented 1 year ago

DRAFT only, may not work

This is a draft PR to track the progress of the xarray refactor. More details will be added soon.

This PR refactors parts of the snakemake pipeline so that it works better when multiple recording sessions are analyzed together. Specifically, it contains the following major changes:

  1. Instead of using a long relative time coordinate for pyphotometry data, the data is now extracted using trial and event time coordinates. This reduces the issues created by downsampling the relative time (depending on the averaging window, the downsampled coordinates may not be the same for different trials). Raw data can now be downsampled easily using coarsen, without the need to realign the relative time coordinates.

Example dataset format:

image
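
As a rough illustration of the trial/event-time layout and of downsampling with coarsen, here is a minimal sketch. The variable name `dff`, the coordinate name `event_time`, the array sizes, and the 1 kHz sampling assumption are all hypothetical and may not match what the pipeline actually produces.

```python
import numpy as np
import xarray as xr

# Hypothetical sizes and names; the real pipeline's variables may differ.
n_trials, n_samples = 4, 1000
event_time = np.arange(n_samples) - 500.0  # ms relative to the event (1 kHz assumed)

rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"dff": (("trial_nb", "event_time"), rng.normal(size=(n_trials, n_samples)))},
    coords={
        "trial_nb": np.arange(1, n_trials + 1),
        "event_time": event_time,
    },
)

# Average every 10 consecutive samples along event_time.
# All trials share the same event_time axis, so no realignment is needed.
ds_down = ds.coarsen(event_time=10, boundary="trim").mean()
```

Because every trial is indexed by the same `event_time` coordinate, the coarsened coordinate is identical across trials by construction.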
  2. Trial-specific information in df_condition is stored in its own variables with the trial_nb coordinate. This avoids duplicating that information for each time point. Because trial_nb is now one of the coordinates of the photometry data, converting the xarray dataset to a dataframe automatically expands the variables from df_condition to each time point, which makes plotting very easy.

    image
  3. trialexp/process/pycontrol/event_filters.py contains the function to extract the timestamps of different events for aligning the photometry data.

  4. A new notebook, notebooks/workflow/multi_session_analysis.ipynb, shows how to perform multi-session analysis on the pipeline output.

  5. Some imports have also been moved around, so it is better to test this on the existing pipeline first.
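
The expansion of trial-level variables when converting to a dataframe, described above, can be sketched as follows. This is only an illustration: `dff` and `success` are hypothetical variable names standing in for a photometry signal and a df_condition column.

```python
import numpy as np
import xarray as xr

# Hypothetical dataset: one signal per (trial, time point),
# and one trial-level flag stored once per trial.
ds_trials = xr.Dataset(
    {
        "dff": (("trial_nb", "event_time"), np.zeros((3, 5))),
        "success": ("trial_nb", np.array([True, False, True])),
    },
    coords={
        "trial_nb": [1, 2, 3],
        "event_time": np.linspace(-200, 200, 5),
    },
)

# to_dataframe() indexes rows by the product of all dimensions,
# so the per-trial variable is repeated for every time point.
df = ds_trials.to_dataframe().reset_index()
```

The resulting long-format dataframe has one row per trial and time point, which is the shape that plotting libraries such as seaborn typically expect.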

TODO

Directly averaging df/f from multiple animals together probably won't work because they have different photometry signal baselines. We probably need to use the z-score instead.
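
One possible way to do this is sketched below, purely as an assumption about how the z-scoring might look. The `animal_id` dimension, the synthetic data, and the choice to normalise over all trials and time points of each animal (rather than over a baseline window) are illustrative and are not part of this PR.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Two hypothetical animals with different baselines and signal scales.
dff_all = xr.DataArray(
    np.stack(
        [
            rng.normal(1.0, 0.5, size=(20, 50)),
            rng.normal(5.0, 2.0, size=(20, 50)),
        ]
    ),
    dims=("animal_id", "trial_nb", "event_time"),
    coords={"animal_id": ["A", "B"]},
)

# Z-score each animal separately so that baselines become comparable.
per_animal_mean = dff_all.mean(dim=("trial_nb", "event_time"))
per_animal_std = dff_all.std(dim=("trial_nb", "event_time"))
zscored = (dff_all - per_animal_mean) / per_animal_std

# Only after normalisation, average across animals and trials.
grand_avg = zscored.mean(dim=("animal_id", "trial_nb"))
```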