juliencarponcy / trialexp


Refactoring for multi-session analysis #14

Closed teristam closed 1 year ago

teristam commented 1 year ago

This PR contains some refactors of the snakemake pipeline so that it works better when multiple sessions of recordings are analyzed together. Specifically, it contains the following major changes:

  1. Instead of using a long relative time coordinate for the pyphotometry data, the data is now extracted using trial and event time coordinates. This avoids the issues created by downsampling the relative time coordinate (depending on the averaging window, the downsampled coordinate may not be the same for different trials). Raw data can now be downsampled very easily with coarsen, without needing to realign the relative time coordinates (see the sketch after the example below).

Example dataset format:

[screenshot]
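Below is a minimal sketch of how downsampling works with this layout, assuming a dataset with `trial_nb` and `event_time` dimensions; the variable names, sampling rate and window size are illustrative, not the pipeline's actual values.

```python
# Sketch (not the actual pipeline code): downsample trial-aligned photometry
# with xarray's coarsen, assuming dims (trial_nb, event_time).
import numpy as np
import xarray as xr

n_trials, n_samples, fs = 20, 1000, 1000  # 1 s of data per trial at 1 kHz (illustrative)

dataset = xr.Dataset(
    {"photometry": (("trial_nb", "event_time"), np.random.randn(n_trials, n_samples))},
    coords={
        "trial_nb": np.arange(1, n_trials + 1),
        # time relative to the aligned event, identical for every trial
        "event_time": np.arange(n_samples) / fs - 0.5,
    },
)

# Downsample by a factor of 10 along the event-time axis; because the time
# coordinate is relative to the event and shared across trials, no
# realignment is needed afterwards.
downsampled = dataset.coarsen(event_time=10, boundary="trim").mean()
print(downsampled.photometry.shape)  # (20, 100)
```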
  2. Trial-specific information from df_conditions is stored in its own variables with the trial_nb coordinate. This avoids the need to duplicate that information for each time point. Because trial_nb is now one of the coordinates of the photometry data, the variables from df_conditions are automatically expanded to each time point when the xarray dataset is converted to a dataframe, which makes plotting very easy (see the sketch after the screenshot below).

    [screenshot]
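A small sketch of that expansion, with illustrative variable names (the real df_conditions columns will differ):

```python
# Sketch: store trial-level metadata as variables indexed only by trial_nb;
# converting to a dataframe broadcasts them to every time point automatically.
import numpy as np
import xarray as xr

n_trials, n_samples = 3, 5
ds = xr.Dataset(
    {
        "photometry": (("trial_nb", "event_time"), np.random.randn(n_trials, n_samples)),
        # trial-level variable, stored once per trial rather than per sample
        "trial_outcome": ("trial_nb", ["success", "aborted", "success"]),
    },
    coords={
        "trial_nb": np.arange(1, n_trials + 1),
        "event_time": np.linspace(-0.5, 0.5, n_samples),
    },
)

# to_dataframe() expands trial_outcome to every (trial_nb, event_time) row,
# which makes plotting with e.g. seaborn straightforward.
df = ds.to_dataframe().reset_index()
print(df.head())
```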
  3. trialexp/process/pycontrol/event_filters.py contains functions to extract the timestamps of different events for aligning the photometry data (a rough sketch of the idea follows this list).

  4. A new notebook, notebooks/workflow/multi_session_analysis.ipynb, shows how to perform multi-session analysis on the pipeline output.

  5. Some imports have been moved around, so it is better to test this on the existing pipeline first.

  6. Changes in dependencies: I had to bump the Python version to 3.9 because of a new xarray requirement. snakehelper probably also needs to be reinstalled because of an upstream update.
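As a rough idea of what an event filter does (the actual functions in event_filters.py may look quite different, and the column names trial_nb, name and time are assumptions):

```python
# Hypothetical sketch of an event filter: return, per trial, the timestamp
# used as the alignment point for photometry extraction.
import pandas as pd

def first_event_time_per_trial(df_events: pd.DataFrame, event_name: str) -> pd.Series:
    """Return the earliest timestamp of `event_name` in each trial."""
    matching = df_events[df_events["name"] == event_name]
    return matching.groupby("trial_nb")["time"].min()

# Usage sketch:
# align_times = first_event_time_per_trial(df_events, "spout_touch")
# ...extract a photometry window around each value of align_times...
```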

TODO

Directly averaging dF/F from multiple animals probably won't work because each animal has a different photometry signal baseline. We probably need to use the z-score instead (rough sketch below).
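A sketch of the proposed fix, not implemented yet; the variable name "photometry" and the per-session files are illustrative:

```python
# Z-score each session's signal against its own mean and std before
# combining, so differences in raw baseline do not dominate the group average.
import xarray as xr

def zscore_session(ds: xr.Dataset, var: str = "photometry") -> xr.Dataset:
    """Z-score one session's signal using that session's own statistics."""
    sig = ds[var]
    ds = ds.copy()
    ds[var] = (sig - sig.mean()) / sig.std()
    return ds

# sessions = [xr.open_dataset(p) for p in session_paths]   # one file per session
# zscored = [zscore_session(ds) for ds in sessions]
# combined = xr.concat(zscored, dim="session")
# group_mean = combined["photometry"].mean(dim=["session", "trial_nb"])
```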