Different parts of this repo need different versions of python

klaragerlei commented 3 years ago

Is your feature request related to a problem? Please describe.
The pipeline uses Python 3.6 and the shuffled analysis uses Python 3.8, so the data frame outputs of the two are not compatible: Python 3.6 cannot open the pickles that 3.8 writes. This problem can be managed by having multiple virtual environments on Eleanor.

Describe the solution you'd like
Update the pipeline to use 3.8.

Describe alternatives you've considered
Keep using the workaround. I think this will cause a lot of issues for less experienced users.

4iar commented 3 years ago

Could this be solved by specifying the max pickler-protocol in the shuffled-analysis code, so that it saves dataframes that are backwards compatible with the 3.6 pipeline?

e.g. df.to_pickle('cat.pkl', protocol=4)

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_pickle.html

Protocol version 4 was added in Python 3.4. It adds support for very large objects, pickling more kinds of objects, and some data format optimizations. It is the default protocol starting with Python 3.8. Refer to PEP 3154 for information about improvements brought by protocol 4.

Protocol version 5 was added in Python 3.8. It adds support for out-of-band data and speedup for in-band data. Refer to PEP 574 for information about improvements brought by protocol 5.

From: https://docs.python.org/3/library/pickle.html
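To make the protocol difference concrete, here is a minimal sketch using only the standard library. It relies on the fact that pickles written with protocol 2 or later begin with the PROTO opcode (byte 0x80) followed by one byte holding the protocol number. The helper name `header_protocol` and the example payload are hypothetical, not part of the repo.

```python
import pickle


def header_protocol(blob):
    """Return the protocol number from the first two bytes of a pickle, or None.

    Hypothetical helper: protocols 2 and later start with the PROTO opcode
    (0x80) followed by a single byte containing the protocol number.
    """
    if len(blob) >= 2 and blob[0] == 0x80:
        return blob[1]
    return None


# Made-up payload standing in for real analysis output.
example = {"cluster_id": 7, "firing_times": [0.1, 0.4, 0.9]}

blob_v4 = pickle.dumps(example, protocol=4)  # readable from Python 3.4 onwards
blob_v5 = pickle.dumps(example, protocol=5)  # needs Python 3.8 or newer to load

print("protocol 4 header:", header_protocol(blob_v4))
print("protocol 5 header:", header_protocol(blob_v5))
print("default protocol here:", pickle.DEFAULT_PROTOCOL)
print("highest protocol here:", pickle.HIGHEST_PROTOCOL)
```

As far as I know, pandas' to_pickle defaults to the highest protocol available rather than to pickle's own default, which would explain why dataframes saved under 3.8 end up as protocol 5 unless protocol=4 is passed explicitly.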

This would only affect newly saved dataframes, but you could write a quick Python 3.8 script to glob, load, and re-save your existing dataframes using protocol 4.
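A rough sketch of what such a script could look like, to be run in the 3.8 environment. It uses the standard pickle module, which should also handle uncompressed DataFrame pickles as long as pandas is installed in that environment. The function names, the `**/*.pkl` pattern, and the in-place overwrite are all assumptions for illustration; back up the data first.

```python
import pickle
from pathlib import Path

TARGET_PROTOCOL = 4  # readable by Python 3.4+, which includes the 3.6 pipeline


def pickle_protocol(path):
    """Return the protocol number in a pickle file's header, or None.

    Pickles written with protocol 2 or later start with the PROTO opcode
    (byte 0x80) followed by one byte holding the protocol number.
    """
    with open(path, "rb") as f:
        header = f.read(2)
    if len(header) == 2 and header[0] == 0x80:
        return header[1]
    return None


def resave_pickles(root, pattern="**/*.pkl", protocol=TARGET_PROTOCOL):
    """Re-save, in place, every pickle under root that uses a newer protocol.

    Hypothetical helper: the glob pattern and the decision to overwrite the
    original file are assumptions, not something the pipeline prescribes.
    Returns the list of paths that were rewritten.
    """
    converted = []
    for path in sorted(Path(root).glob(pattern)):
        current = pickle_protocol(path)
        if current is None or current <= protocol:
            continue  # header not recognised, or file is already old enough
        with open(path, "rb") as f:
            obj = pickle.load(f)
        with open(path, "wb") as f:
            pickle.dump(obj, f, protocol=protocol)
        converted.append(path)
    return converted
```

Calling `resave_pickles("path/to/recordings")` (a placeholder path) returns the files it rewrote, so a second run on the same folder should return an empty list. Files whose header is not recognised, for example compressed pickles, are skipped rather than touched.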

(Python 3.8 does have the walrus operator so it would be nice to upgrade someday anyway...)

:=

klaragerlei commented 3 years ago

df.to_pickle('cat.pkl', protocol=4)

I like this idea. @HDClark94 , is there any reason for using protocol 5, or would it be okay to change this?

MattNolanLab / in_vivo_ephys_openephys

Different parts of this repo need different versions of python #316