AllenInstitute / AllenSDK

code for reading and processing Allen Institute for Brain Science data
https://allensdk.readthedocs.io/en/latest/

Research the DEWARPING_QUEUE in order to upgrade to Python 3 #1130

Closed wbwakeman closed 4 years ago

wbwakeman commented 4 years ago

A critical component of the Scientifica ophys processing pipeline is the step that dewarps the acquired images. More background is available at http://confluence.corp.alleninstitute.org/display/IT/Brain+Observatory+Ophys+Dewarp+module. This module needs to be updated to run on Python 3.

We are currently running it using this Python 2.7 environment: /shared/bioapps/infoapps/lims2_modules/lib/python/run_python.sh

We would like to run it as a module in the same environment as other AllenSDK modules: /allen/aibs/technology/conda/production/allensdk_py36/

An example of the command that is run is in /allen/programs/braintv/production/visualbehavior/prod3/specimen_920769585/ophys_session_971851886/ophys_experiment_972201499/DEWARPING_QUEUE_972201501.pbs

http://stash.corp.alleninstitute.org/projects/INF/repos/lims2_modules/browse/CAM/sine_dewarp/sine_dewarp.py?at=release

To begin, we need to better understand the state of this code and map out a plan for upgrading it to Python 3 and containerizing it.

Time-boxed effort limited to 1 - 2 days.

Acceptance criteria:

Matyasz commented 4 years ago

This code looks fairly simple to update to Python 3. Luckily, the only thing it uses from the old version of AllenSDK is json_utilities, which hasn't changed, so that dependency will not be a problem.

As far as the code directly in sine_dewarp.py goes, nothing jumps out at me as more than a handful of simple syntax changes (print and raise statements that need parentheses, etc.). Of course, I would still recommend having some example input/output pairs that can be rerun after making the update, to make sure there aren't any hidden changes deeper within one of your dependencies (namely numpy and pandas).
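To make the kind of syntax change concrete, here is a minimal sketch. The `check_width` helper and its messages are hypothetical and are not taken from sine_dewarp.py; they only illustrate the print and raise forms that Python 3 requires.

```python
# Python 2 forms, shown as comments because they are syntax errors in Python 3:
#   print "width ok:", width
#   raise ValueError, "width must be positive"


def check_width(width):
    """Hypothetical validation helper, used only to show the new syntax."""
    if width <= 0:
        # Python 3: the exception is constructed with a call.
        raise ValueError("width must be positive, got {}".format(width))
    # Python 3: print is a function, so its arguments go in parentheses.
    print("width ok:", width)
    return width
```

Both forms in the function body are also valid in Python 2.7, so changes like these can be made before switching environments.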

As for a path forward, I would recommend making the few obvious syntax changes, then picking one of the example inputs mentioned above and trying to run the module on it. It will likely break, but that will show exactly which APIs have had breaking changes deeper down. I investigated a few (like multiprocessing, os, functools) and haven't found any changes that (at the surface level) will cause problems.
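A rerun check against a saved input/output pair could look something like the sketch below. It assumes the dewarped output can be viewed as 2D frames of numbers; `frames_match`, the tolerance values, and the sample frames are all made up for illustration and do not come from the dewarp module.

```python
import math


def frames_match(reference, candidate, rel_tol=1e-9, abs_tol=0.0):
    """Return True if two 2D frames (lists of rows) agree within tolerance."""
    if len(reference) != len(candidate):
        return False
    for ref_row, cand_row in zip(reference, candidate):
        if len(ref_row) != len(cand_row):
            return False
        for ref_val, cand_val in zip(ref_row, cand_row):
            if not math.isclose(ref_val, cand_val,
                                rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True


# Stand-ins for a frame saved from the Python 2.7 run and frames produced
# by the Python 3 run (hypothetical values).
reference_frame = [[0.0, 1.5, 3.0], [4.5, 6.0, 7.5]]
python3_frame = [[0.0, 1.5, 3.0], [4.5, 6.0, 7.5]]
changed_frame = [[0.0, 1.5, 3.0], [4.5, 6.0, 7.6]]

print(frames_match(reference_frame, python3_frame))
print(frames_match(reference_frame, changed_frame))
```

If the outputs are loaded as NumPy arrays, `numpy.testing.assert_allclose` would do the same job in one call. A tolerance-based comparison is used here rather than a byte-for-byte one because small floating-point differences between library versions may be acceptable.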

Also, I noticed that an outdated version of the AllenSDK json_utilities was copied into this directory and (at some point) used. Using that copy will no longer be possible, since it is written entirely in Python 2-specific syntax. It could be updated as well, but honestly I recommend just deleting it, since you have the AllenSDK import now.