AllenInstitute / ophys_etl_pipelines

Pipelines and modules for processing optical physiology data
Other
9 stars 5 forks source link

Pin pytorch requirement #620

Closed morriscb closed 10 months ago

morriscb commented 10 months ago

Pin torch to 1.13.1.

The version of suite2p we are using grabs torch version >=1.7.1. There is a more recent version that seems to be causing the issue. Subsequent suite2p releases (latest is 0.14 ours is 0.10.2) pin the version of torch to a similar value. Attempted to update suite2p to 0.14 and 0.11 but found failures in testing one of which is non trivial. Suggest deploying this for now and creating a ticket to bump the suite2p version to 0.14 so we can unpin in our requirements.

codecov[bot] commented 10 months ago

Codecov Report

Merging #620 (747a8de) into main (51166ba) will increase coverage by 0.14%. The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #620      +/-   ##
==========================================
+ Coverage   96.13%   96.27%   +0.14%     
==========================================
  Files         105      105              
  Lines        7944     8249     +305     
  Branches      775      775              
==========================================
+ Hits         7637     7942     +305     
  Misses        229      229              
  Partials       78       78              
Flag Coverage Δ
general_tests 96.27% <ø> (+0.14%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

see 83 files with indirect coverage changes

aamster commented 10 months ago

I dug into it also. The issue is a conflict with

deepinterpolation @ git+https://github.com/danielsf/deepinterpolation@staging/cleaner/ophys_etl

this installs tensorflow 2.5.0 which is pretty old, and has typing_extensions pinned to ~=3.7 https://github.com/tensorflow/tensorflow/blob/v2.5.0/tensorflow/tools/pip_package/setup.py#L93C5-L93C33

this conflicts with pytorch which is trying to install the latest typing_extensions. You can see that in the build output.

Only typing_extensions >= 3.10 apparently has the ParamSpec https://github.com/optuna/optuna/issues/3558

Another fix would be to instead of using Scott's fork of deepinterpolation (it's not even being used in the main branch ?) you could change it to deepinterpolation @ git+https://github.com/AllenInstitute/deepinterpolation_ai_internal which has tensorflow unpinned. More recent versions of tensorflow have a more relaxed constraint on that library: https://github.com/tensorflow/tensorflow/blob/v2.14.0/tensorflow/tools/pip_package/setup.py#L103

Sorry that was a long comment.

morriscb commented 10 months ago

Not at all. Sounds like a better solution. Just changed the deepinterpolation requirement and seeing what happens.

morriscb commented 10 months ago

Still getting the crash on 3.8. I'll keep the switch from Scott's fork and also pin torch and see if that fixes.

aamster commented 10 months ago

It looked like a different issue? You might to add https://github.com/AllenInstitute/ophys_etl_pipelines/blob/workflow_v2/src/ophys_etl/workflows/Dockerfile#L19 to the Dockerfile