choderalab / mtenn

Modular Training and Evaluation of Neural Networks
MIT License

Improve some packaging stuff #10

Closed hmacdope closed 1 year ago

hmacdope commented 1 year ago

Fixes #9, fixes #8

hmacdope commented 1 year ago

@JenkeScheen @kaminow this should be good for a review now.

Any packaging lookover from @mikemhenry would also be appreciated. The PyTorch stuff was a bit tricky, as they don't really encourage pyproject.toml-based builds, so it would be helpful if someone else could test it (especially on a GPU box).

After this is merged (and assuming we don't want to change the name; close #2 if not), I will make a tagged 0.1.0

hmacdope commented 1 year ago

See here for why some of the pip wheel based builds are complicated

kaminow commented 1 year ago

Testing this on my laptop, which has a GPU. If I first do mamba env create -f environment.yml and then pip install -e ., I get the following error when trying to import mtenn:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ben/CBM/Chodera_Lab/mtenn/mtenn/mtenn/__init__.py", line 5, in <module>
    from . import conversion_utils
  File "/home/ben/CBM/Chodera_Lab/mtenn/mtenn/mtenn/conversion_utils/__init__.py", line 1, in <module>
    from .schnet import SchNet
  File "/home/ben/CBM/Chodera_Lab/mtenn/mtenn/mtenn/conversion_utils/schnet.py", line 6, in <module>
    from torch_geometric.nn.models import SchNet as PygSchNet
  File "/home/ben/miniconda3/envs/mtenn-env/lib/python3.10/site-packages/torch_geometric/__init__.py", line 4, in <module>
    import torch_geometric.data
  File "/home/ben/miniconda3/envs/mtenn-env/lib/python3.10/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/ben/miniconda3/envs/mtenn-env/lib/python3.10/site-packages/torch_geometric/data/data.py", line 20, in <module>
    from torch_sparse import SparseTensor
  File "/home/ben/miniconda3/envs/mtenn-env/lib/python3.10/site-packages/torch_sparse/__init__.py", line 18, in <module>
    torch.ops.load_library(spec.origin)
  File "/home/ben/miniconda3/envs/mtenn-env/lib/python3.10/site-packages/torch/_ops.py", line 255, in load_library
    ctypes.CDLL(path)
  File "/home/ben/miniconda3/envs/mtenn-env/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/ben/miniconda3/envs/mtenn-env/lib/python3.10/site-packages/torch_sparse/_version_cuda.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKSs

If instead I create an environment with just python and pip and then try to run pip install -e ., I get this error:

Obtaining file:///home/ben/CBM/Chodera_Lab/mtenn/mtenn
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Installing backend dependencies ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting torch-sparse
  Downloading torch_sparse-0.6.16.tar.gz (208 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 208.2/208.2 kB 5.7 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-gzyrf85v/torch-sparse_98f71645662543ef8d94662b1f1f8be7/setup.py", line 8, in <module>
          import torch
      ModuleNotFoundError: No module named 'torch'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Let me know what I can do to help debug
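A common workaround (not verified against this repo) for packages like torch-sparse, whose setup.py imports torch at build time, is to preinstall torch and then disable pip's build isolation so the build can see it:

```shell
# torch-sparse's setup.py does `import torch` while generating metadata,
# so torch must already be importable by the interpreter doing the build.
pip install torch
# --no-build-isolation makes pip build in the current environment instead
# of an isolated one that lacks the preinstalled torch.
pip install --no-build-isolation torch-sparse
```

This sidesteps the ModuleNotFoundError above, though the resulting build still has to match the installed torch's CUDA/ABI, which relates to the undefined-symbol error in the first traceback.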

hmacdope commented 1 year ago

@kaminow Ok, that's not what I was hoping for, but it corresponds with what I have been seeing. We may need to switch to pure conda or pure wheel environment management. TBA

hmacdope commented 1 year ago

@kaminow would you be able to try this again? Use the

 - mamba env create -f environment.yml 
 - pip install -e .

method, which will be compulsory. I couldn't get a sensible conda-only environment to work for me, and preinstalled PyTorch is a requirement of the pip builds of torch-cluster, torch-geometric, etc.

codecov-commenter commented 1 year ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (main@67a8c5d). Click here to learn what that means. The diff coverage is n/a.

:exclamation: Current head d939bf0 differs from pull request most recent head 09358b0. Consider uploading reports for the commit 09358b0 to get more accurate results

kaminow commented 1 year ago

@hmacdope ok so now everything will install and I can import mtenn stuff, but it's only installing the CPU versions of the pytorch packages

hmacdope commented 1 year ago

> @hmacdope ok so now everything will install and I can import mtenn stuff, but it's only installing the CPU versions of the pytorch packages

Right, the issue with trying to install the GPU version is that you need to specify a CUDA version, ROCm, or whatever, e.g. conda install -c pytorch pytorch-cuda. Do we need the GPU code for inference, or mostly just for training? We could make two environment.yml files, one with pytorch-cuda instead; would that be helpful? I'm just not sure it would install on the CI runners or non-GPU platforms, etc. This is kind of the minimal setup to get it working.

kaminow commented 1 year ago

For the ligand-only graph model, I think inference on the CPU should be OK, but once we start using the 3D models I think we'll have a bad time. Having two environment files would be fine: one that's CPU-only and hooked up to the CI, and another that basically just adds the nvidia channel and the pytorch-cuda dependency (I've tested this and it works on my laptop) for training and 3D-model inference.
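A GPU variant along those lines might look like the following environment.yml sketch; the channel order, package names, and CUDA pin here are illustrative assumptions, not the actual file in this PR:

```yaml
# Hypothetical environment-gpu.yml sketch (not the file in this PR).
name: mtenn-gpu
channels:
  - conda-forge
  - pytorch
  - nvidia              # added relative to the CPU-only file
dependencies:
  - python=3.10
  - pytorch
  - pytorch-cuda=11.7   # CUDA-enabled build; drop this line for the CPU-only env
  - pyg
  - pip
```

The CPU-only file would be identical minus the nvidia channel and the pytorch-cuda line, which keeps the two files easy to diff and maintain.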

mikemhenry commented 1 year ago

Is the goal for this package to be pip installable? What if we just drop the deps out of the pyproject.toml/setup.py and tell users to install from conda-forge or (like how you have it) create a conda env then pip install .

mikemhenry commented 1 year ago

Also is there a reason we can't just install all the deps from conda-forge? It can be risky mixing channels because the ABI can be different

hmacdope commented 1 year ago

> Also is there a reason we can't just install all the deps from conda-forge? It can be risky mixing channels because the ABI can be different

I'll see if I can move all to CF.

RE pyproject.toml: I thought it was normally good to duplicate the deps there for a PEP 517 compliant install? This hasn't been working too well, though, due to the lack of compatible wheels for PyTorch Geometric etc.

If you don't think it's worth it, I'm happy to stick to conda only. Let me know.

mikemhenry commented 1 year ago

I don't think it is worth it, IMHO. If pip install . doesn't work to pull everything off of PyPI, then we don't really get much mileage out of including any deps in pyproject.toml or setup.py. My preference would be to not have a setup.py, only a pyproject.toml, then submit the package to github.com/conda-forge/staged-recipes/

hmacdope commented 1 year ago

> I don't think it is worth it, IMHO, if pip install . doesn't work to pull everything off of pypi, then we don't really get much mileage out of including any deps in pyproject or setup.py. My preference would be to not have a setup.py, only a pyproject.toml, then submit the package to github.com/conda-forge/staged-recipes/

I couldn't manage to get versioneer to work without being able to call its setup functions in setup.py, which is why I added it back in. A little bit annoying, I know, as it would be better to have only pyproject.toml, I agree. I can strip out the deps, but I would prefer to keep the automatic versioning?

mikemhenry commented 1 year ago

Ah, this repo was made before the cookiecutter got updated; use versioningit and yeet versioneer into the sun

mikemhenry commented 1 year ago

see https://github.com/mikemhenry/demo-project-cookie/blob/main/pyproject.toml#L1-L2 and https://github.com/mikemhenry/demo-project-cookie/blob/main/pyproject.toml#L69-L85 and https://github.com/mikemhenry/demo-project-cookie/blob/main/pyproject.toml#L10
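Those three fragments roughly correspond to the following pyproject.toml pieces; this is a minimal sketch based on versioningit's documented usage, not the exact contents of the linked demo project:

```toml
# Sketch of a versioningit-based pyproject.toml; values are illustrative.
[build-system]
requires = ["setuptools>=61", "versioningit"]
build-backend = "setuptools.build_meta"

[project]
name = "mtenn"
# The version is computed from git tags by versioningit, not hardcoded.
dynamic = ["version"]

# An empty table enables versioningit with its defaults; see its docs
# for customizing the tag-to-version mapping.
[tool.versioningit]
```

With this in place there is no need for a setup.py at all, which is the setup the cookiecutter now produces.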

hmacdope commented 1 year ago

Ah okay, good, because it seemed silly to not be able to support a dynamic version from pyproject.toml only. Thanks @mikemhenry

mikemhenry commented 1 year ago

Okay, I think those are the 3 bits; you should check the versioningit docs, but I think that is all you need

hmacdope commented 1 year ago

Annoyingly, pytorch_scatter and co. don't have Windows builds on conda-forge, so instead we will need to use the equivalent pytorch-scatter builds from the pyg channel.

hmacdope commented 1 year ago

Okay, I take that back: that causes ABI incompatibility with the Linux builds. I'm happy to push Windows builds down the track.

mikemhenry commented 1 year ago

Unless we have a compelling reason, we should not worry about supporting Windows

mikemhenry commented 1 year ago

If someone really needs it, we can have them use WSL, which should work for like any use case I can think of right now

hmacdope commented 1 year ago

@kaminow sorry for the spam, but are you happy with the changes? If so, I will merge and tag a release.

mikemhenry commented 1 year ago

While we are at it, I recommend switching to https://github.com/mamba-org/provision-with-micromamba since it is way faster; see an example of the switch here: https://github.com/choderalab/openmmtools/pull/660/files#diff-3ab46ee209a127470fce3c2cf106b1a1dbadbb929a4b5b13656a4bc4ce19c0b8R59-R69 (you don't need the if bit; that is needed so I can use a strict solver for the non-dev case)

mikemhenry commented 1 year ago

Feel free to punt that to a different PR :smile:

hmacdope commented 1 year ago

> Feel free to punt that to a different PR 😄

Good idea, I have raised an issue (#15), but I think I will proceed here with the CI as is.