MDAnalysis / UserGuide

User Guide for MDAnalysis
https://userguide.mdanalysis.org
22 stars 34 forks source link

Develop installation instructions are outdated #368

Closed orbeckst closed 2 months ago

orbeckst commented 3 months ago

Describe the bug Instructions for installing from source (ie the develop branch) are not up to date. https://userguide.mdanalysis.org/stable/contributing_code.html

This makes it difficult to on board developers or use the develop branch for hackathons.

orbeckst commented 3 months ago

Is the following a reasonable way to set up the develop environment?

mamba create -n mda312dev python=3.12
mamba install -n mda312dev --only-deps mdanalysis mdanalysistests
mamba activate mda312dev
echo "Do your pip install -e in the mda312dev environment"

assuming that we don't need new deps?

orbeckst commented 3 months ago

You can add recipes/suggestions to the discussion post https://github.com/MDAnalysis/mdanalysis/discussions/4636

IAlibay commented 3 months ago

1) pulling from a deployed version is a bad idea - deps change and so do pins, it might work today but it's a very flaky solution at best. 2) Generally a dev environment should be full deps not min deps. This is closer to a min dep build.

RMeli commented 3 months ago

Mmmh... Can't we just point to the pypoject.toml file? That is the ultimate source of truth. We can tell them to install all the dependencies and the project.optional-dependencies dependencies they need.

If we want to make this really, really, easy we could parse the project.toml file and build a simple mamba install.

BTW, do we really need pip install? All the dependencies appear to be available on conda-forge (but probably I missed something...)

IAlibay commented 3 months ago

@RMeli the issues with pyproject.toml for develop builds is that this builds a purely "minimal deps" build - i.e. any conda-forge only dependencies won't work (e.g. openmm or rkdit that isn't in our pyproject.toml due to pip clash issues).

Regarding a pip install - we do need a pip install of the source code itself?

RMeli commented 3 months ago

Good point about packages not in the pyproject.toml. But my point still stands: if we point the users to where the ground truth is and we tell them how to interpret it, then the documentation will automatically stay up to date.

Regarding a pip install - we do need a pip install of the source code itself?

Of course. I meant for dependencies, sorry for the confusion.

RMeli commented 3 months ago

We could even provide something like this? (Code can definitely be improved, I wrote it with my brain mostly off.)

import yaml

with open(".github/actions/setup-deps/action.yaml", "rb") as f:
    data = yaml.safe_load(f)

install_str = "mamba install "

min_deps = data["runs"]["steps"][0]["env"]["CONDA_MIN_DEPS"].replace("{", "").replace("}", "").replace("$", "").replace("inputs.", "").split()
opt_deps = data["runs"]["steps"][0]["env"]["CONDA_OPT_DEPS"].replace("{", "").replace("}", "").replace("$", "").replace("inputs.", "").split()

for key, val in data["inputs"].items():
    if key in min_deps or key in opt_deps:
        install_str += f"'{val['default']}' "

print(install_str)

The output if run from the root directory of MDAnalysis is

mamba install 'codecov' 'cython' 'fasteners' 'griddataformats' 'gsd>3.0.0' 'hypothesis' 'matplotlib-base' 'mdahole2-base' 'mda-xdrlib' 'mmtf-python' 'numpy' 'packaging' 'pathsimanalysis' 'pip' 'pytest' 'scipy' 'threadpoolctl' 'tqdm>=4.43.0' 'waterdynamics' 'biopython>=1.80' 'chemfiles-python>=0.9' 'clustalw=2.1' 'distopia>=0.2.0' 'h5py>=2.10' 'hole2' 'joblib>=0.12' 'netcdf4' 'networkx' 'openmm' 'pytng>=0.2.3' 'rdkit>=2020.03.1' 'scikit-learn' 'seaborn>=0.7.0' 'tidynamics>=1.0.0'

It will need some tweaks because not all dependencies are available on all platform (i.e. distopia). But we can flag those manually, either in this script or directly in the action.yaml file as additional attributes. I think -c bioconda also needs to be added.

RMeli commented 3 months ago

Running the above on macOS with an M2 processor I get the following errors:

The following packages are incompatible
├─ clustalw 2.1**  does not exist (perhaps a typo or a missing channel);
├─ distopia >=0.2.0  does not exist (perhaps a typo or a missing channel);
└─ hole2 does not exist (perhaps a typo or a missing channel).

These are expected since these packages are not available for this platform. So we could just add on the docs to remove the packages listed under "The following packages are incompatible" and re-run the command.

IAlibay commented 3 months ago

I'm sorry but this thread doesn't really make a lot of sense to me right now - we seem to be looking to create fancy solutions to a problem I'm not convinced actually exists. Could someone restate why our current status quo is no good?

orbeckst commented 3 months ago

My read of the discussion here and https://github.com/MDAnalysis/mdanalysis/discussions/4636 is that we should start with a simple develop_environment.yaml file in the source tree (possibly removing/commenting out hole2, distopia, and clustalw (do we really still use clustalw???) — we can add a note for these). This file has to be maintained manually for right now but I'd expect that this doesn't happen too often.

We could then think about a check in CI that uses code similar to https://github.com/MDAnalysis/UserGuide/issues/368#issuecomment-2234251162 for extracting what we use in CI to what is listed in develop_environment.yaml and if that check fails, we know that we should be changing the develop_environment.yaml file.

orbeckst commented 2 months ago

We want to have updated installation instructions in time for the UGM so that we can have people work with develop during the hackathon.

cc @yuxuanzhuang as the hackathon organizer

orbeckst commented 2 months ago

I raised https://github.com/MDAnalysis/mdanalysis/issues/4645

Btw, what is the env file maintainer/conda/environment.yml good for? Can we just use that one?

tylerjereddy commented 2 months ago

fyi, we recently had someone add a script to propagate deps from pyproject.toml to requirements files upstream: https://github.com/scipy/scipy/blob/main/tools/generate_requirements.py

I guess Irfan's point about wanting the "full" deps and conda vs. PyPI land would still stand, though we had to add custom handling for problematic deps as well for different reasons.

yuxuanzhuang commented 2 months ago

For the sake of time, I’m going to create a PR to update the current installation instructions to ensure all dependencies are up-to-date. There are a few other issues that need to be addressed anyway, such as requiring Python >=3.10.

Having a environment.yml for developer installation would definitely be very helpful in the future.

orbeckst commented 2 months ago

@yuxuanzhuang , I like your approach as this means we'll at least make some progress. You can work off https://github.com/MDAnalysis/UserGuide/pull/370 if you like.

IAlibay commented 2 months ago

I did the above in #371

orbeckst commented 2 months ago

Closed with PR #371

orbeckst commented 2 months ago

Oops, I hadn't realized that PR #371 was a PR for PR #370, which will fix all of this here.