PyPy compatibility - Githubissues

jaimergp commented 3 years ago

We are trying to use setup-miniconda to enable PyPy CI pipelines in https://github.com/openmm/openmm/pull/3086, but we are facing some issues with the python-version option. Seems that the action adds python=x.y to the environment, and that causes some issues if python_abi=pypy.

I am going to run some tests locally, but in the meantime I am wondering if people have run into the same issue.

I will document my progress here because setting pypy is not very straightforward right now and we might need to document it.

jaimergp commented 3 years ago

Running some tests in this PR. That seems to work. Am I missing something? Does the new conda have a new bug?

goanpeca commented 3 years ago

Hi @jaimergp

. Am I missing something? Does the new conda have a new bug?

What do you mean :) ?

So the workaround is to just use python version? or could we add the *=pypi to whatever python version the user picks or we pick?

Or ?

Thoughts @bollwyvl ?

bollwyvl commented 3 years ago

What does a valid pypy spec look like? If python appears on the left-hand side, we can make it work in makeSpec.

Otherwise, we would need a special case.

jaimergp commented 3 years ago

Yes, sorry, I wasn't too clear.

This conda environment file with this CI yaml results in this long error.

jaimergp commented 3 years ago

In simple cases, python-version: "3.6.*=*pypy*" seems to work!

goanpeca commented 3 years ago

that is a weird spec, is that valid :-p ?

python_abi * *pypy*

bollwyvl commented 3 years ago

or slightly more verbose:

python-version: 3.7[build=*_pypy]

jaimergp commented 3 years ago

I thought you could replace the = symbols with spaces in a yaml file but maybe that's only on meta.yaml and not environment.yml?

bollwyvl commented 3 years ago

dunno... verified locally with mamba and conda. the asterisks and spaces complained.

name: foo

channels:
  - conda-forge

dependencies:
  - python=3.7[build=*_pypy]

might need channel-priority: strict (or better still, a .condarc)

bollwyvl commented 3 years ago

looking at my regex, it would probably be:

python-version: =3.7[build=*_pypy]
#               ^
#   wasn't accounting for some things

jaimergp commented 3 years ago

This simple CI file works correctly:

Environment patch works
PyPy is installed

goanpeca commented 3 years ago

So should we update the docs or make sure we append something to python version when using pypy?

bollwyvl commented 3 years ago

docs sounds simple...

`python-version` may be either:
- a simple version number like `3.9`, which will have a `=` inserted, yielding `python=3.9`
- a full [conda spec][], e.g. `=3.7[build=*_pypy]`, which will be used as-is, yielding `python=3.7[build=*_pypy]`

[conda spec]: https://docs.conda.io/projects/conda-build/en/latest/resources/package-spec.html#package-match-specifications

jaimergp commented 3 years ago

Might be off-topic, and already discussed, but just in case:

What I've seen is that the environment is patched by adding a new - python {{ python-version }} line, which works as long as the original line didn't specify any restraints. But if the existing python dependency entry specifies constraints, how is that handled? Should it be replaced? Error out instead of undefined behavior?

hadim commented 3 years ago

I was looking for a way to patch an env file recently (for usage inside a docker image but also inside a setup-conda GA) but couldn't find anything. I needed something that work for python but also other packages. So I ended up building the following small script:

import sys
import argparse
import yaml

def main(env_path, new_deps):

    # Process dependencies
    new_deps = [dep.split("=") for dep in new_deps]

    # Load the conda env file
    with open(env_path) as f:
        env_spec = yaml.load(f, Loader=yaml.SafeLoader)

    deps = env_spec["dependencies"]
    for name, version in new_deps:

        # Find whether the package already exists
        existing = list(
            filter(
                lambda x: True if isinstance(x, str) and x.startswith(name) else False,
                deps,
            )
        )

        # Remove the existing package(s)
        [deps.pop(deps.index(e)) for e in existing]

        # Add the new package spec if the spec is not None
        if version != "DELETE":
            if version == "":
                deps.append(name)
            else:
                deps.append(f"{name} ={version}")

    # Add the new dep list to the eocnda env spec
    env_spec["dependencies"] = deps

    # Serialize back to YAML and print on stdout
    sys.stdout.write(yaml.dump(env_spec))

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Patch a conda env file")
    parser.add_argument(
        "--env",
        metavar="CONDA_ENV_PATH",
        type=str,
        help="Path to conda env file.",
    )
    parser.add_argument(
        "-d",
        "--deps",
        nargs="+",
        help="Dependencies to patch.",
    )

    args = parser.parse_args()
    main(args.env, args.deps)

And here is the GA step I use to do the patching (after the setup-miniconda step with auto-activate-base: true):

      - name: Setup conda env
        run: |
          # Patch the conda env file to specify the Python and RDKit version
          mamba install --yes pyyaml
          python .github/patch_conda_env.py --env env.yml -d \
            python=${{ matrix.python-version }} \
            rdkit=${{ matrix.rdkit-version }} \
            > env-patched.yml
          # Create conda env
          mamba create -n my_env

bollwyvl commented 3 years ago

yeah, dunno, would be tempted to use

sed if the specs aren't fancy
conda-lock and ordered env files if they are

hadim commented 3 years ago

I actually started with sed but it was hard to make it work cross-platform and also because of often fancy chars in the version strings. We also use conda-lock to "freeze" our docker image but after the patching step.

The above script is convenient as it works for any packages and also allows you to remove packages (useful for cudatoolkit for example). But it's always "one-more-script" to maintain and deal with... Ideally, that kind of logic would go in conda/mamba.

bollwyvl commented 3 years ago

re: adding an "extra" python: i thought it would try to remove the original spec, but perhaps it does not.

re: conda-lock for patching, etc: it accepts a variable number of --file arguments: for example, this set of envs generates this set of locks, doing some permutations off task name, os, gpu/cpu, etc. I like to have them checked in for long-term analysis (and because the build takes for-frigging-ever already) and offline testing, but there's nothing that requires that.

re: removing: If anything, getting a --not-file added to conda-lock might make it easier, but frankly i haven't yet minded it being additive-only. I guess I had had the conda-lock behavior in mind when bringing this feature in, that it would be additive-only, as there is no conda spec for "but without this"... other than run_constrained, and that's only available in meta.yaml.

I think once we get to this level of complexity, however, I'd pretty much not want to re-invent the wheel in typescript: if environment-files was an input, we'd just install conda-lock in the base env (much like conda-build-version), build the lockfile, and then re-use the explicit provider. the lockfile content/path could then become an output, which could drive caching, which would be quite nice.

I don't know if either conda/mamba would want to make the first move on supporting multiple files or "but-not-this" specs... and we're still learning what it really means on conda-lock.

conda-incubator / setup-miniconda

PyPy compatibility #170