JuliaPy / PyCall.jl

Package to call Python functions from the Julia language
MIT License
1.45k stars 186 forks source link

Urgent: add option to disable numpy install / prevent corruption of conda-forge environments #1040

Closed MilesCranmer closed 1 year ago

MilesCranmer commented 1 year ago

Right now PyCall.jl automatically installs numpy into a conda environment. This install cannot be disabled. If this occurs from within an existing conda environment, it can overwrite the numpy installed by a conda-forge build. This violates the assumptions of conda-forge, and can silently corrupt any conda env that depends on numpy.

This issue was noticed in https://github.com/conda-forge/pysr-feedstock/pull/85, manifesting in this issue: https://github.com/conda-forge/scipy-feedstock/issues/238. We believe this is caused by the line:

https://github.com/JuliaPy/PyCall.jl/blob/bcaba00d1e2c412b2f61d33343ef5a9ab1b9258a/deps/build.jl#L79

This PR enables the PYCALL_INSTALL_NUMPY environment flag which can be set to false to disable the Conda.add("numpy") line and prevent the corruption of a conda env. The default behavior is unchanged if this flag is not set, so this is backwards compatible.

cc @h-vetinari @mkitti @ngam

codecov-commenter commented 1 year ago

Codecov Report

Patch coverage has no change and project coverage change: +0.19 :tada:

Comparison is base (bcaba00) 67.03% compared to head (e6f3744) 67.22%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #1040 +/- ## ========================================== + Coverage 67.03% 67.22% +0.19% ========================================== Files 20 20 Lines 2017 2017 ========================================== + Hits 1352 1356 +4 + Misses 665 661 -4 ``` | Flag | Coverage Δ | | |---|---|---| | unittests | `67.22% <ø> (+0.19%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#carryforward-flags-in-the-pull-request-comment) to find out more. [see 2 files with indirect coverage changes](https://app.codecov.io/gh/JuliaPy/PyCall.jl/pull/1040/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

mkitti commented 1 year ago

I'm not sure I'm following. All PyCall.jl wants to do is ensure that the conda environment it is operating in has numpy installed. Normally, this is a conda environment that is created specifically for a Julia install, but from a conda installed Julia this is an existing conda environment.

mkitti commented 1 year ago

Could we consider adding --freeze-installed instead?

MilesCranmer commented 1 year ago

Wait, I thought this line was the cause of the downstream issues? Doesn’t Conda.add() update the numpy version and reinstall?

MilesCranmer commented 1 year ago

https://docs.conda.io/projects/conda/en/latest/commands/install.html

Conda attempts to install the newest versions of the requested packages. To accomplish this, it may update some packages that are already installed, or install additional packages. To prevent existing packages from updating, use the --freeze-installed option.

(maybe that’s what you meant by --freeze-installed?)

MilesCranmer commented 1 year ago

I think it’s safer to entirely turn off calls to conda install when inside a conda-forge build script though. Just feels a bit dangerous. What if even with freeze-installed, it installs some other packages which are not present in the env that conda-forge is building?

mkitti commented 1 year ago

Actually, I think we want --satisfied-skip-solve

mkitti commented 1 year ago

What if even with freeze-installed, it installs some other packages which are not present in the env that conda-forge is building?

It should throw errors since you did not previously add all the needed dependencies.

MilesCranmer commented 1 year ago

Okay I added the required functionality in https://github.com/JuliaPy/Conda.jl/pull/240

mkitti commented 1 year ago

See https://github.com/JuliaPy/Conda.jl/pull/241

ngam commented 1 year ago

I think it’s safer to entirely turn off calls to conda install when inside a conda-forge build script though.

If this is possible, it is desirable. However, it likely should be done as a patch in conda-forge, not here. I will leave that up to you though. Why would PyCall need numpy to begin with? That’s kinda odd and arbitrary (isn’t it supposed to be a generic python caller?)

stevengj commented 1 year ago

Why would PyCall need numpy to begin with? That’s kinda odd and arbitrary (isn’t it supposed to be a generic python caller?)

In order to wrap Julia arrays with numpy arrays when calling Python.

stevengj commented 1 year ago

Why would PyCall need numpy to begin with? That’s kinda odd and arbitrary (isn’t it supposed to be a generic python caller?)

In order to wrap Julia arrays with numpy arrays when calling Python.

stevengj commented 1 year ago

Why would PyCall need numpy to begin with? That’s kinda odd and arbitrary (isn’t it supposed to be a generic python caller?)

In order to wrap Julia arrays with numpy arrays when calling Python.

MilesCranmer commented 1 year ago

@mkitti Updated with satisfied_skip_solve=true

mkitti commented 1 year ago

@stevengj could we merge and release this change?

MilesCranmer commented 1 year ago

Just pinging this - could this be merged @stevengj?

mkitti commented 1 year ago

I only have an auto-merge button here.

image

An chance we could get the other CI bits to pass? Python 2.7 should probably just be retired...

MilesCranmer commented 1 year ago

It says:

Error: Version 2.7 with arch x64 not found

so we can't even test it. Also, pyjulia already dropped support.

MilesCranmer commented 1 year ago

@stevengj sorry for the spam but we really need this fixed to solve some downstream failures. Any chance you could merge+release?

ngam commented 1 year ago

@MilesCranmer, is pycall available in conda-forge somehow? What in conda-forge uses it? If that's the case, we could patch it and move the ball forward instead of waiting.

ngam commented 1 year ago

Or is it the thing that Julia calls and we bundle in your pysr setup?

ngam commented 1 year ago

If so, we could point your pysr package on Conda-forge to use this https://github.com/MilesCranmer/PyCall.jl/tree/patch-numpy-install (I don't know how that's done in Julia, but I assume it is easily doable, right?)

MilesCranmer commented 1 year ago

Right, we bundle it with the PySR install as a special julia depot.