Closed ngam closed 2 years ago
On licensing, note that all the packages have a license file with them.
Regarding the binary artifacts, we need to create an Overrides.toml to tell Julia where to find the replacement binaries: https://pkgdocs.julialang.org/v1/artifacts/#Overriding-artifact-locations
Thanks, I remember we talked about Overrides.toml a long time ago.
On licensing, I never understand these things, so we will deal with it towards the end. We likely want some aggregate license near the root of the artifacts
I've seen enough, I think the issue with conflicting julia packages is resolved. Updated todo:
How? Do you want my input or not? If you do, don't prematurely say things like that.
> Do you want my input or not?

I do!

> If you do, don't prematurely say things like that.
I said that because I raised the issue myself earlier and I now believe it is resolved in my todo list, not necessarily speaking on behalf of others. Just tracking the issues I personally want to better understand before moving this forward.
I am happy to retract that and not tick the items on this todo list until we collectively reach agreement :)
Now, on the "How?" part.
I do not see how the packages will conflict. That's my primary concern. Do you still see a possibility of them conflicting or clobbering each other? I was initially worried about having duplicates or different versions coexisting, but apparently that's completely fine in Julia, so it should pass. The other concern that came up was clobbering due to two packages installing the same things, but 1) that cannot happen if we name them correctly and 2) they're hidden from each other and never replace each other anyway.
This leaves a milder issue still: Let's say package 'a' needs MyJuliaSolver.jl at version 3.1.1 and package 'b' needs MyJuliaSolver.jl at version 5.2.1. Now these two happen to be in the same conda environment, and one of them gets to be atop the other. Does that result in failure or will package 'a' find the version it wants from the paths it's offered even though they include the wrong version at a higher priority? I think the answer was yes, but I am not sure.
I don't understand how `JULIA_PROJECT` is supposed to work. What happens when `pysr` and `xbitinfo` are both installed? Both of them would try to set `JULIA_PROJECT` to their own name, right?
Hmmmph. I thought they stacked, but now looking at it, we only stack the depots!
Since we have a similar JULIA_PROJECT declaration in the julia-feedstock, this leads me to think that only the last one gets activated... In the case of pysr alone, the activated project is `(@pysr-0.10.1) pkg>`, not the regular `(@envxyz) pkg>`.
Projects do stack. `JULIA_PROJECT` manipulates the current active environment. To manipulate the environment stack we need to modify `JULIA_LOAD_PATH`.
I think we should not change the current active environment here. Rather, PySR should set the environment variable `JULIA_PROJECT` before loading Julia.
Currently, PySR requires that PyCall be installed in the current active environment, but that can easily not be the case. Thus the solution is for PySR to install PyCall into its own Julia environment and then activate that.
Alternatively, we could modify `JULIA_LOAD_PATH` and add `@pysr-0.10.1` to the stack:

export JULIA_LOAD_PATH="@:@pysr-0.10.1:@${CONDA_PREFIX##*/}:@stdlib"
The load order would then be:

1. `$JULIA_PROJECT`
2. `@pysr-0.10.1`
3. `@$CONDA_PREFIX`
4. the default one of the julia-feedstock

If we take this approach, then we should have a more modular way of building the load path, or I need to get better at string munging with bash.
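For that "more modular way of building the load path", here is a sketch in Python of the same composition (the env name `envxyz` and the default prefix are assumptions, mirroring the bash `${CONDA_PREFIX##*/}` expansion):

```python
def build_load_path(packages, conda_prefix="/opt/conda/envs/envxyz"):
    """Compose JULIA_LOAD_PATH entries in priority order: the current
    active project, one named environment per package, the shared
    conda-env project, then the stdlib environment."""
    env_name = conda_prefix.rstrip("/").rsplit("/", 1)[-1]  # ${CONDA_PREFIX##*/}
    entries = ["@"]                                # current active project
    entries += [f"@{pkg}" for pkg in packages]     # e.g. @pysr-0.10.1
    entries += [f"@{env_name}", "@stdlib"]
    return ":".join(entries)

print(build_load_path(["pysr-0.10.1"]))  # @:@pysr-0.10.1:@envxyz:@stdlib
```

The same builder extends naturally to several packages at once, which is the part that is awkward to do with bash string munging.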
OK, I don't think this would lead to conflicts per se. My opinion would be to stack the projects like the depot paths. And I agree we can make it a condition for packages like pysr to help us by setting better settings. That's the whole idea of formalizing this.
What's the benefit of specifying the version, btw? Remember, in conda you cannot have two versions of the same package in the same env. So how about we simplify our approach with:
export JULIA_LOAD_PATH="@:@pysr:$JULIA_LOAD_PATH"
What does the first `@:` do?
> I don't understand how `JULIA_PROJECT` is supposed to work. What happens when `pysr` and `xbitinfo` are both installed? Both of them would try to set `JULIA_PROJECT` to their own name, right?
So in this case, if we do
export JULIA_LOAD_PATH="@:@package:$JULIA_LOAD_PATH"
export JULIA_DEPOT_PATH="${CONDA_PREFIX}/share/package:${JULIA_DEPOT_PATH}" # switched around, dropped `:`, will fix later
# then do we still need to set JULIA_PROJECT?? I would rather drop it...
export JULIA_PROJECT=@package
where package = ["pysr", "xbitinfo"].
Then, everything should work correctly, right? Or do we still need to special-case PyCall? That was my tripping point all along in earlier attempts.
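To make the concern concrete, here is a small simulation (a sketch only; the initial load path and the activation logic are assumptions, not the actual activate.sh) of both packages' activation scripts running in sequence. The load path stacks, but `JULIA_PROJECT` is clobbered by whichever runs last:

```python
env = {"JULIA_LOAD_PATH": "@:@envxyz:@stdlib"}

def activate(env, package):
    """Mimic a per-package activate.sh: prepend to the load-path stack,
    but overwrite JULIA_PROJECT (so the last activation wins)."""
    env["JULIA_LOAD_PATH"] = f"@:@{package}:" + env["JULIA_LOAD_PATH"]
    env["JULIA_PROJECT"] = f"@{package}"
    return env

for pkg in ["pysr", "xbitinfo"]:
    activate(env, pkg)

print(env["JULIA_PROJECT"])    # only the last package's project survives
print(env["JULIA_LOAD_PATH"])  # but both packages remain on the stack
```

This is why stacking via `JULIA_LOAD_PATH` behaves differently from setting `JULIA_PROJECT`: the former accumulates, the latter is a single slot.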
> What does the first `@:` do?
The documentation for the `JULIA_LOAD_PATH` environment variable and the `LOAD_PATH` variable within Julia is located here: https://docs.julialang.org/en/v1/base/constants/#Base.LOAD_PATH

- `@` refers to the "current active environment", the initial value of which is determined by the `JULIA_PROJECT` environment variable or the `--project` command-line option.
- `@stdlib` expands to the absolute path of the current Julia installation's standard library directory.
- `@name` refers to a named environment. Named environments are stored in depots (see `JULIA_DEPOT_PATH`) under the `environments` subdirectory. The user's named environments are stored in `~/.julia/environments`, so `@name` would refer to the environment in `~/.julia/environments/name` if it exists and contains a `Project.toml` file. If `name` contains `#` characters, they are replaced with the major, minor and patch components of the Julia version number. For example, if you are running Julia 1.2 then `@v#.#` expands to `@v1.2` and will look for an environment by that name, typically at `~/.julia/environments/v1.2`.

The fully expanded value of `LOAD_PATH` that is searched for projects and packages can be seen by calling the `Base.load_path()` function.
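As a rough illustration of those rules (a simplified sketch, not Julia's actual resolution code; the active-project path is hypothetical), here is how `@` and `@name` entries resolve against a depot stack:

```python
import os

def expand_entry(entry, depot_path, active_project="/home/user/myproject"):
    """Simplified sketch of LOAD_PATH entry expansion:
    '@'     -> the current active project
    '@name' -> environments/name/Project.toml in the first depot that has it
    anything else passes through as a literal path."""
    if entry == "@":
        return active_project
    if entry.startswith("@"):
        name = entry[1:]
        for depot in depot_path:
            project = os.path.join(depot, "environments", name, "Project.toml")
            if os.path.exists(project):  # first depot on the stack wins
                return os.path.dirname(project)
        return None  # named environment not found in any depot
    return entry
```

Note the depot priority here mirrors `JULIA_DEPOT_PATH`: an `@pysr-0.10.1` entry resolves to whichever depot earliest in the stack contains `environments/pysr-0.10.1/Project.toml`.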
I could figure as much, but I was wondering why we are prepending it this way. Do we expect something to be active beforehand? I don't think so. Or does it become redundant because we are activating the same project twice?
The user can change the active project to whatever they want by either manipulating `JULIA_PROJECT` or using `julia --project`. However, they might still want to be able to load PyCall.jl and SymbolicRegression.jl regardless of what the active project is.
> Or does it become redundant because we are activating the same project twice?
By default, there is usually a redundancy, since `@` is usually the same as `@v#.#`.
> PyCall.jl
Just so I fully understand: our previous approach (up to commit da50259) failed because of PyCall, right? PyCall is expected to live in a specific location.
Why don't we package PyCall and CondaPkg together with our julia-feedstock btw? Or even better, we can make a meta julia-python-package that basically sets these links carefully; it would pull in julia-feedstock and package PyCall and other relevant packages so that they're readily available.
That way, all we need to do is simply house the julia artifacts from pysr (xbitinfo, etc.) into $PREFIX/share/pysr (or equivalent) and append the depot paths. That was my initial thinking and I was very confused why it wasn't working.
There are two ways to call Python from Julia:

1. PyCall.jl
2. PythonCall.jl

There are now also two ways to call Julia from Python, corresponding with the above, respectively:

1. The `julia` package in PyPI (pyjulia). This uses PyCall.jl behind the scenes. It is mainly authored by Takafumi Arakaki and is under community maintenance.
2. The `juliacall` package, which corresponds to PythonCall.jl.

You can kind of mix the two, but they otherwise form two independent methods of interop.
There is a pyjulia package in conda-forge. That should depend on PyCall.jl.
There probably should be a py-juliacall package in conda-forge as well that would in turn depend on PythonCall.jl.
I really do not think installing PyCall.jl by default in the julia-feedstock is a good idea. This would create a dependency on Python. If you want to depend on PyCall.jl, you should depend on PyCall.jl. Thus there should be a PyCall.jl package in conda-forge. Maybe it'll be called `jl-pycall` or `julia-pycall`.
OK. I will get more engaged with this again over the weekend. We should try to package these other packages. Let's give the community (i.e. isuruf) some time to cogitate, and we can push forward with getting more of the pieces together for a more thorough demo.
I made some improvements to the activation and deactivation scripts. I also moved the depot to `$PREFIX/share/pysr/depot`.
macos tests pass on cc4a619. I will try to simplify.
We should probably merge https://github.com/conda-forge/julia-feedstock/pull/221 until we can figure out what has gone wrong with libgit2 on macos.
> macos tests pass on cc4a619. I will try to simplify.
> We should probably merge conda-forge/julia-feedstock#221 until we can figure out what has gone wrong with libgit2 on macos.
merged 🍏
Is this ready for review? Let me know and I can try it out.
Cheers, Miles
I need to clean up some settings that we pushed into the julia-feedstock.
One thing I think we should change in the Python code is to activate the pysr environment sooner by setting the `JULIA_PROJECT` environment variable. This way the PyCall dependency is entirely contained within the environment.
@MilesCranmer, this is about where I want it without making modifications elsewhere.
The main thing is that we're putting PyCall into the pysr-0.10.1 environment and loading that environment directly rather than depending on whatever the current default environment is. I think this will be more robust than trying to install PySR in whatever the current environment is.
In the future, the plan for Julia packages in conda-forge would be:
When this is done, we should just be able to depend on these packages via conda-forge rather than having to package them explicitly here.
@ngam I think this is ready for @MilesCranmer to review.
This is not going to work unless you package the julia packages as individual conda packages. Say `pysr` depended on `foo.jl>=v1` and packaged `foo.jl=v1`, and another conda package `pybar` depended on `baz=v2`, which depended on `foo.jl>=v2`. If you install both of them and `JULIA_LOAD_PATH` is set to `pysr:pybar`, then `pysr` will work correctly, but `pybar` will not, because `JULIA_LOAD_PATH` has `pysr` in the front and `foo.jl=v1` will have priority.
> This is not going to work unless you package the julia packages as individual conda packages. Say `pysr` depended on `foo.jl>=v1` and packaged `foo.jl=v1`, and another conda package `pybar` depended on `baz=v2`, which depended on `foo.jl>=v2`. If you install both of them and `JULIA_LOAD_PATH` is set to `pysr:pybar`, then `pysr` will work correctly, but `pybar` will not, because `JULIA_LOAD_PATH` has `pysr` in the front and `foo.jl=v1` will have priority.
This is why I have suggested to @MilesCranmer that we specifically activate the `pysr-0.10.1` project in the Python code earlier. This will make the `pysr-0.10.1` project the highest priority. pysr actually currently does this, but it depends on PyCall.jl being in the active project before trying to activate the `pysr-0.10.1` project.
The only dependency that is currently sensitive to the scenario you outlined is PyCall.jl. Having PySR set `JULIA_PROJECT` will resolve that issue.
Here are some specific line references.
This is where PySR activates its Julia project: https://github.com/MilesCranmer/PySR/blob/d09ade8628f541eb94028009bacdb9a55cb22ef5/pysr/julia_helpers.py#L30
This is where the julia project is defined: https://github.com/MilesCranmer/PySR/blob/d09ade8628f541eb94028009bacdb9a55cb22ef5/pysr/julia_helpers.py#L63
The proposed change is that we insert this
import os
os.environ["JULIA_PROJECT"] = "pysr-0.10.1"
before PySR does
import julia
julia.install(quiet=quiet)
@MilesCranmer I think you can streamline the install by supplying both `PackageSpec`s via a Vector at the same time:
def _add_sr_to_julia_project(Main, io_arg):
Main.sr_spec = Main.PackageSpec(
name="SymbolicRegression",
url="https://github.com/MilesCranmer/SymbolicRegression.jl",
rev="v" + __symbolic_regression_jl_version__,
)
Main.clustermanagers_spec = Main.PackageSpec(
name="ClusterManagers",
url="https://github.com/JuliaParallel/ClusterManagers.jl",
rev="14e7302f068794099344d5d93f71979aaf4fbeb3",
)
Main.eval(f"Pkg.add([sr_spec, clustermanagers_spec], {io_arg})")
Let's review the status quo ante:
mamba create -n pysr_test -c conda-forge pysr
$ python
Python 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:36:39) [GCC 10.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np, pysr
>>> X = 2 * np.random.randn(100, 5)
>>> y = 2.5382 * np.cos(X[:, 3]) + X[:, 0] ** 2 - 0.5
>>> from pysr import PySRRegressor
>>>
>>> model = PySRRegressor(
... model_selection="best", # Result is mix of simplicity+accuracy
... niterations=40,
... binary_operators=["+", "*"],
... unary_operators=[
... "cos",
... "exp",
... "sin",
... "inv(x) = 1/x",
... # ^ Custom operator (julia syntax)
... ],
... extra_sympy_mappings={"inv": lambda x: 1 / x},
... # ^ Define operator for SymPy as well
... loss="loss(x, y) = (x - y)^2",
... # ^ Custom loss function (julia syntax)
... )
>>> model.fit(X, y)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/mkitti/mambaforge/envs/pysr_test/lib/python3.10/site-packages/pysr/sr.py", line 1771, in fit
self._run(X, y, mutated_params, weights=weights, seed=seed)
File "/home/mkitti/mambaforge/envs/pysr_test/lib/python3.10/site-packages/pysr/sr.py", line 1464, in _run
Main = init_julia()
File "/home/mkitti/mambaforge/envs/pysr_test/lib/python3.10/site-packages/pysr/julia_helpers.py", line 88, in init_julia
raise ImportError(import_error_string())
ImportError:
Required dependencies are not installed or built. Run the following code in the Python REPL:
>>> import pysr
>>> pysr.install()
After running `pysr.install()` we have three packages installed across two environments. PyCall is installed in the default environment because that was what was active before doing `pysr.install()`.
$ julia
(@pysr_test) pkg> st
Status `~/mambaforge/envs/pysr_test/share/julia/environments/pysr_test/Project.toml`
[438e738f] PyCall v1.94.1
(pysr-0.10.1) pkg> activate @pysr-0.10.1
Activating project at `~/mambaforge/envs/pysr_test/share/julia/environments/pysr-0.10.1`
(@pysr-0.10.1) pkg> st
Status `~/mambaforge/envs/pysr_test/share/julia/environments/pysr-0.10.1/Project.toml`
[34f1f09b] ClusterManagers v0.4.2 `https://github.com/JuliaParallel/ClusterManagers.jl#14e7302`
[8254be44] SymbolicRegression v0.10.1 `https://github.com/MilesCranmer/SymbolicRegression.jl#v0.10.1`
The status quo is that `pysr.install()` will install PyCall into the current environment regardless of whether it's the default or not.
With this pull request, and more so if Miles implements my recommendation and sets `JULIA_PROJECT` before `import julia`, we should have this:
(@pysr_test) pkg> st
Status `~/mambaforge/envs/pysr_test/share/julia/environments/pysr/Project.toml` (empty project)
(@pysr-0.10.1) pkg> st
Status `~/mambaforge/envs/pysr_test/share/pysr/depot/environments/pysr-0.10.1/Project.toml`
[34f1f09b] ClusterManagers v0.4.2 `https://github.com/JuliaParallel/ClusterManagers.jl#14e7302`
[438e738f] PyCall v1.94.1
[8254be44] SymbolicRegression v0.10.1 `https://github.com/MilesCranmer/SymbolicRegression.jl#v0.10.1`
That is, PySR will no longer depend on having `PyCall` in the default Julia project environment.
To summarize @isuruf:

1. `pysr.install()` will then create its own versioned environment (e.g. `@pysr-0.10.1`) and install ClusterManagers.jl and SymbolicRegression.jl into that environment.
2. After the changes in this pull request, `@pysr-0.10.1` will more likely be the active Julia environment and will have PyCall.jl installed.
3. If `@pysr-0.10.1` is not the current active environment, then pysr will prompt the user to do `pysr.install()`. This will install PyCall.jl into the current environment and then switch to the existing `@pysr-0.10.1`. It should then see that the other packages are installed.

My recommendation to Miles is that we ensure that `@pysr-0.10.1` is the current active environment before even trying to load `pyjulia` / PyCall.jl. In that case, we should not need to set the `JULIA_PROJECT` in the activate.sh here.
In the worst-case scenario, where `JULIA_PROJECT` and `JULIA_LOAD_PATH` are clobbered by some other package, we're back at the status quo ante: PySR will install PyCall.jl into the current environment, create a new environment for itself, and then activate it.
> This is why I have suggested to @MilesCranmer that we specifically activate the `pysr-0.10.1` project in the Python code earlier.
I created a pull request to help address Isuru's points: https://github.com/MilesCranmer/PySR/pull/186
The pull request will set `JULIA_PROJECT` to the pysr environment before trying to load pyjulia and PyCall.jl.
Okay, but it is not ideal. I misunderstood earlier, I thought you were saying that one could have foo.jl==v1 and foo.jl==v2 in the same stack and two different packages can still find their correct dep somehow. That's why I said "I've seen enough" prematurely.
I think, for now, it is better to let the package take care of installing its required artifacts after the fact. So this is not ready for prime time. My personal interest in this is to come up with a productive policy/strategy, and not a one-off solution. I am not even a proper user of Julia...
@isuruf I have a question for you: Is there any point in actually trying to package these Julia packages here, or is it a moot exercise? I am sure we can get some of them up and running pretty quickly, but there is no point if we cannot scale it like we do with Python/R packages. In other words, if that's the only way this will work out, then should we spend more time trying to come up with the Julia equivalent of `{{ PYTHON }} -m pip install . -vvv`? The `{{ JULIA }}` installer magic call...
> Okay, but it is not ideal. I misunderstood earlier, I thought you were saying that one could have foo.jl==v1 and foo.jl==v2 in the same stack and two different packages can still find their correct dep somehow. That's why I said "I've seen enough" prematurely.
We can have multiple versions in the stack. What gets loaded will depend on the environment stack order, though. Setting `JULIA_PROJECT` will move that environment to the top of the stack.
PySR already explicitly activates an environment. The main reason for activating it earlier is to decrease its dependence on PyCall.jl being in the current environment. If we merge https://github.com/MilesCranmer/PySR/pull/186, then we can remove the `JULIA_PROJECT` and `JULIA_LOAD_PATH` modifications here, and this will still work.
The added `JULIA_DEPOT_PATH` depot will just act as a local package cache.
> I think, for now, it is better to let the package take care of installing its required artifacts after the fact. So this is not ready for prime time. My personal interest in this is to come up with a productive policy/strategy, and not a one-off solution. I am not even a proper user of Julia...
What we need is an evolvable strategy that allows us to start somewhere and grow. This is an evolvable strategy. It allows us to start somewhere practical. As we add individual packages to conda-forge and add those as dependencies to this project, the size of this depot will shrink.
> What we need is an evolvable strategy that allows us to start somewhere and grow. This is an evolvable strategy. It allows us to start somewhere practical. As we add individual packages to conda-forge and add those as dependencies to this project, the size of this depot will shrink.
I agree with you. I am interested in the "add individual packages to conda-forge" part. Do we currently have any julia-package in conda-forge? I don't think so. What would be a good candidate to submit to staged-recipes as an example to start thinking about this together?
(Also, sorry I said I would think about this more over the weekend, but things are suddenly super complicated on my end... I should still be able to help, but not as creatively because my mind is occupied with a confluence of some urgent and involved research projects in "real" life)
Let's give isuruf a bit of time to respond to packaging julia-packages as feedstocks. If there are blockers a priori, then pRIP. This would be my next step in this exciting story. If we can establish at least that a process is possible to achieve your vision, then we can ask for this to be reviewed and merged.
TLDR, how about these in order for now?
- Can we get a process rolling to packaging julia-stuff?
- Submit a CFEP and ask for a vote/feedback from more members of the community.
- Ask maintainer here to review and merge.

Or do you think this is unnecessarily elaborate? Instead, should we just experiment here for now?
I have PyCall.jl and Conda.jl ready to go. Someone just needs to tell me what to call the Julia packages. Even then, the overall strategy is exactly the same: you still will need to ship individual depots for both of them for the moment. To support PySR, though, we need to get up to 100 packages. I'm definitely not doing that by myself. Also, to be clear, doing this does not solve the problem that Isuru laid out.
Let's be practical and discuss the two large packages we actually have before us. To resolve the problem for SymbolicRegression.jl (used by pysr) and BitInformation.jl (used by xbitinfo) we need to add them to the same Julia project environment. I'll first point out that this is not practical for either pysr or xbitinfo, because both Python packages activate individual Julia environments when you run them. Let's say we do it anyways.
For either package, having both depots present does not hinder anything. For this discussion I will create two depots, `pysr_depot` and `xbitinfo_depot`. These each respectively contain an environment, `pysr_env` and `xbitinfo_env`. To each environment, I added a single package: SymbolicRegression.jl and BitInformation.jl, respectively.
Now we have two individual depots. Here's how many packages are installed.
$ ls pysr_depot/packages | wc -l
93
$ ls xbitinfo_depot/packages | wc -l
32
Now I'll construct a common depot and put at the top of the depot stack while placing the two other depots underneath it.
julia> empty!(DEPOT_PATH)
String[]
julia> mkpath("common_depot/environments/common_env")
"common_depot/environments/common_env"
julia> push!(DEPOT_PATH, "common_depot")
1-element Vector{String}:
"common_depot"
julia> push!(DEPOT_PATH, "pysr_depot")
2-element Vector{String}:
"common_depot"
"pysr_depot"
julia> push!(DEPOT_PATH, "xbitinfo_depot")
3-element Vector{String}:
"common_depot"
"pysr_depot"
"xbitinfo_depot"
julia> Pkg.activate("common_depot/environments/common_env")
Activating new project at `~/src/depottest/common_depot/environments/common_env`
julia> pkg"add SymbolicRegression BitInformation"
Resolving package versions...
...
julia> pkg"status"
Status `~/src/depottest/common_depot/environments/common_env/Project.toml`
[de688a37] BitInformation v0.6.0
[8254be44] SymbolicRegression v0.10.2
When I added both packages to the same environment, no downloads had to occur. In fact, if we look for `common_depot/packages`, we see that the directory does not even exist. We didn't need to "install" any additional packages!
julia> readdir(DEPOT_PATH[1])
4-element Vector{String}:
"compiled"
"environments"
"logs"
"scratchspaces"
What is in `common_depot/environments/common_env`? Two files:
$ ls common_depot/environments/common_env/
Manifest.toml Project.toml
$ cat common_depot/environments/common_env/Project.toml
[deps]
BitInformation = "de688a37-743e-4ac2-a6f0-bd62414d1aa7"
SymbolicRegression = "8254be44-1295-4e6a-a16d-46603ac705cb"
$ cat common_depot/environments/common_env/Manifest.toml
# This file is machine-generated - editing it directly is not advised
julia_version = "1.8.0"
manifest_format = "2.0"
project_hash = "c901445fd3719cd02549f8ebb4f6634de8639c29"
[[deps.AbstractFFTs]]
deps = ["ChainRulesCore", "LinearAlgebra"]
git-tree-sha1 = "69f7020bd72f069c219b5e8c236c1fa90d2cb409"
uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c"
version = "1.2.1"
...
Practically, it may be possible to build the Project.toml file. We can then generate the Manifest.toml by invoking `Pkg.resolve()`.
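A sketch of that idea (the helper function is hypothetical; the UUIDs are the ones shown in the common_env Project.toml above): generate the `[deps]` section programmatically, then let `Pkg.resolve()` produce the matching Manifest.toml.

```python
def project_toml(deps):
    """Render a minimal Project.toml [deps] section from a
    name -> UUID mapping."""
    lines = ["[deps]"]
    for name in sorted(deps):
        lines.append(f'{name} = "{deps[name]}"')
    return "\n".join(lines) + "\n"

# UUIDs copied from the common_env Project.toml shown earlier
deps = {
    "BitInformation": "de688a37-743e-4ac2-a6f0-bd62414d1aa7",
    "SymbolicRegression": "8254be44-1295-4e6a-a16d-46603ac705cb",
}
print(project_toml(deps))
```

A post-link script could write this file into the common environment and then call `julia -e 'using Pkg; Pkg.resolve()'` against it.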
The summary of this experiment is as follows. There were no modifications to `JULIA_LOAD_PATH`; the only modification to code loading I did was creating the depot stack and activating different projects.

You guys are getting distracted by the `JULIA_PROJECT` and `JULIA_LOAD_PATH` manipulation. This is only temporarily filling in for something the Python packages themselves can do. The main point is that we can package depots with preconfigured environments that do not need further installation.
Solving the conflicting version dependency issue is one step further. It either needs the user to create the common environment or we need to do something in a post-link script. Nonetheless, packaging the depots help rather than hinder the creation of the common environment. They also allow the distinct environments to be completely installed.
Were there any version clashes? Yes.
pysr_env/Manifest.toml
[[deps.DocStringExtensions]]
deps = ["LibGit2"]
git-tree-sha1 = "b19534d1895d702889b219c382a6e18010797f0b"
uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
version = "0.8.6"
xbitinfo_env/Manifest.toml
[[deps.DocStringExtensions]]
deps = ["LibGit2"]
git-tree-sha1 = "5158c2b41018c5f7eb1470d558127ac274eca0c9"
uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
version = "0.9.1"
common_env/Manifest.toml
[[deps.DocStringExtensions]]
deps = ["LibGit2"]
git-tree-sha1 = "b19534d1895d702889b219c382a6e18010797f0b"
uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
version = "0.8.6"
Here's the individual output of `pkg"status -m"` for each of the Manifest.toml files.
> Let's give isuruf a bit of time to respond to packaging julia-packages as feedstocks. If there are blockers a priori, then pRIP. This would be my next step in this exciting story. If we can establish at least that a process is possible to achieve your vision, then we can ask for this to be reviewed and merged.
>
> TLDR, how about these in order for now?
>
> - Can we get a process rolling to packaging julia-stuff?
> - Submit a CFEP and ask for a vote/feedback from more members of the community.
> - Ask maintainer here to review and merge.
>
> Or do you think this is unnecessarily elaborate? Instead, should we just experiment here for now?
That should happen, but the path for this pull request is much simpler.

1. Once we merge a version of that pull request, we can see how much activation script support we need here. I think we can pull out the `JULIA_PROJECT` and `JULIA_LOAD_PATH` manipulation entirely, which should squash Isuru's concerns.
2. We can keep the `JULIA_DEPOT_PATH` manipulation and the additional depot. All that does is locally cache Julia packages, making it easier to use the Python packages.

The goal here is to make `pysr.install()` no longer necessary. All that needs to happen is that we move the activation of the `pysr` Julia environment before we try to load `pyjulia`. Once that happens, the conda environment does not need to manipulate the Julia environment stack directly. This means there is no longer potential interference between conda packages. At that point, I think the final decision is almost entirely within Miles' domain.
In the meantime, could you update the package you have on your channel, @ngam, before I start breaking things again?
> Once we merge a version of that pull request, we can see how much activation script support we need here. I think we can pull out the JULIA_PROJECT and JULIA_LOAD_PATH manipulation entirely which should squash Isuru's concerns.
No, it doesn't. It just moves the problem to `pysr` and whatever the other python package is. `import julia` will only use the project env variable the first time it's used, and for the second project it's a no-op. This is also a bad fix, because depending on

import pysr
import xbitinfo

vs

import xbitinfo
import pysr

the location will change.
In my opinion, the current solution (before this PR) avoids these issues by having one common environment and is superior to the solution proposed in this PR.
> In my opinion, the current solution (before this PR) avoids these issues by having one common environment and is superior to the solution proposed in this PR.
The current solution (before this PR) does not exclusively use one common environment. It starts pyjulia and calls `julia.install()`, which installs PyCall.jl into whatever environment is currently active. Then it activates `@pysr-0.10.1` via `Pkg.activate` to install SymbolicRegression.jl and ClusterManagers.jl.
While there is a good argument for a conda package to install PyCall.jl into the common environment, that package should probably be either `pyjulia` itself or perhaps `julia-pycall`.
> the location will change.
No, because neither package does anything upon import. Currently `pyjulia`, `pysr`, and soon `xbitinfo` are configured to install packages via `[pkg].install()`. The environment they activate is also configurable.
For cross reference here is the xbitinfo pull request which copies the current pysr approach (before this pull request): https://github.com/observingClouds/xbitinfo/pull/132
https://github.com/observingClouds/xbitinfo/blob/5b59b00fb191a12ea8ab5b7804ce4ded87aca593/xbitinfo/julia_helpers.py#L14-L47
The `install` function in julia_helpers.py will do the following:

1. Run `julia.install()`, which will install PyCall.jl into whatever the current Julia environment is.
2. Activate the `@xbitinfo-{version}` environment.
3. Install BitInformation.jl into the `@xbitinfo-{version}` environment.
4. Precompile the `@xbitinfo-{version}` environment.
To be clear, I'm not the one introducing new environments per Python package with this pull request. That existed in `pysr` before and is now being duplicated into `xbitinfo`.
What I'm trying to do is move the `install()` step into the build phase so it can be managed by `conda`. The order of the `install()` invocations is what will result in variability, and I'm trying to eliminate that variability. If `pysr.install()` is invoked first and then `xbitinfo.install()` is invoked second, then `xbitinfo.install()` will end up installing PyCall.jl into the pysr environment.
What we want at the end of the day is for `conda` or `mamba` to actually do the installation so that `conda`/`mamba` can eventually do dependency management.
If two python packages (pysr, xbitinfo) try to create two different environments and activate both julia environments, those two python packages are fundamentally not compatible with each other and we should disallow installing both python packages to the same conda environment.
fixes #38
Checklist
- Reset the build number to `0` (if the version changed)
- Re-rendered with the latest `conda-smithy` (Use the phrase `@conda-forge-admin, please rerender` in a comment in this PR for automated rerendering)