conda-forge / pysr-feedstock

A conda-smithy repository for pysr.
BSD 3-Clause "New" or "Revised" License

v0.10.4 with new build strategy #43

Closed ngam closed 2 years ago

ngam commented 2 years ago

fixes #38

Checklist

conda-forge-linter commented 2 years ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

ngam commented 2 years ago

@conda-forge-admin, please rerender.

ngam commented 2 years ago

@mkitti and @MilesCranmer this works as expected. But have a look inside the logs...


DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/MultivariatePolynomials/1bIGc/test/utils.jl.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/StructArrays/w2GaP/src/structarray.jl.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/InverseFunctions/clEOM/src/test.jl.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/Zygote/qGFGD/docs/src/internals.md.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/IRTools/017wp/src/passes/relooper.jl.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/StaticArraysCore/dagUH/Project.toml.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/Transducers/HBMTc/benchmark/bench_sum_transpose.jl.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/StaticArrays/68nRv/src/MMatrix.jl.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/Bijections/IWrOY/.travis.yml.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/PyCall/ygXW2/src/startup.jl.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/clones/418375808345381708/refs/tags/v0.3.3.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/JSON/NeJ9k/test/parser/null.jl.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/StatsAPI/y7Ydc/src/regressionmodel.jl.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/StatsBase/XgjIN/docs/src/deviation.md.  Including it as-is.
DEBUG:conda_build.noarch_python:Don't know how to handle file: share/julia/packages/Metatheory/CduBp/src/Library.jl.  Including it as-is.

also @ocefpaf

ngam commented 2 years ago

This is obviously just a draft. I will want to rework this completely so that we do it more properly. Some issues to address:

ngam commented 2 years ago

@mkitti please see how it is structured: if you download the artifacts from here https://artprodeus21.artifacts.visualstudio.com/A910fa339-c7c2-46e8-a579-7ea247548706/84710dde-1620-425b-80d0-4cf5baca359d/_apis/artifact/cGlwZWxpbmVhcnRpZmFjdDovL2NvbmRhLWZvcmdlL3Byb2plY3RJZC84NDcxMGRkZS0xNjIwLTQyNWItODBkMC00Y2Y1YmFjYTM1OWQvYnVpbGRJZC81NTk1ODkvYXJ0aWZhY3ROYW1lL2NvbmRhX2FydGlmYWN0c18yMDIyMDgyOC42LjFfbGludXhfNjRf0/content?format=zip then you will find this package under "broken"; unzip it and see how it got structured.

The test still fails with PyCall not being installed correctly...

mkitti commented 2 years ago

The depot we package should only contain two directories.

  1. packages
  2. artifacts

That's why I created a temporary directory for the depot and then only copied those two directories.

conda-forge-linter commented 2 years ago

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipe) and found some lint.

Here's what I've got...

For recipe:

conda-forge-linter commented 2 years ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

mkitti commented 2 years ago

How do you want to populate the new depot? I don't see code to do that at the moment.

ngam commented 2 years ago

How do you want to populate the new depot? I don't see code to do that at the moment.

For now, I am simply populating the main julia depot, i.e. under share/julia. Or am I misunderstanding things? It seems to be working insofar as it is packaging everything. There is a minor issue of cleaning this up eventually, but that's a different issue. I want to try to get it to work. From the standpoint of conda-forge, it doesn't matter where the depot is as long as it is under $PREFIX. The only thing I want is to have julia, etc. be aware of all these things we are packaging.

My worry is that in the packaging part, a lot of patching and editing is happening and corrupting these packaged julia entities

ngam commented 2 years ago

e.g.

INFO:conda_build.build:Packaging pysr
Packaging pysr-0.10.1-py310hff52083_0
INFO:conda_build.build:Packaging pysr-0.10.1-py310hff52083_0
compiling .pyc files...
number of files: 4069
WARNING :: get_rpaths_raw()=[] and patchelf=['$ORIGIN'] disagree for /home/conda/feedstock_root/build_artifacts/pysr_1661815576484/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh/share/julia/artifacts/f3bb2ffc2ce484352a9fabe39f3ac6516c14f259/lib/libLLVMExtra-13.so :: 
patchelf: open: Permission denied
WARNING :: get_rpaths_raw()=[] and patchelf=[''] disagree for /home/conda/feedstock_root/build_artifacts/pysr_1661815576484/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh/share/julia/artifacts/abf4b5086b4eb867021118c85b2cc11a15b764a9/lib/libopenspecfun.so.1.4 :: 
patchelf: open: Permission denied
WARNING :: Failed to get_static_lib_exports(/home/conda/feedstock_root/build_artifacts/pysr_1661815576484/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeh/lib/libbz2.a)
Unknown format
ngam commented 2 years ago

I don't see the difference between creating a new depot and keeping the default depot. Maybe I am not following your logic.

ngam commented 2 years ago

PyCall is definitely under $PREFIX, but the check doesn't seem to see it...

share/julia/compiled/v1.8/PyCall/GkzkC_AoYtv.ji (binary): Patching
mkitti commented 2 years ago

Like environments, the depots also stack. With the new depot, we will deliver a set of packages and a preconfigured environment. This will be a read-only environment behind the main one.

mkitti commented 2 years ago

We have to create a new depot otherwise Julia packages within the depot might collide with other Julia packages installed by other conda-forge packages.

ngam commented 2 years ago

So the idea is for each package to have its own depot in the hierarchy? What if someone has both pysr and xbitinfo in the same conda env? Which one do we choose to satisfy?

mkitti commented 2 years ago

So the idea is for each package to have its own depot in the hierarchy? What if someone has both pysr and xbitinfo in the same conda env? Which one do we choose to satisfy?

That's the beauty of this. Both depots are stacked below the main one. All the depots do is serve as a versioned package store. Julia can pull packages from both depots if needed. Julia will only write to the main depot named after the conda environment.

ngam commented 2 years ago

That's the beauty of this. Both depots are stacked below the main one. All the depots do is serve as a versioned package store. Julia can pull packages from both depots if needed. Julia will only write to the main depot named after the conda environment.

If we can pull this off safely, it can be a game changer! I am running this locally now, it builds quicker

mkitti commented 2 years ago

I've got commits pending after we see the results of this run.

mkitti commented 2 years ago

Do you know what this is about?

julia.core.JuliaError: Exception 'failed to clone from https://github.com/MilesCranmer/SymbolicRegression.jl, error: GitError(Code:ERROR, Class:SSL, Your Julia is built with a SSL/TLS engine that libgit2 doesn't know how to configure to use a file or directory of certificate authority roots, but your environment specifies one via the JULIA_SSL_CA_ROOTS_PATH variable. If you believe your system's root certificates are safe to use, you can export JULIA_SSL_CA_ROOTS_PATH="" in your environment to use those instead.)' occurred while calling julia code:

ngam commented 2 years ago

I've got commits pending after we see the results of this run.

This run doesn't work, same error. The rm -rf "${FAKEDEPOT}" doesn't seem to be working correctly, as it is still packaging something like share/SymbolicRegression.jl.

We will deal with the osx git thing later; it is likely to do with how we are using git on the julia-feedstock and what we rely on in conda-forge (it doesn't look like a serious problem, and I am sure we will handle it fine later).

ngam commented 2 years ago

Do you know what this is about?

julia.core.JuliaError: Exception 'failed to clone from https://github.com/MilesCranmer/SymbolicRegression.jl, error: GitError(Code:ERROR, Class:SSL, Your Julia is built with a SSL/TLS engine that libgit2 doesn't know how to configure to use a file or directory of certificate authority roots, but your environment specifies one via the JULIA_SSL_CA_ROOTS_PATH variable. If you believe your system's root certificates are safe to use, you can export JULIA_SSL_CA_ROOTS_PATH="" in your environment to use those instead.)' occurred while calling julia code:

https://github.com/conda-forge/julia-feedstock/blob/7fb588639bb7a857c91e5035423657dd5526c3e6/recipe/build.sh#L23 (not using system git but linking to system cert upon activation causes the confusion, I think)

mkitti commented 2 years ago

We may need to set the environment variable JULIA_PKG_USE_CLI_GIT=true then?

mkitti commented 2 years ago

Are the activate scripts functional before the tests?

edit: They were not being installed. They are now, I think.

ngam commented 2 years ago

OK, works locally. I will push the necessary edits.

ngam commented 2 years ago

TLDR: we have a working prototype for completely enmeshing julia and python packages

thanks mainly to @mkitti for pulling this off!

to test: mamba create -n pysr pysr -c ngam and see artifacts here https://anaconda.org/ngam/pysr/files

@ocefpaf @isuruf + @conda-forge/core

Before we go any further, we need your feedback and your opinion on this. This will obviously need agreement from core to proceed. It is a pretty unusual step...

The strategy here is thus:

  1. when building a python package (pysr) that needs julia packages (SymbolicRegression.jl), we build the python part as usual, then build the julia part and tuck it in a temporary place
  2. move just enough of the temporary artifacts under $PREFIX/share, naming them appropriately (TBD), so that they get packaged by conda-build
  3. we delete the rest of the unneeded artifacts
  4. upon installing (e.g. mamba create -n pysr pysr -c ngam) an activation script activates the julia project and gets the already-preinstalled artifacts up and running right away.

Points to address: see below

mkitti commented 2 years ago

This run doesn't work, same error. The rm -rf "${FAKEDEPOT}" doesn't seem to be working correctly, as it is still packaging something like share/SymbolicRegression.jl.

This is the intention. It should package share/SymbolicRegression.jl/depot, which contains all the dependencies for the package.

This way we can package the high-level packages first, and then eventually do the low-level ones as needed.

ngam commented 2 years ago

@mkitti that was months worth of fun! Thanks!!!!!!

Logging off to handle a crazy but super important week :/ hopefully the world doesn't end. To be continued.

mkitti commented 2 years ago

There are two binary artifacts installed.

  1. libopenspecfun, from https://github.com/JuliaBinaryWrappers/OpenSpecFun_jll.jl; conda-forge should be able to replace this with https://github.com/conda-forge/openspecfun-feedstock

  2. LLVMExtra, from https://github.com/JuliaBinaryWrappers/LLVMExtra_jll.jl (see https://github.com/maleadt/LLVM.jl and https://github.com/JuliaPackaging/Yggdrasil/blob/e99675ef8f0c852e163dbf39ecd2f4dd246d884c/L/LLVMExtra/build_tarballs.jl)

mkitti commented 2 years ago

Also a note to @MilesCranmer , this allows us to install a Manifest.toml as part of the depot.

mkitti commented 2 years ago

Here's some reading about how the depot stack works.

https://docs.julialang.org/en/v1/manual/environment-variables/#JULIA_DEPOT_PATH
https://docs.julialang.org/en/v1/base/constants/#Base.DEPOT_PATH
https://docs.julialang.org/en/v1/manual/code-loading/#code-loading

ocefpaf commented 2 years ago

We are kind of close to wrapping julia packages. Not sure how far down that road we want to go, but I'd love to see a feedstock for, say, SymbolicRegression.jl, that pysr and other packages could use.

ngam commented 2 years ago

but I'd love to see a feedstock for say, SymbolicRegression.jl, that pysr and other packages could use

Ultimately this is the goal, though to be honest, I am not sure how we want to go about it. The obvious and desirable solution is to just package everything needed from the growing Julia ecosystem here (all these .jl packages, but we have to ensure that the _jll ones are equivalent to our libraries here). However, this may be a disastrous waste of time because the design philosophy of Julia's Pkg is fundamentally different (though I am not sure how (in)flexible it is). I have yet to fully understand why they wanted to go their own way in terms of packaging and not rely on third-party, more versatile systems like conda-forge. Maybe it is not too late, and maybe we can push this forward without buy-in from upstream, piece by piece as people request packages, but that's beyond my vision at this point in time. (For example, SymbolicRegression.jl has 95 different dependencies that will need to be packaged; thus, 95 additional feedstocks?)

isuruf commented 2 years ago

This is not going to work. What happens when there's another python package that relies on the same Julia package? There'll be conflicts right?

mkitti commented 2 years ago

This is not going to work. What happens when there's another python package that relies on the same Julia package? There'll be conflicts right?

No, there will be no conflicts. The Julia depots stack. Each conda package can supply its own read-only depot containing its dependencies and a shared environment. It is not a problem that multiple depots supply the same package, since the stack has a priority order. The packages themselves are also versioned, and multiple versions of the same package are allowed to exist within the same depot.

What the user may need to do is switch Julia projects within the conda environment or manually combine them.

See https://docs.julialang.org/en/v1/base/constants/#Base.DEPOT_PATH for further reading. In particular,

The first entry is the "user depot" and should be writable by and owned by the current user. The user depot is where: registries are cloned, new package versions are installed, named environments are created and updated, package repos are cloned, newly compiled package image files are saved, log files are written, development packages are checked out by default, and global configuration data is saved. Later entries in the depot path are treated as read-only and are appropriate for registries, packages, etc. installed and managed by system administrators.

conda-forge is participating at the system administrator level by adding read-only depots that install packages at the "system" level. To bootstrap this we can have a depot per package, with potential overlap. Eventually perhaps we can have a single conda-forge depot that we manage. We'll have to get to a 1:1 correspondence between conda-forge packages and Julia packages though.

In summary, conda-forge now has a mechanism to install packages for the user by adding read-only depots to the Julia stack. We can bootstrap this process by individual packages adding their own depots to the stack. We can later unify these into a single conda-forge managed depot.

isuruf commented 2 years ago

The Julia depots stack. Each conda package can supply their own read-only depot containing their dependencies and a shared environment. It is not a problem that multiple depots supply the same package since the stack has a priority order. The packages themselves are also versioned and multiple versions of the same package are allowed to exist within the same depot.

The priority order changes depending on which conda package was activated first. Then you'll use a depot from another conda package, right?

mkitti commented 2 years ago

The priority order changes depending on which conda package was activated first. Then you'll use a depot from another conda package, right?

For packages where they overlap, yes, but they should be the same package at the same version if they have the same hash. The user's depot in $CONDA_PREFIX/share/julia has the highest priority.

Note that the depot structure is $DEPOT/packages/$PACKAGE_NAME/$HASH/. For example, I currently have $DEPOT/packages/PyCall/{BcTLp,BD546,L0flP} in my main Julia depot, representing PyCall 1.92.1, 1.92.3, and 1.93.0.

Let's keep this in perspective: we're currently talking about two Python packages that have Julia dependencies. Their direct dependencies overlap at PyCall.jl and its dependency Conda.jl. Those are the first dependencies to target.

Eventually, we would like to put each Julia package into its own conda-forge package, and ultimately installing these into a single conda-forge depot would be the goal. That depot would still probably coexist with a user depot. However, having multiple (currently 2) conda-installed depots for pysr and xbitinfo is a step in the right direction as we move towards that ideal. What we gain from this initial move is conda management and installation of Julia packages. This greatly improves upon the current user experience, which involves the user running an additional install step that uses the Julia package manager independently of conda-forge.

If we want to bring more languages into conda-forge, we need transitional strategies that allow for incremental steps.

In the particular case of SymbolicRegression.jl, there are 86 non-standard-library dependencies: https://juliahub.com/ui/Packages/SymbolicRegression/X2eIS/0.10.2?page=1

ngam commented 2 years ago

@mkitti can you tell us more about the sharing between depots? What prevents us from mixing the depots or having them automatically share? Don't these packages know where to look? Or will pysr get confused by, say, PyCall X if PyCall Y is also available?

Can we customize Pkg such that it works on "environments" as opposed to "projects"? In conda-forge, we have $PREFIX/lib, $PREFIX/bin, $PREFIX/lib/python3.X/site-packages, etc., and all packages in a given environment share and access all these packages/libraries. Is there a way we can target the equivalent of this here instead of these depots? If not, why not?

This will solve only the question of Julia packages. The other question is the shared/compiled libraries/binaries that you highlighted. We can alleviate that by simply packaging those in conda-forge on an as-needed basis.

Eventually, we would like to put each Julia package into its own conda-forge package, and ultimately installing these into a single conda-forge depot would be the goal. That depot would still probably coexist with a user depot. However, having multiple (currently 2) conda-installed depots for pysr and xbitinfo is a step in the right direction as we move towards that ideal. What we gain from this initial move is conda management and installation of Julia packages. This greatly improves upon the current user experience, which involves the user running an additional install step that uses the Julia package manager independently of conda-forge.

I am fine with the transition stuff. However, for the eventual setup, I would say we already have a structure and we should stick to it. There is no point in reinventing stuff or making exceptions for Julia. Instead, we ideally should make Julia behave like our generic and already versatile setup (we package C, C++, R, etc. and we are doing pretty well, so the problem isn't with our approach).

ngam commented 2 years ago

Also, mkitti, the community has some processes to propose things, so don't be discouraged by isuruf right away. Let's keep talking and working on this. Once we have a solid setup, we can propose it in a CFEP or something: https://github.com/conda-forge/cfep. I am happy to draft one once we get a good handle on this and reflect on it

You're really helping push the frontier here as you are knowledgeable and enthusiastic about this more than all of us :) I am in your corner all the way!!!

mkitti commented 2 years ago

@mkitti can you tell us more about the sharing between depots? What prevents us from mixing the depots or having them automatically share? Don't these packages know where to look? Or will pysr get confused by, say, PyCall X if PyCall Y is also available?

Julia Pkg.jl will go down the depot stack looking for compatible packages. If it does not find them, it will try to download the latest code into the primary depot at the top of the stack. As configured here, we include a pysr-0.10.1 project environment that also includes a Manifest.toml. That Manifest.toml specifies the use of particular versions of packages, the exact versions we put into the depot.

Now there might be sharing between depots if another depot provides the exact same package at the exact same version. If it's the exact same package, though, is there really a problem? There is a potential other issue, which is that some packages such as Conda.jl currently store some configuration state in a deps.jl relative to the code: https://github.com/JuliaPy/Conda.jl/blob/master/src/Conda.jl#L20 We should change that to use https://github.com/JuliaPackaging/Preferences.jl

Can we customize Pkg such that it works on "environments" as opposed to "projects"? In conda-forge, we have an $PREFIX/lib, $PREFIX/bin, and $PREFIX/lib/python3.X/site-packages, etc. etc. and all packages in a given environment share and access all these packages/libraries. Is there a we can target the equivalent of this here instead of these depots? If not, why not?

These depots are essentially the same as $PREFIX/lib/python3.X/, perhaps slightly larger in scope because they occasionally bring in some binaries via artifacts. I think we should eventually have a single conda-forge-managed depot that is read-only to the user from the tooling perspective. That is, Pkg.jl should not be trying to modify the conda-forge-managed repository. To get there we need to have individual conda-forge packages for individual Julia packages. The problem with having a single depot is that the user can easily modify the top one.

Note that the standard Julia configuration uses three depots:

  1. ~/.julia where ~ is the user home as appropriate on the system;
  2. an architecture-specific shared system directory, e.g. /usr/local/share/julia;
  3. an architecture-independent shared system directory, e.g. /usr/share/julia.

Currently we have moved ~/.julia to $PREFIX/share/julia. Now what I'm proposing is that we replace /usr/local/share/julia.

As far as I can tell, interference between package managers exists with other languages. If I need to use pip, I usually use it after I use conda and then I avoid manipulating the environment. Instead imagine if pip and conda had distinct places to store packages but they were still stacked so that I can safely use conda after using pip. Perhaps my understanding of this is slightly outdated.

ngam commented 2 years ago

Now there might be sharing between depots if another depot provides the exact same package at the exact same version. If it's the exact same package though, is there really a problem?

No problem, at least not in my understanding, unless this very situation trips up Pkg or the specific package somehow, which I don't see happening if it goes down in a hierarchical fashion (though I am not sure how each package will look for another, or if there are any relative links or something like that). However, this can potentially be wasteful later on (i.e. installing 20 different PyCalls when one is sufficient).

These depots are essentially the same as $PREFIX/lib/python3.X/

Not quite, because we have guardrails against vendoring duplicate binaries. For example, since I saw you made commits to HDF5 stuff recently: it is wild that h5py, netcdf4, etc. vendor different (or the same) libhdf5 libraries if you get them from PyPI, but they share the same exact shared libhdf5 if you get them from conda-forge; this results in weird problems with deadlocks and whatever. We are going to have to address this eventually, but for now we can ignore it for the sake of simplicity/illustration.

Perhaps my understanding of this is slightly outdated.

maybe... pip will install stuff in $PREFIX/lib/python3.X/ basically, but problems also show up depending on the situation.

ngam commented 2 years ago

Just to be clear: One cannot have $PREFIX/lib/python3.X/ and $PREFIX/lib/python3.Y/ if X != Y. But if I understand your example about PyCall above correctly, that's completely fine in Julia Pkg.jl?

mkitti commented 2 years ago

I was referring to package versions above, and yes you can install multiple package versions simultaneously within the same depot.

I think you are talking about Python versions. The depot is designed such that multiple Julia versions can use the same depot. That said, we are not set up here to have multiple Julia versions in the same conda environment, and our depots are entirely contained within the conda environment.

isuruf commented 2 years ago

If it does not find them, then it will try to download the latest code into the primary depot at the top of the stack.

Where's the primary depot?

Now there might be sharing between depots if another depot provides the exact same package at the exact same version. If it's the exact same package though, is there really a problem?

If two conda packages 'a' and 'b' provide the same depot SymbolicRegression.jl then there'll be clobbering and if b is uninstalled, a will stop working because SymbolicRegression.jl depot will be deleted when b is removed.

I have another question, where is the project being created? (AFAIK Depots in julia are like pkg_dirs in conda where conda packages are stored. A project in julia is like a conda environment where the conda pkgs are extracted to)

ngam commented 2 years ago

Where's the primary depot?

the one from julia-feedstock. (Right?)

If two conda packages 'a' and 'b' provide the same depot SymbolicRegression.jl then there'll be clobbering and

That's not a problem for now (in the transition), and it also generally can be remedied with creative naming. Fundamentally, this is not a blocking issue. The name "SymbolicRegression.jl" is arbitrary; we can even make it a unique hash if needed.

I have another question, where is the project being created? (AFAIK Depots in julia are like pkg_dirs in conda where conda packages are stored. A project in julia is like a conda environment where the conda pkgs are extracted to)

~We create and destroy the project, but we copy its content to the above-mentioned depot in share/julia/... so that we only package specific items (bare essentials) and not everything~ I will let mkitti answer this.

mkitti commented 2 years ago

Where's the primary depot?

The primary depot is in $PREFIX/share/julia per the julia-feedstock activate.sh

If two conda packages 'a' and 'b' provide the same depot SymbolicRegression.jl then there'll be clobbering and if b is uninstalled, a will stop working because SymbolicRegression.jl depot will be deleted when b is removed.

We're getting slightly confused here about the name of the depot, the name of the Julia package, and the name the conda-forge package.

I probably should have named the depot $PREFIX/share/pysr. I named it $PREFIX/share/SymbolicRegression.jl because the plan is to spin that out into its own conda-forge package named after SymbolicRegression.jl. The leading candidate for that name is julia-symbolicregression.

To be clear the depot $PREFIX/share/SymbolicRegression.jl/depot/packages contains the Julia source for 100 packages:

AbstractFFTs                    LLVMExtra_jll
AbstractTrees                   LogExpFunctions
Adapt                           LossFunctions
ArgCheck                        MacroTools
ArrayInterface                  Metatheory
ArrayInterfaceCore              MicroCollections
ArrayInterfaceStaticArrays      Missings
ArrayInterfaceStaticArraysCore  MultivariatePolynomials
AutoHashEquals                  MutableArithmetics
BangBang                        NaNMath
Baselet                         NLSolversBase
Bijections                      OpenSpecFun_jll
CEnum                           Optim
ChainRules                      OrderedCollections
ChainRulesCore                  Parameters
ChangesOfVariables              Parsers
ClusterManagers                 PositiveFactorizations
Combinatorics                   PreallocationTools
CommonSubexpressions            Preferences
Compat                          PyCall
CompositionsBase                RealDot
Conda                           RecipesBase
ConstructionBase                RecursiveArrayTools
DataAPI                         Reexport
DataStructures                  Referenceables
DataValueInterfaces             Requires
DefineSingletons                ReverseDiff
DiffResults                     Setfield
DiffRules                       SortingAlgorithms
DocStringExtensions             SpecialFunctions
DynamicPolynomials              SplittablesBase
ExprTools                       Static
FillArrays                      StaticArrays
FiniteDiff                      StaticArraysCore
ForwardDiff                     StatsAPI
FunctionWrappers                StatsBase
GPUArrays                       StructArrays
GPUArraysCore                   StructTypes
IfElse                          SymbolicRegression
InitialValues                   SymbolicUtils
InverseFunctions                Tables
IrrationalConstants             TableTraits
IRTools                         TermInterface
IteratorInterfaceExtensions     ThreadsX
JLLWrappers                     TimerOutputs
JSON                            Transducers
JSON3                           UnPack
LabelledArrays                  VersionParsing
LineSearches                    Zygote
LLVM                            ZygoteRules

Now let's say we have a theoretical conda package corresponding to BitInformation.jl, or maybe julia-bitinformation. The Python interface is called xbitinfo and was recently packaged in conda-forge: https://github.com/conda-forge/xbitinfo-feedstock .

We could have a BitInformation.jl depot at $PREFIX/share/BitInformation.jl/depot or maybe $PREFIX/share/julia-bitinformation/depot. The packages subfolder contains all the dependencies that BitInformation.jl needs:

BitInformation       FillArrays               Preferences
Calculus             HypergeometricFunctions  QuadGK
ChainRulesCore       InverseFunctions         Reexport
ChangesOfVariables   IrrationalConstants      Rmath
Compat               JLLWrappers              Rmath_jll
DataAPI              LogExpFunctions          SortingAlgorithms
DataStructures       Missings                 SpecialFunctions
DensityInterface     NaNMath                  StatsAPI
Distributions        OpenSpecFun_jll          StatsBase
DocStringExtensions  OrderedCollections       StatsFuns
DualNumbers          PDMats

Both depots happen to contain code for FillArrays.jl, so we end up with potentially redundant installs of FillArrays.jl.

If either the SymbolicRegression.jl or the BitInformation.jl depot is uninstalled, the remaining one will continue to work. Each depot contains all the packages that the main package of the depot needs.

Two packages should not be providing depots of the same name because the depots should be closely related to the package that provides them. In this case pysr is a proxy for SymbolicRegression.jl and xbitinfo is a proxy for BitInformation.jl.

The depots are separate exactly to avoid clobbering each other. They can function independently or as part of the same stack.
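The independent-or-stacked behavior described above can be sketched in shell. This is only an illustration of the layout discussed in this thread, not the actual activation logic; the depot paths and the `/opt/conda` fallback are assumptions:

```shell
# Sketch: stacking two self-contained depots on JULIA_DEPOT_PATH.
# Paths mirror the hypothetical layout discussed above.
PREFIX="${CONDA_PREFIX:-/opt/conda}"

# Each conda package ships its own depot with all of its Julia deps:
SR_DEPOT="$PREFIX/share/SymbolicRegression.jl/depot"
BI_DEPOT="$PREFIX/share/BitInformation.jl/depot"

# Stacking them lets Julia search both; dropping either entry
# leaves the other depot fully functional on its own.
export JULIA_DEPOT_PATH="$SR_DEPOT:$BI_DEPOT"
echo "$JULIA_DEPOT_PATH"
```

Uninstalling one package simply removes its depot directory and its entry from the stack; the other depot never depended on it.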

I have another question: where is the project being created? (AFAIK depots in Julia are like `pkgs_dirs` in conda, where conda packages are stored, while a project in Julia is like a conda environment, into which the conda packages are extracted.)

In Julia there are "shared" environments, which are stored in DEPOT_PATH[n]/environments/, and there are regular project directories wherever the user might want them. These usually contain only Project.toml and Manifest.toml.

This branch currently installs a Julia project in $PREFIX/share/SymbolicRegression.jl/depot/environments/pysr-0.10.1. Since it is within the depot, the shorthand for it within Julia is @pysr-0.10.1. For example, one may do the following:

julia> using Pkg

julia> pkg"activate @pysr-0.10.1"
  Activating project at `~/mambaforge/envs/pysr/share/SymbolicRegression.jl/depot/environments/pysr-0.10.1`

julia> pkg"status"
      Status `$PREFIX/share/SymbolicRegression.jl/depot/environments/pysr-0.10.1/Project.toml`
  [34f1f09b] ClusterManagers v0.4.2 `https://github.com/JuliaParallel/ClusterManagers.jl#14e7302`
  [438e738f] PyCall v1.94.1
  [8254be44] SymbolicRegression v0.10.1 `https://github.com/MilesCranmer/SymbolicRegression.jl#v0.10.1`

julia> Base.load_path()
3-element Vector{String}:
 "$PREFIX/share/SymbolicRegression.jl/depot/environments/pysr-0.10.1/Project.toml"
 "$PREFIX/share/julia/environments/pysr/Project.toml"
 "$PREFIX/share/julia/stdlib/v1.8"

Also note that the project folder only contains toml files:

$ ls $CONDA_PREFIX/share/SymbolicRegression.jl/depot/environments/pysr-0.10.1/
Manifest.toml  Project.toml

All packages are stored within the depot stack, not within the project environment folders.

mkitti commented 2 years ago

Currently the activate.sh script on this branch will activate the Julia environment @pysr-0.10.1. However, the Python package pysr should probably set the environment variable JULIA_PROJECT to be @pysr-0.10.1.
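A minimal sketch of what such an activation hook might set follows. This is an assumption about the eventual script, not the actual activate.sh on this branch:

```shell
# Hypothetical activation hook: select the packaged Julia environment.
# The environment name follows the pysr-<version> convention discussed above.
export JULIA_PROJECT="@pysr-0.10.1"

# Julia resolves the @ shorthand against environments/ inside each depot
# on the depot stack, so this would pick up
# $PREFIX/share/SymbolicRegression.jl/depot/environments/pysr-0.10.1.
echo "$JULIA_PROJECT"
```

Setting JULIA_PROJECT this way means PyCall and plain `julia` invocations both land in the same environment without an explicit `pkg"activate ..."`.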

ngam commented 2 years ago

I've seen enough; I think the issue with conflicting Julia packages is resolved. Updated todo:

Points to address:

ngam commented 2 years ago

I probably should have named the depot $PREFIX/share/pysr. I named it $PREFIX/share/SymbolicRegression.jl because the plan is to spin that out into its own conda-forge package named after SymbolicRegression.jl. The leading candidate for that name is julia-symbolicregression.

I don't think we are in a place to move to packaging .jl stuff yet, but we could do so soon. As it is, my vote is for $PREFIX/share/pysr, since technically these artifacts have to do with pysr specifically --- and, more importantly, that's the one name we know is unique, since conda-forge doesn't allow duplicate names.

ngam commented 2 years ago

All is easy except for ensuring no clashes between Julia and Python libs/pkgs and other $PREFIX/lib stuff.

We basically can just trace these binaries and package them, and that's it?