conda-forge / pysr-feedstock

A conda-smithy repository for pysr.
BSD 3-Clause "New" or "Revised" License
0 stars 6 forks source link

let's improve this recipe (and set a standard for python packages calling julia packages) #38

Closed ngam closed 2 years ago

ngam commented 2 years ago

Solution to issue cannot be found in the documentation.

Issue

This line shouldn't end like that and it useless; the package on anaconda.org doesn't have any of these crazy julia modules being installed: https://anaconda.org/conda-forge/pysr/files

https://github.com/conda-forge/pysr-feedstock/blob/cbf0bb3c7efa0262fe442a4f8495799fc1eb471c/recipe/meta.yaml#L14

If you really want && {{ PYTHON }} -c 'import pysr; pysr.install() you should add it in an activation script.

@ocefpaf, sorry I just saw your question on gitter. I am never there, and usually not very good at instant messaging πŸ˜… it is probably easier to get my attention to something via tag in an open pull request or issue.

Also tagging @isuruf if you have time to look into this with us.

The original message on gitter:

Filipe @ocefpaf Aug 06 10:55
Folks, what are the best practices for packaging a Python module that depends on a Julia pkg? @ngam I recall you packaging some Julia/Python stuff, do you have any advice for me?

Do you have a specific recipe/package? I am happy to work on it together in staged-recipes! Just tag me or respond here :)

Installed packages

n/a

Environment info

n/a
ngam commented 2 years ago

A little more history is maybe warranted here --- a story in three acts

But reverting stuff in staged-recipes never works... so we ended up with the wrong one from the first PR.

MilesCranmer commented 2 years ago

I agree it would be better if the pysr.install() ran as part of the conda install. Do you know how to implement that?

ngam commented 2 years ago

I think the only "acceptable" solution without violating standards and rules is to do via an activation script or a post-link script. Have you ever noticed how the EULA thing gets printed if you download nvidia packages from conda-forge? That's a post-link script. For Julia itself, we use an activation script to automatically create a julia env within the conda env.

ngam commented 2 years ago

But before moving on to a solution, let's wait a bit for input from others. I can help you implement these.

One worry I have about your case is internet access. I know you run this stuff on Princeton's offline compute nodes, so triggering something that needs internet upon conda activate env may not be a good option because that may results in errors on a compute node. So we should stick with the post-link trick.

Also, copying @mkitti who may find this interesting

mkitti commented 2 years ago

I'm watching, but I am still catching up and about to leave for a trip.

MilesCranmer commented 2 years ago

The post-link trick sounds like the best bet to me. Happy to try this, let me know when I can get started.

ngam commented 2 years ago

The post-link trick sounds like the best bet to me. Happy to try this, let me know when I can get started.

Whenever you want! Please tag me if you need help!

MilesCranmer commented 2 years ago

Hm, I'm reading the docs on it now and I'm not sure post-link.sh is the right strategy:

Post-link and pre-unlink scripts should:

  • Be avoided whenever possible.
  • Not touch anything other than the files being installed.
  • Not write anything to stdout or stderr, unless an error occurs.
    • Not depend on any installed or to-be-installed conda packages.
    • Depend only on simple system tools such as rm, cp, mv, and ln.

Specifically the last two points. I'm worried if trying to go this route will introduce a variety of other operating system-specific bugs to solve in the future... What do you think?

ngam commented 2 years ago

Thanks for looking into the docs. Let's try to ping @conda-forge/core for opinion. Generally, I only look at docs on conda-forge.org, but I don't see much about this topic.

jaimergp commented 2 years ago

Yes, I am not sure about it either. One of the main questions to answer is how deterministic the script can be. Users expect conda packages to be reproducible, so if it has dynamic behaviour, it better be a controlled one. I am not aware of what pysr.install() does, but if it has to find Julia dependencies, can we make sure they are always the same versions?

ngam commented 2 years ago

In other words, given a version of pysr = A, it should pull a certain julia version B, but also these julia packages must be pinned per version A.

I think we still need to think this through more thoroughly. I am not particularly happy with out solution so far, especially that it, in fact, does not work as expected #41

MilesCranmer commented 2 years ago

One thing we could do is fix the Manifest.toml file. Otherwise the Julia dependencies will be different every single install. These are bounded by the Project.toml file but those constraints are pretty soft rather than set to specific versions.

mkitti commented 2 years ago

Manifest.toml is good, but I'm not sure if that is a good thing for us to distribute. It would be similar to distributing a conda-lock file.

Perhaps it would be useful to tighten the version bounds further? https://pkgdocs.julialang.org/v1/compatibility/

MilesCranmer commented 2 years ago

I guess even if you tighten the version bounds on all the direct dependencies, the dependencies of those dependencies could still change their version. (and this sometimes could result in distinct behavior as a function of date the package is installed...). You could add [compat] bounds on the entire dependency tree, but then it seems is the same as the Manifest.toml file, no?

ocefpaf commented 2 years ago

I agree it would be better if the pysr.install() ran as part of the conda install. Do you know how to implement that?

I'm not so sure. I like the module.install(), at least for now, b/c that makes an easy transition to folks who wants to manually install Julia vs those who are packaging in systems like conda-forge, debian, etc. However, I have zero knowledge of Julia packaging and my thoughts here should be taken with a grain of salt.

mkitti commented 2 years ago

I guess even if you tighten the version bounds on all the direct dependencies, the dependencies of those dependencies could still change their version. (and this sometimes could result in distinct behavior as a function of date the package is installed...). You could add [compat] bounds on the entire dependency tree, but then it seems is the same as the Manifest.toml file, no?

It is useful to provide a Manifest.toml for specific applications. I would even say it would be good to publish a Manifest.toml representing combinations that you have tested. Adding it to the package root of your library, however, would make it would difficult for others to use your package in combination with other packages.

My suggestion would be to maintain a Manifest.toml on special git branch or label representing the tested combination of packages.

ngam commented 2 years ago

If you really want && {{ PYTHON }} -c 'import pysr; pysr.install() you should add it in an activation script.

I agree it would be better if the pysr.install() ran as part of the conda install. Do you know how to implement that?

I'm not so sure. I like the module.install(), at least for now, b/c that makes an easy transition to folks who wants to manually install Julia vs those who are packaging in systems like conda-forge, debian, etc. However, I have zero knowledge of Julia packaging and my thoughts here should be taken with a grain of salt.

I actually personally agree with @ocefpaf. Maybe my initial comment was misunderstood as an argument for pysr.install() being part of conda installation process: The emphasis is on "If you really want..."

The question is then, what would be the best way forward?

I think continuing to package these noarch python packages that call julia is fine, but we need to figure out a mechanism or guidelines for people to instruct their conda-forge users to install the project julia packages separately.

An alternative is to wait until we figure out the alternative to pip install . -vv something along the lines of:


...

 script: {{ PYTHON }} -m pip install . -vv' 
 script: {{ JULIA }} -c 'using Pkg; Pkg.install(".")' 

...

which will likely take time to come to fruition and will need a buy-in from the community. I am not sure there is enough love for Julia around to make this happen and upstream julia are definitely not interested in this type of mixing and matching ecosystems. They were quite unfriendly when I first got involved in the julia feedstock and perhaps understandably so --- why would they waste time on these side integrations if they have their own full solutions (Pkg, "speed", etc.) in-house, bundled in?

While conda-forge is definitely open to non-python stuff, we are still primarily a python-oriented community imo and python happens to love us as much as we love it (i.e. python makes our lives relatively easier to package stuff and there is simply no easy solution with Julia, at least not yet). So I don't see a significant buy-in from the community (or the main maintainers) for something as complex and fragile as this.

Sorry, I opened this with optimism about a neat solution, but now I am not so sure...

ocefpaf commented 2 years ago
 script: {{ PYTHON }} -m pip install . -vv' 
 script: {{ JULIA }} -c 'using Pkg; Pkg.install(".")' 

That would be the dream, right?

ngam commented 2 years ago

Yes because that would neatly take care of produced artifacts for us. We are quite far from it though if I understand things correctly πŸ˜…

ngam commented 2 years ago

people following, we have a working prototype in https://github.com/conda-forge/pysr-feedstock/pull/43#issuecomment-1231039957