JuliaLang / Pkg.jl

Pkg - Package manager for the Julia programming language
https://pkgdocs.julialang.org

Compatibility with rusty environment chains #1965

Open thisrod opened 4 years ago

thisrod commented 4 years ago

Once upon a time, I installed Plots. I've been slack about updating it.

% julia -e 'using Pkg; pkg"status" ' | grep Plots
  [91a5bcdd] Plots v0.28.4

Recently, I created a package Superfluids, which depends on RecipesBase in the normal way. (Some irrelevant pkg"status" output has been elided.)

% pwd
/Users/rpolkinghorne/.julia/dev/Superfluids
% julia --project
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.5.0 (2020-08-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

┌ Warning: Terminal not fully functional
└ @ Base client.jl:390
julia> using Pkg; pkg"status"
Project Superfluids v0.1.0
Status `~/.julia/dev/Superfluids/Project.toml`
  [3cdcf5f2] RecipesBase v1.0.2

This is all straight out of the manual, so it will just work, right?

julia> using Superfluids
[ Info: Precompiling Superfluids [01347403-e5cf-4a60-9350-9edf4e6960f2]

julia> using Plots
[ Info: Precompiling Plots [91a5bcdd-55d7-5caf-9e0b-520d859cae80]
ERROR: LoadError: LoadError: Unknown ColorScheme `:YlOrRd_r`. Check https://juliagraphics.github.io/ColorSchemes.jl/stable/ for available ColorSchemes.
... crash and burn ...

And Pkg knew that was going to happen:

julia> pkg"add Plots@0.28.4"
...

julia> pkg"status"
Project Superfluids v0.1.0
Status `~/.julia/dev/Superfluids/Project.toml`
  [91a5bcdd] Plots v0.28.4
  [3cdcf5f2] RecipesBase v0.7.0

A fix would have to involve Pkg checking for compatibility with the whole environment chain, and presumably adding packages to the active Manifest to shadow incompatible ones in other environments. I don't understand Pkg well enough to fill in the details.

StefanKarpinski commented 4 years ago

Without the contents of your Project and Manifest files, it's unclear (to me at least) what's going on here.

KristofferC commented 4 years ago

What is going on is that an old Plots version in the global v1.x environment loads a newer incompatible version of RecipesBase, which is installed in the current active environment.

StefanKarpinski commented 4 years ago

If you change or upgrade packages, things may break, so I'm unclear if there's anything to be fixed here?

KristofferC commented 4 years ago

> If you change or upgrade packages, things may break, so I'm unclear if there's anything to be fixed here?

The argument is that Plots is declared incompatible with the version of RecipesBase that got loaded, but since Plots is loaded from a different environment (the global one) than its dependency (the local one), and the package manager only ensures that compatibility is satisfied within an environment, this can cause faulty configurations to load. What is desired is a warning or something similar when a package loads an incompatible dependency. I think this is pretty hard to do with the current system in place though. Best might be to just advise running with LOAD_PATH = ["@"] if you don't want this to happen.
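A minimal sketch of that workaround, run at the start of a session launched with julia --project. It only edits the LOAD_PATH global that code loading consults. One consequence worth checking before relying on it: with "@stdlib" dropped, standard libraries should only be loadable if the active project lists them as dependencies.

```julia
# Restrict code loading to the active project only.
default_entries = copy(LOAD_PATH)   # usually ["@", "@v#.#", "@stdlib"]

empty!(LOAD_PATH)
push!(LOAD_PATH, "@")               # "@" stands for the active project

println("before: ", default_entries)
println("after:  ", LOAD_PATH)
```

The same restriction can be applied from the shell by setting the JULIA_LOAD_PATH environment variable to @ before starting Julia.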

StefanKarpinski commented 4 years ago

The dependencies of earlier environments in the load path are always loaded intact, so your current project should never break because of this. Your development tools later in the load path might break because of something in the current project, which is what you're seeing here.

This is a code loading issue, and code loading cannot do things like add or remove different package versions. It loads code, that is all. If this kind of thing happens, the solution is to install a version of Plots that is compatible with the current project. A simple way to do that is to temporarily add it to the project, note the version that is installed and then add that specific version to your global environment.
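For the "note the version that is installed" step, here is a small sketch that reads the resolved version out of a manifest file instead of scanning pkg"status" output. The helper name manifest_version is hypothetical, and the sketch assumes the TOML standard library (Julia 1.6 or later).

```julia
using TOML  # standard library since Julia 1.6

# Hypothetical helper: return the version that a manifest records for `pkg`,
# or `nothing` if the package is absent or has no version entry.
function manifest_version(manifest_path::AbstractString, pkg::AbstractString)
    manifest = TOML.parsefile(manifest_path)
    # Newer manifests nest packages under "deps"; older ones list them at top level.
    packages = get(manifest, "deps", manifest)
    entries = get(packages, pkg, nothing)
    entries === nothing && return nothing
    version = get(first(entries), "version", nothing)
    return version === nothing ? nothing : VersionNumber(version)
end

# Usage, after temporarily adding Plots to the project:
#   v = manifest_version("Manifest.toml", "Plots")
# then, with the global environment active:
#   Pkg.add(name="Plots", version=string(v))
```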

I don't really get the "straight out of the manual" bit. What manual is this from? If you're reproducing an example, the way to do it is to use a manifest and install the exact versions of everything that the example used.

StefanKarpinski commented 4 years ago

One possible improvement is to record the compatibility constraints of each package in its manifest stanza. So the idea is that when we resolve a manifest, we look at all the things in the manifest that have compat constraints on package A and compute the intersection of all of those and record it in A's stanza. If the set of versions resolved is compatible, then the version of A that is chosen will be in this set. Then, when loading code, one still loads A from the first manifest it appears in and this version will be in the compat set for that manifest, but then we can look through all the later manifests in the load path as well, and if A appears in any of them, we can check if the version of A we loaded is in the compat set for that manifest. If it's not, then it's possible that something in that manifest that depends on A will be broken and we can print a warning about that, either when A is loaded or whenever something in that manifest that depends on A is loaded.
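The load-time check described above can be sketched roughly as follows. Nothing here is Pkg's actual data model: CompatSet, the Dict layout for a manifest, and conflicting_manifests are hypothetical stand-ins, and the [0.7.0, 0.8.0) range is an illustrative assumption rather than the real compat bound declared by Plots.

```julia
# Hypothetical stand-in for a recorded compat set: half-open ranges [lower, upper).
struct CompatSet
    ranges::Vector{Tuple{VersionNumber,VersionNumber}}
end

Base.in(v::VersionNumber, c::CompatSet) = any(lo <= v < hi for (lo, hi) in c.ranges)

# Intersect the constraints of two dependents, range by range (done at resolve time).
function Base.intersect(a::CompatSet, b::CompatSet)
    out = Tuple{VersionNumber,VersionNumber}[]
    for (alo, ahi) in a.ranges, (blo, bhi) in b.ranges
        lo, hi = max(alo, blo), min(ahi, bhi)
        lo < hi && push!(out, (lo, hi))
    end
    return CompatSet(out)
end

# Each later manifest maps a package name to its recorded compat set (assumed layout).
# Returns the positions of later manifests whose compat set excludes the loaded version.
function conflicting_manifests(pkg, loaded::VersionNumber,
                               later::Vector{Dict{String,CompatSet}})
    return [i for (i, m) in enumerate(later) if haskey(m, pkg) && !(loaded in m[pkg])]
end

# The situation from this issue: the project loads RecipesBase v1.0.2, while the
# global manifest (later in the load path) is assumed to record [0.7.0, 0.8.0).
global_env = Dict{String,CompatSet}("RecipesBase" => CompatSet([(v"0.7.0", v"0.8.0")]))
conflicts = conflicting_manifests("RecipesBase", v"1.0.2", [global_env])
isempty(conflicts) || @warn "RecipesBase v1.0.2 is outside the compat set of a later manifest" conflicts
```

Whether the result triggers a warning or a refusal to load is a policy choice layered on top of the same check.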

thisrod commented 4 years ago

I think this is an issue with package installation, not just code loading. When I told Pkg to add RecipesBase to the Superfluids package environment, it knew that Plots v0.28.4 was available to be loaded in that environment. It should have chosen a compatible version of RecipesBase.

If I did things the other way round, adding Plots to the outer environment after I added RecipesBase to the inner one, then it would be a lot harder for Pkg to detect the problem. Recording compatibility in manifests would allow it to be caught at code loading time, which would be a lot better than Plots failing to compile for no obvious reason.

> What manual is this from?

Doing this is the entire point of RecipesBase. The PackageCompiler manual encourages you to compile Plots into your system image, which puts it in the same category as development tools.

StefanKarpinski commented 4 years ago

I really don't think that trying to make all the environments in the load path compatible is reasonable. I really don't want random tools I have installed in my global environment to affect the resolution of packages in the project I'm working on. If I try loading one of my dev tools and it doesn't work, then it's the dev tool I should mess with. Even if we did try to do this, you activate a different environment or modify the load path and, boom, possibly broken.

KristofferC commented 4 years ago

> The PackageCompiler manual encourages you to compile Plots into your system image, which puts it in the same category as development tools.

PackageCompiler.jl is very explicit about the drawbacks of custom sysimages and that this is exactly one of the things you have to look out for.

thisrod commented 4 years ago

> I really don't want random tools I have installed in my global environment to affect the resolution of packages in the project I'm working on.

No doubt that's the right thing for core developers who spend all day working on Julia. But maybe environments under ~/.julia/dev are a special case.

For those of us who work in Julia rather than on it (OK, we're a minority), and use the packages installed in our global environment to do our work, the right thing is for Pkg to resolve a usable environment. (And, ideally, it would be easy to understand the constraints on that environment and diagnose what's stopping it from being updated.)

Real use case: I have a "vortex dynamics paper" project environment, which necessarily includes Plots, either directly or through the global environment. This has a nested "supercomputer simulations" environment, which needs to stay compatible with the project, but should not install Plots every time it is instantiated on the supercomputer.

StefanKarpinski commented 4 years ago

No, I say that entirely as a developer of Julia projects: I do not want the versions of dependencies that my projects use affected by whatever happens to be in the rest of my load path. Each project should be resolved independently, anything else seems like madness. If global dev tools happen to be compatible, great; if not, I can tinker with them.

Let's say Pkg did what you're suggesting: I'm working on some project and I do julia --project and do a ] resolve to make sure that it's compatible with the full load path. Then I start Julia without the --project flag and add/remove/upgrade global packages. Since the project is not in my load path, Pkg picks versions of global packages that are incompatible with the project. Then I do julia --project again and Julia sees that the unchanged project is now incompatible with the rest of the load path. What is supposed to happen? Refuse to run? Automatically re-resolve everything?

It sounds like for your real use case, what you need is a development tools environment that's associated with your project and resolved together with it, where you can have Plots as an optional dev dependency that gets resolved every time the main project dependencies get resolved, but is not installed when someone uses the main project. Note that this can already be accomplished by having a dev environment that depends on the core vortex dynamics environment. Then when you resolve the dev environment, it will pick compatible versions for the entire version graph. There are certainly tooling improvements that would be good to have, and the concept of subprojects would potentially be very useful.

thisrod commented 4 years ago

> What is supposed to happen? Refuse to run? Automatically re-resolve everything?

I like your idea of refusing to load packages that have become incompatible with the project.

> Note that this can already be accomplished by having a dev environment that depends on the core vortex dynamics environment.

Or in my case, a vortex paper environment that depends on the supercomputer environment. Thanks, neat trick. I guess I can copy the vortex manifest to the supercomputer, then resolve the supercomputer environment to use the same versions as the paper one.

But I'm a bit confused. The supercomputer project environment doesn't have a name or a UUID, so what is there for the vortex environment to depend on?

StefanKarpinski commented 4 years ago

I think that people would be very annoyed if Julia refused to run a project because there was a hypothetical conflict with any dev tool that happens to be in the global environment. What if you don't currently care about plotting? Should Julia refuse to let your project load because you might want to plot something later? Note that the project itself is guaranteed to work, so why is it not allowed to run because some dev tool that you're not even using might not work? That seems fairly silly.

A better approach would be to refuse to load the conflicting dev tool. So, if something later in the load path conflicts with something earlier then Julia refuses to load it. That at least makes sense. Better still, since we don't actually know that it won't work just because the compat bounds claim this combination is untested, print a warning and let them at least give it a go. Which is what I originally suggested. A downside to loading with a warning is that once something is successfully loaded into the process, you can't unload it and get a different version whereas if it refuses to load, you have the option to try to fix the load path and try again.

> But I'm a bit confused. The supercomputer project environment doesn't have a name or a UUID, so what is there for the vortex environment to depend on?

So give the project a name and a UUID then. Some additional support for this pattern would be a good idea. Several people have expressed desire for it; it really needs the subproject concept to be made to work smoothly. The idea is that each project could have a dev subproject that gets resolved along with the project and then we change the default load path to include the dev subproject of the active project after the active project itself.
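One way to give an existing environment a name and a UUID is to write the two keys into its Project.toml. The sketch below assumes the TOML and UUIDs standard libraries; the helper name_project! and the project name passed to it are hypothetical. Keys that already exist are left alone, and the file is rewritten in whatever layout TOML.print produces, so hand-written comments in the file would be lost.

```julia
using TOML, UUIDs

# Hypothetical helper: add `name` and a fresh `uuid` to a Project.toml that lacks them.
function name_project!(project_file::AbstractString, name::AbstractString)
    project = TOML.parsefile(project_file)
    get!(project, "name", name)              # keep an existing name if present
    get!(project, "uuid", string(uuid4()))   # keep an existing uuid if present
    open(io -> TOML.print(io, project; sorted=true), project_file, "w")
    return project
end

# Usage (path and name are illustrative):
#   name_project!("supercomputer/Project.toml", "Supercomputer")
```

Once the environment has both keys, another environment should be able to depend on it by path, for example with Pkg.develop(path="supercomputer").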

thisrod commented 4 years ago

> A better approach would be to refuse to load the conflicting dev tool.

That's what I meant. Sorry for the confusion.

> The idea is that each project could have a dev subproject that gets resolved along with the project and then we change the default load path to include the dev subproject of the active project after the active project itself.

Here's another hack. I could maintain separate files for Project.toml and supercomputer/Project.toml, but symlink supercomputer/Manifest.toml to Manifest.toml. That would be fragile, but perhaps it could be made to work if every supercomputer dependency was also a project dependency.
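A rough sketch of that hack, including the precondition that every supercomputer dependency is also a project dependency. The helper names deps_subset and share_manifest are hypothetical, and whether Pkg tolerates a manifest shared this way is untested here.

```julia
using TOML

# Hypothetical check: every dependency of the subproject is also a dependency
# of the parent project, with the same UUID.
function deps_subset(sub_project::AbstractString, parent_project::AbstractString)
    sub = get(TOML.parsefile(sub_project), "deps", Dict{String,Any}())
    parent = get(TOML.parsefile(parent_project), "deps", Dict{String,Any}())
    return all(haskey(parent, name) && parent[name] == uuid for (name, uuid) in sub)
end

# Point <subdir>/Manifest.toml at the parent manifest, if the precondition holds.
function share_manifest(root::AbstractString; subdir::AbstractString = "supercomputer")
    deps_subset(joinpath(root, subdir, "Project.toml"), joinpath(root, "Project.toml")) ||
        error("$subdir has dependencies that the parent project lacks")
    link = joinpath(root, subdir, "Manifest.toml")
    # The relative target is resolved from the directory that contains the link.
    symlink(joinpath("..", "Manifest.toml"), link)
    return link
end
```

Any Pkg operation run inside the subdirectory would then write through the link to the parent manifest, which is one reason the arrangement is fragile.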

Has anyone thought about making the subproject idea more symmetric? Instead of a hierarchy of subprojects, there could be a set of projects that share a common manifest, and resolve their dependencies to be compatible with the whole set. (Handling the intersections of those sets could get complicated.) Those sets would be independent of the loading path hierarchy, which might be simpler to implement.

thisrod commented 4 years ago

I might have a go at implementing project sets. From what I can see, the required changes would be: