KristofferC opened 4 months ago
Also cc @ChrisRackauckas who has thought about this stuff.
Trying this on Plots.jl we get resolver errors (as @oxinabox predicted) due to Scratch.jl being lower bounded to 1.0.0 while one of our dependencies (RelocatableFolders) has a lower bound of 1.1:
Hm, shouldn't there be a way to work around that, though? Like if we flip the situation and look at `up`: if Plots.jl upper bounded Scratch.jl to 1.0.0, and RelocatableFolders upper bounded it to 1.1, then the resolver would just install it at version 1.0.0 instead of erroring, right?

I might be missing something, but I think it should be possible for `down` to be fully symmetric with `up`.
> I might be missing something, but I think it should be possible for `down` to be fully symmetric with `up`.
Yes, but as I wrote in https://github.com/JuliaLang/Pkg.jl/issues/1062#issuecomment-1952096504, the whole reason to do this is exactly to test that your code works with the given lower bounds. So if you aren't hitting them exactly, you are just testing some semi-random version configuration that has low versions "on average". Like in this case, if we install Scratch at 1.1, are the Scratch 1.0 lower bounds correct? You cannot really say anything about that, but that is the actual question we wanted to answer by running our tests with `down`.
I guess you could say that you could, in theory, minimize the version for one dependency at a time and run the tests for that, but that would have its own problems.
Ah okay, sorry I missed that comment. So this is (a nicer implementation of) the idea in https://github.com/julia-actions/julia-downgrade-compat
I'll say that I didn't really find downgrade-compat very useful for packages with many deps, because:
1) the complaints from the resolver when it fails are indecipherable and very hard to turn into something actionable if you're not an expert
2) if the lower bounds of your package aren't actually installable due to being unresolvable, I don't see why that's actually problematic. It's not a state a user would end up in, so it doesn't really need to be tested. What I do want to be able to test is the oldest set of compatible packages.
> What I do want to be able to test is the oldest set of compatible packages.
Okay, I think this can be extended to do that. I will try...
Edit: Thinking about this some more, I think it basically requires a full resolver, since, for example, there is not a unique answer to the question of the "oldest set of compatible packages".
> Edit: Thinking about this some more, I think it requires basically a full resolver since there is for example not a unique answer to this question of the "oldest set of compatible packages".
Yeah, I was imagining the exact same behaviour as `up`, but instead of seeking newer versions it seeks older versions. It won't catch everything, since the set of versions isn't unique, but that seems "good enough" for trying to keep track of appropriate lower bounds, and much less annoying than having the resolver simply error out because of an incompatible set of lower bounds (at least currently, since it is so hard to know how to fix the resolver issues when the lower bounds aren't compatible).

After all, we also think it's okay that when I `]test` a package with a given set of upper bounds (either explicit or implicit), we don't end up testing the unique newest set of packages, and it'd be really aggravating if the resolver threw an error just because it wasn't able to install the maximally newest compatible version of every dependency.
> Yeah, I was imagining the exact same behaviour as `up`, but instead of seeking newer versions it seeks older versions.
Again, I am not sure that is useful in answering the question of whether our lower bounds are correct. If you run the tests with this version of `down`, you might have the following situation:

```toml
[compat]
A = "0.5.1"
B = "0.7.1"
```

and when testing you get `A = v0.5.1` and `B = v0.7.3`. However, in another environment you could end up with another Pareto-optimal resolution, say `A = v0.5.3` and `B = v0.7.1` (the resolver is free to give any Pareto-optimal resolution). So the fact that there are multiple Pareto-optimal resolutions means that our testing will be insufficient to say that our lower bounds are correct.
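To make the multiple-minima point concrete, here is a small brute-force sketch in Python (the version sets and the cross-constraint between A and B are made up for illustration; this is not the real resolver):

```python
from itertools import product

# Hypothetical available versions, as (major, minor, patch) tuples.
A_VERSIONS = [(0, 5, 1), (0, 5, 2), (0, 5, 3)]
B_VERSIONS = [(0, 7, 1), (0, 7, 2), (0, 7, 3)]

def compatible(a, b):
    # Made-up cross-constraint: A before v0.5.3 requires B >= 0.7.3,
    # while A v0.5.3 works with any B. Mimics an indirect restriction.
    if a < (0, 5, 3):
        return b >= (0, 7, 3)
    return True

candidates = [(a, b) for a, b in product(A_VERSIONS, B_VERSIONS) if compatible(a, b)]

def pareto_minimal(cands):
    # A candidate is Pareto-minimal if no *other* candidate is <= in both
    # coordinates (tuple comparison orders versions lexicographically).
    return [
        c for c in cands
        if not any(o != c and o[0] <= c[0] and o[1] <= c[1] for o in cands)
    ]

for a, b in pareto_minimal(candidates):
    print("A = v%s, B = v%s" % (".".join(map(str, a)), ".".join(map(str, b))))
```

This prints both `A = v0.5.1, B = v0.7.3` and `A = v0.5.3, B = v0.7.1`: neither dominates the other, so a resolver is free to return either one.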
> After all, we also think it's okay that when I `]test` a package with a given set of upper bounds (either explicit or implicit), we don't end up testing the unique newest set of packages,
I think this is different, because when we are normally testing our package we are not explicitly testing that our upper bounds are correct; those are talking about an unknown future, which can only be discussed from a "contract" point of view like semver. That's why I think this is not symmetric.
For normal testing we are saying:
For `down` testing we are saying:
That's fair.
I guess, though, I feel that having a `]down` that acts like `]up` at least seems like an easier path forward to getting people (imperfectly) testing their lower bounds than having something very strict and fussy.
If we do go with something as strict and fussy as pinning all the lower bounds, then I think nobody is going to use it unless we get much more informative and actionable error messages than we currently have, but making those error messages informative and actionable seems like a very difficult task. I don't know enough about this to have a strong opinion though.
I'm not sure the error messages will be so bad. They will basically always be of the form that one of the direct dependencies limits another of the direct dependencies to something higher than the lower bound that the package itself sets on it. For example, in the Plots example the relevant part is:
```
ERROR: Unsatisfiable requirements detected for package Scratch [6c6a2e73]:
Scratch [6c6a2e73] log:
├─possible versions are: 1.0.0 - 1.2.1 or uninstalled
├─restricted to versions 1.0.0 by an explicit requirement, leaving only versions: 1.0.0
└─restricted by compatibility requirements with RelocatableFolders [05181044] to versions: 1.1.0 - 1.2.1 — no versions left
```
That said, maybe with this implementation it's not so critical that the error messages are good, since the feedback loop is so much faster than with the downgrade CI. Looking back, the main reason I gave up after trying to use it was that I'd need to go and semi-blindly adjust my bounds, run the CI process, get a new error, adjust them again, get a new error, etc. The process here would be fundamentally the same, but in practice not so painful, because I just need the `Project.toml` file open and a REPL, and I can just tweak the bounds and keep hitting `]down` until I get something resolvable...
> I'm not sure the error messages will be so bad. They will basically always be of the form that one of the direct dependencies limits another of the direct dependencies to something higher than the lower bound that the package itself sets on it. For example, in the Plots example the relevant part is:
>
> ```
> ERROR: Unsatisfiable requirements detected for package Scratch [6c6a2e73]:
> Scratch [6c6a2e73] log:
> ├─possible versions are: 1.0.0 - 1.2.1 or uninstalled
> ├─restricted to versions 1.0.0 by an explicit requirement, leaving only versions: 1.0.0
> └─restricted by compatibility requirements with RelocatableFolders [05181044] to versions: 1.1.0 - 1.2.1 — no versions left
> ```
In my experience, I couldn't figure out what was going on. Sure, it's not so bad when direct dependencies start getting angry at each other, but once you have nested indirect dependencies fighting, I get thoroughly confused. But as said above, maybe it's not so bad with this implementation.
In https://github.com/julia-actions/julia-downgrade-compat, why does it set `~` on the post-1.0 package:

> This action will modify the compat to:
>
> ```
> [compat]
> julia = "1.6"
> Foo = "~1.2.3"
> Bar = "=0.1.2"
> ```
What if you require a bugfix in Foo 1.2.4 (so the `Foo = "1.2.3"` entry is incorrect)?
You can set which mode it uses with a command line arg: https://github.com/julia-actions/julia-downgrade-compat/blob/main/downgrade.jl#L41-L47, but yeah I can't comment on the specific choices made there.
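For context, the `~` and `=` pinning styles used by the action have these semantics under Julia's compat rules; a rough Python sketch (the `matches` helper is made up for illustration, not a Pkg API, and only handles three-part versions):

```python
def parse(v):
    # "1.2.3" -> (1, 2, 3); tuple comparison then orders versions.
    return tuple(int(x) for x in v.split("."))

def matches(spec, version):
    v = parse(version)
    if spec.startswith("="):   # =0.1.2 : exactly that version
        return v == parse(spec[1:])
    if spec.startswith("~"):   # ~1.2.3 : >= 1.2.3 and < 1.3.0
        lo = parse(spec[1:])
        return lo <= v < (lo[0], lo[1] + 1, 0)
    raise ValueError("only ~ and = are handled in this sketch")

# "~1.2.3" still admits the 1.2.4 bugfix release...
assert matches("~1.2.3", "1.2.4")
# ...but not the next minor release,
assert not matches("~1.2.3", "1.3.0")
# while "=0.1.2" pins exactly one version.
assert not matches("=0.1.2", "0.1.3")
```

So with `~`, a downgrade run could still resolve Foo at 1.2.4 and the test would not notice that the declared 1.2.3 lower bound is too low, which seems to be the concern above.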
Just my 2c: I would definitely expect `down` to be just like `up`, but in the opposite direction. That is, prefer the oldest possible versions, but otherwise always resolve the env successfully – just like `up` does.

Stricter behavior could be enabled by a switch similar to the existing `test(force_latest_compatible_version=true)`. This would make everything nice and consistent.
Ok, you expect this, but why would it be useful?
I guess that the name `down` will (rightfully) generate the expectation of an operation that is the opposite of `up`. Maybe something like `pin_down`?
> but I think it should be possible for down to be fully symmetric with up.
That can't be true in general (unless I'm misunderstanding precisely what you mean). E.g.,

```toml
[compat]
PkgA = "1.2.3"
```

means I'm compatible with `v1.2.4`, `v1.2.5`, ..., `v1.3.1`, ..., but I'm incompatible with `v1.2.2` or anything lower. So, it's not symmetric, right?
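For reference, Julia's default compat entry behaves like a caret bound: an exact closed lower end and an open upper end at the next breaking release. A rough Python sketch (the `caret_matches` helper is hypothetical, not a Pkg API, and only handles post-1.0, three-part versions):

```python
def parse(v):
    # "1.2.3" -> (1, 2, 3); tuple comparison then orders versions.
    return tuple(int(x) for x in v.split("."))

def caret_matches(spec, version):
    # Default compat "1.2.3" means the interval [1.2.3, 2.0.0): the lower
    # end is the exact declared version, the upper end is the next major
    # release (for post-1.0 packages).
    lo, v = parse(spec), parse(version)
    hi = (lo[0] + 1, 0, 0)
    return lo <= v < hi

assert caret_matches("1.2.3", "1.2.4")      # newer patch: compatible
assert caret_matches("1.2.3", "1.3.1")      # newer minor: compatible
assert not caret_matches("1.2.3", "1.2.2")  # anything older: incompatible
assert not caret_matches("1.2.3", "2.0.0")  # next major: incompatible
```

The asymmetry being argued here is that the lower end is a single exact version while the upper end is a whole open range up to the next breaking release.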
> I guess that the name down will (rightfully) generate the expectation of an opposite operation to up. Maybe something like pin_down?
Since this is intended to be used only for testing, why not make that explicit in the name? E.g., `pkg> test down`, or `pkg> downgrade test`?
> Ok you expect this but why would it be useful?
Having a less fussy `]down` that gives a reasonable set of old versions I can test on, without having to comb through and meticulously purge my Project.toml of incompatible lower bounds, would give most of the testing utility of the more fussy `]bottom` or whatever, while wasting less of my time playing compat-bound whack-a-mole.

I definitely see the value of a `]bottom` functionality, but for the most part I'd honestly just rather use a `]down` that's symmetric with `]up`.
> That can't be true in general (unless I'm misunderstanding precisely what you mean). E.g.,
>
> ```
> [compat]
> PkgA = "1.2.3"
> ```
>
> means I'm compatible with `v1.2.4`, `v1.2.5`, ..., `v1.3.1`, ..., but I'm incompatible with `v1.2.2` or anything lower. So, it's not symmetric, right?
Where is the asymmetry you're pointing to? There's only a finite number of versions of any package in the General registry, so you can't end up with `v1.2.∞`. That means there is some `N` such that the versions are bounded between `v1.2.3` and `v1.2.N`.
Kristoffer is right in that there's an asymmetry induced by the fact that new features can only appear in one direction, so it's good to have a function that demands the absolute oldest version set, whereas it makes less sense to demand the absolute newest version set, but even then I have some doubts.
You could say that it's important that you get the absolute newest version set if you want to test for potentially breaking changes in your dependencies. But we're okay with testing on just a reasonably new set of compatible packages, so I'd also be okay with testing on a reasonably old set.
> Having a less fussy ]down that gives a reasonable set of old versions I can test on without having to comb through and meticulously purge my Project.toml of incompatible lower bounds would give most of the testing utility…
I disagree: you would have failed to verify your lower bounds, so even if your tests pass, your lower bounds can still be buggy and give runtime errors for users (which is what you wanted to avoid in the first place).
Yes, but perfect can be the enemy of good.
There are also other positions in the compat interval not tested.
Like if you declare compat for `1, 2, 3`, but v2 actually removed a feature that v1 had and that v3 brought back. Testing both the top and the bottom isn't going to be enough.

Testing a lowest possible version set that can be achieved in practice (potentially one of multiple), one that a user might actually see, seems like a good point in the trade-off space of time spent chasing dependencies vs. coverage of potential mistakes and problems; the time spent chasing goes to zero, since you aren't required to transcribe your dependencies' lower bounds wherever they are higher than your own.
Very quick implementation of https://github.com/JuliaLang/Pkg.jl/issues/1062#issuecomment-1952014608 (with some debug logging in it to show what happens), specifically:
An example of its usage:
Trying this on Plots.jl we get resolver errors (as @oxinabox predicted) due to Scratch.jl being lower bounded to 1.0.0 while one of our dependencies (RelocatableFolders) has a lower bound of 1.1:
cc @carlobaldassi, @oxinabox