Closed mpizenberg closed 2 years ago
It is a big change, I could use some pointers on where to start looking this over. What are the core new ideas? What advantages do they bring?
The core idea of this API design attempt is to relax the Version
trait enough to enable more features but still avoid making Range
a user-implemented trait. Instead I went for a middle-ground of adding an Interval
trait that has to be provided by the user. (PS for the time being I kept the Range
name but would like to rename it to Ranges
plural).
So the traits to implement by the user are the following.
pub trait Version: Clone + Ord + Debug + Display {
/// Returns the minimum version.
fn minimum() -> Bound<Self>;
/// Returns the maximum version.
fn maximum() -> Bound<Self>;
}
/// An interval is a bounded domain containing all values
/// between its starting and ending bounds.
pub trait Interval<V>: RangeBounds<V> + Debug + Clone + Eq + PartialEq {
/// Create an interval from its starting and ending bounds.
/// It's the caller responsability to order them correctly.
fn new(start_bound: Bound<V>, end_bound: Bound<V>) -> Self;
}
The Interval
type extends std::ops::RangeBounds
so the user has to implement that trait as well.
pub trait RangeBounds<T> {
pub fn start_bound(&self) -> Bound<&T>;
pub fn end_bound(&self) -> Bound<&T>;
pub fn contains<U>(&self, item: &U) -> bool { ... }
This means giving the start and the end of the interval as bounds (potentially infinite). And a contains
method is provided with a default implementation (check if within bounds) but can be overridden. For example, for semantic versions, one could check if a bound contains a pre-release marker to alter the contains
implementation.
But the cool change is that the Range
type is still handled by ourselves. It is defined as follows.
pub struct Range<I, V> {
segments: SmallVec<I>,
phantom: PhantomData<V>,
}
So basically the same as before with the exception that it now has two type variables, one for Interval
, one for Version
. I could not figure out if I could just have the I
and not the V
because it was giving me compiler issues in the impl block. (ooohh I now realize that V
could be an associate type of Interval
instead of a type variable ... I changed all types everywhere ...)
impl<I: Interval<V>, V: Version> Range<I, V> {
...
}
Since Version
is much less strict than before, it enables things like continuous types since there is no more bump
(which pre-releases kind-of implies) and infinite double bounds, not only on the right side. And in our Range
implementation, the contains
method is implemented as before, except that instead of returning true
when a version is inside one of the intervals, it delegates a call to Interval::contains
and the user might decide that it does not contains that version (e.g. pre-releases if bounds do not contains pre-releases).
The downside is that the user cannot override the behavior for versions outside of a given interval, but I don't see that being a desirable property. The user also do not see the full Range
when answering for contains
, only one of its Interval
so it would not be possible to introduce faithfully things like "no pre-release except if that's the only version satisfying a given constraint" (in PEP 440), because it would not know if potentially another segment in the range might contain a non-pre-release version. I think that would be an extremely rare case anyway.
But the huge advantage is that we keep the control of Range
and its quite hard to implement functions like intersection
and complement
. Proof is I introduced a bug when rewriting it XD and now need to chase it ^^.
To have a look at what's new, you can just look at the range_trait
(misnamed) and version_trait
modules. The rest of the changes are just type variables manipulations.
I think this will be cleaner with a associated type for V
. I will do a closer review when tests are passing. Then try to see how to support the Semver Crate with this approach.
Thanks for the explanation! I had a few minutes to look this over. It is definitely an intriguing approach!
I am not seeing how to construct a Range
that represents something like Semvers >=2.0.0-alfa
, witch is "(non-pre versions >= 2.0.0) union (pre versions >= 2.0.0-alfa and < 2.0.1)". I can see how to make an Interval
that stores whether its contains
matches a pre
, but Range
only constructs I
using I::new
so I don't see how to get a Range
that has some Intervals that do and some that don't.
I am not seeing how to construct a Range that represents something like Semvers >=2.0.0-alfa
Well if your semver type is something like
impl SemVer {
fn new(major: u32, minor: u32, patch: u32, pre: Option<String>) -> Self { ... }
}
Then you'd just call
SemInterval::new(Bound::Included(SemVer::new(2, 0, 0, Some("alpha"))), Bound::Unbounded)
And within the SemInterval
implementation, you can override the contains method to check for a pre-release in bound. Something like
impl RangeBounds<SemVer> for SemInterval {
fn start_bound(&self) -> Bound<&SemVer> { ... }
fn end_bound(&self) -> Bound<&SemVer> { ... }
// this one we override the default implementation provided by `RangeBounds`
fn contains(&self, item: &SemVer) -> bool {
// 2.0.0-beta
if item.is_prerelease() {
// 2.0.0-alpha 2.0.0-alpha, 2.0.0
self.start.is_prerelease() && item.within_bounds(self.start, self.start.without_prerelease())
// 4.5.3-beta, 4.5.3-alpha, 4.5.3-beta, 4.5.3 4.5.3-beta
|| self.end.is_prerelease() && item.lower_than(self.end) && item.without_prerelease().higher_than(self.end)
} else {
item.within_bounds(self.start, self.end)
}
}
}
It's more or less pseudo code so take that with a grain of salt as I've not thought through it all, just writing stuff as it comes to mind right now.
PS, I've edited the above example code for contains()
at least a couple times so don't trust what the email you received says XD
The above code example might be horribly wrong, the important notion is that "if we can explain in english with simple rules if a version (pre-release, or not) is included in an interval by just looking at the interval bounds, then we can do it in code too"
but Range only constructs I using I::new so I don't see how to get a Range that has some Intervals that do and some that don't.
Oh maybe I misinterpret the question. So the thing is that Range
knows each interval bound. So the Range
can check bounds of its stored intervals (which are non-intersecting) and once it reaches one that should contain the given version, instead of returning true
it returns that_interval.contains(that_version)
.
See for example here: https://github.com/pubgrub-rs/pubgrub/blob/948d2c4ac3c862bca47f66b16e31d0ed376e9645/src/range_trait.rs#L293
I should definitely add some ascii segments drawing for each branch of that Range.contains()
method.
That all makes sense except for one small part. When I call SemInterval::new(Bound::Included(SemVer::new(2, 0, 0, Some("alpha"))), Bound::Unbounded)
I get a SemInterval
but I need a Range
to put in my DependencyConstraints
.
That all makes sense except for one small part. When I call
SemInterval::new(Bound::Included(SemVer::new(2, 0, 0, Some("alpha"))), Bound::Unbounded)
I get aSemInterval
but I need aRange
to put in myDependencyConstraints
.
Well we expose the same API as before for Range
which will create those intervals internally. So for example
/// Set of all versions comprised between two given versions.
/// The lower bound is included and the higher bound excluded.
/// `v1 <= v < v2`.
pub fn between(v1: impl Into<V>, v2: impl Into<V>) -> Self {
let start_bound = Bound::Included(v1.into());
let end_bound = Bound::Excluded(v2.into());
Self {
segments: SmallVec::one(I::new(start_bound, end_bound)),
phantom: PhantomData,
}
}
That's why I needed to add the new()
function the the Interval
trait. I didn't yet found an alternative but maybe there is a smarter way to do that.
That's what you'll find here: https://github.com/pubgrub-rs/pubgrub/blob/948d2c4ac3c862bca47f66b16e31d0ed376e9645/src/range_trait.rs#L111
The above code example might be horribly wrong, the important notion is that "if we can explain in english with simple rules if a version (pre-release, or not) is included in an interval by just looking at the interval bounds, then we can do it in code too"
It took a few readings before this sank in. But now that I get it, it sounds worth a try!
It took a few readings before this sank in. But now that I get it, it sounds worth a try!
If the VersionReq
type in the semver crate is what I believe it is, contains
(Interval::contains
not Range::contains
) could literally be implemented by re-using their matches
method. So something in the like of
self.into_version_req().matches(version)
Of course we could just rewrite the content of that function, that would be more efficient and simpler than doing the into_version_req
conversion.
So one thing we are going to need to be careful of is that range.contains(v) == false
does not mean that range.complement().contains(v) == true
. I am not sure all the places this may bight us. One that I have thought of is:
foo="<5.0.0"
2.0.0-alfa
but it is not compatible with this Reqfoo="<2.0.0-beta"
. If we are not careful we will decided that because 2.0.0-beta
< 5.0.0
and there are no versions in "<5.0.0"
then there are no version in "<2.0.0-beta"
Hum, very true ... Breaking the strict meaning of contains
in an asymmetric manner might cause issues with backtracking and saved NoVersion
incompatibilities.
So one thing we are going to need to be careful of is that
range.contains(v) == false
does not mean thatrange.complement().contains(v) == true
. I am not sure all the places this may bight us. One that I have thought of is:1. We have a dependency `foo="<5.0.0"` 2. We discover that there are no matching versions; unbenounced to us there is a `2.0.0-alfa` but it is not compatible with this Req 3. We backtrack 4. We find a new dependency `foo="<2.0.0-beta"`. If we are not careful we will decided that because `2.0.0-beta` < `5.0.0` and there are no versions in `"<5.0.0"` then there are no version in `"<2.0.0-beta"`
Do we know how dart pub handles such a situation?
Do we know how dart pub handles such a situation?
So I cloned pub and added the following test:
test('backtracking to pre-release after NoVersions', () async {
await servePackages((builder) {
builder.serve('a', '1.1.0', deps: {'b': '^1.0.0'});
builder.serve('b', '1.1.0-alpha');
builder.serve('a', '1.0.0', deps: {'b': '^1.1.0-alpha'});
});
await d.appDir({
'a': '^1.0.0',
}).create();
await expectResolves(result: {
'a': '1.1.0',
'b': '1.1.0-alpha',
});
});
And that test is passing. Meaning they accept b 1.1.0-alpha
even though a 1.1.0
asked for b ^1.0.0
so without pre-release. I suppose they do that to never generate a NoVersion
incompat if only viable versions are pre-releases? Or maybe it's for another reason. Might be worth asking them?
Hum, however, if I go one dependency deeper to force a backtracking to happen, it seems to make pub solver fail!
test('backtracking to pre-release after NoVersions', () async {
await servePackages((builder) {
builder.serve('a', '1.1.0', deps: {'b': '^1.0.0'});
builder.serve('b', '1.0.0', deps: {'c': '^1.0.0'});
builder.serve('c', '0.9.0');
builder.serve('b', '1.1.0-alpha');
builder.serve('a', '1.0.0', deps: {'b': '^1.1.0-alpha'});
});
await d.appDir({
'a': '^1.0.0',
}).create();
await expectResolves();
// await expectResolves(result: {
// 'a': '1.1.0',
// 'b': '1.1.0-alpha',
// });
});
This test fails even though I'm just asking it expectResolves()
which should pass if there is any solution. Which we know there are. Because a 1.0.0, b 1.1.0-alpha
is a valid solution, or even a 1.1.0, b 1.1.0-alpha
if it behaved like the other example above with one less dependency deep.
I've created a PR on pub repo with the failing test asking for their thoughts: https://github.com/dart-lang/pub/pull/3038
Do we know how dart pub handles such a situation?
I was going to do more research but I thought the answer was "by having more Range like rules around when pre releases match", looks like you did the research and found the real answer is "with bugs".
Don't bother looking at those new commits. I'm trying stuff to figure out what is wrong with the intervals. For now, I've added logging, which lead me to discover that one assignment intersection was wrong at some point. Leading me to the fact that there is a problem with intersection
since if I set
r1
to < 1 or > 1
(basically the complement of the singleton with just 1
)r2
to >= 1, < 2
(the range 1..2
)Then we have r1.intersection(r2)
that is correct, returning > 1, < 2
but r2.intersection(r1)
that is incorrect, returning >= 1, < 2
.
Typically, this should have been spotted by the intersection_is_symmetric
property test, but it's not, even if I increase very much the number of proptest cases. So I've been trying to improve the strategy for generation of ranges with bounds that may take all possible shapes, but without success for now.
I've temporarily deactivated the serde feature and all tests (except with serde) are passing :)
I didn't found a way to trigger the symmetric intersection tests though, so I just fixed the bug in intersection
:(. I didn't know it could be possible to be unhappy about fixing a long standing bug but that's what I feel after failing to trigger it with property tests ahah.
Before evaluating how much of a performance drain is the usage of bounds, I thus need to fix all the code related to the serde feature, to be able to run the benchmarks. I'll probably also have to re-encode the ron files. I'm not sure how I'll do that but will see.
The code in this branch will also need a decent amount of cleaning. And while evaluating perf, verifying that the log::
usage does not impact performances.
If there is a bug in intersection, let's make the fix a separate PR so we can get it out there ASAP.
Nope no worry, there was a bug in my reimplementation of intersection for the Interval API in this PR. The main one that is released is fine.
On Sun, Aug 8, 2021, 16:05 Jacob Finkelman @.***> wrote:
If there is a bug in intersection, let's make the fix a separate PR so we can get it out there ASAP.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/pubgrub-rs/pubgrub/pull/99#issuecomment-894802948, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAWFOCLGP3OO522HKD5FHSDT32FJJANCNFSM47EEHCYA .
This has been superseded by #112
Tests are not passing yet. It is quite a big change so some bugs were inserted (not on purpose ^^)