Closed konstin closed 3 weeks ago
Is a significantly more complicated signature. This is a large next step toward DependencyProvider becoming an advanced user only interface. Although that may be an inevitable consequence of having production users.
We can also keep the DependencyProvider
more constraint, we only call add_incompatibility_from_dependencies
.
If it's relevant, i can do some benchmarking with this changed and with it rolled back in uv.
It is not obvious that this is the "right" abstraction for the situation. Would something else be more ergonomic for users? Would something else be more compatible with future improvements iterator?
I tried to abstract it into a trait and other tricks, but (my) rust is too limited to figure out something better.
I've reverted the biggest change to the dependency provider interface, we can also merge Dependencies<DependencyConstraints<String, SemVS>, Self::M>
into a single type again when we're aligned on this PR in general.
It is not obvious that this is the "right" abstraction for the situation. Would something else be more ergonomic for users? Would something else be more compatible with future improvements iterator?
I tried to abstract it into a trait and other tricks, but (my) rust is too limited to figure out something better.
Having ask you to make this PR and then had second thoughts while reviewing it with the second thoughts being abstract "does there exist" questions I owed you my full time and attention today. I went to a lot of harebrained ideas, most of which obviously wouldn't work without even looking at an editor. Eventually I had the idea of a DependencyAdder
which was a wrapper around &mut state
for the basic operations we want to stabilize. (Add one dependency, and iterator of dependencies, Mark this version is unavailable. That's all for now but we could add more later.) This didn't work because of type shenanigans. Then I thought about generalizing it so that it invoked a user provided callbacks for each stabilized method. Halfway through implement it, I realized I was hand rolling a vtable. So I tried making a trait. The result is https://github.com/pubgrub-rs/pubgrub/compare/dev...Eh2406:pubgrub:add_deps_take_2
What are people's thoughts? Should I open a PR for us to discuss that proposal?
struct DependencyAdder<'c, DP: DependencyProvider> {
state: &'c mut State<DP>,
package: &'c DP::P,
version: &'c DP::V,
err: Option<PubGrubError<DP>>,
}
Isn’t the fact that this struct holds a mut ref to the state going to be super annoying for implementations with async concurrency? Or is this structure always only short lived? (Sorry if my question doesn’t make sense)
Waiting on Konstin feedback on this.
Otherwise, I don’t remember exactly why we did not use impl IntoIterator
. I’m pretty sure we tried. Was it because of the simplicity tradeoff? or because we did it and benchmarked no perf improvement? or maybe because it was not well supported by the compiler at the time (we had a few decisions related to compiler support I recall)?
The &mut
data passing makes DependencyProvider
a lot more complex. I do like the add_incompatibility_from_dependency
a lot though, i have to check if that may be all we need, though i probably won't get to it today.
Isn’t the fact that this struct holds a mut ref to the state going to be super annoying for implementations with async concurrency? Or is this structure always only short lived?
It is short lived. So I don't expect trouble. But I don't maintain a async api, so not 100%. If it makes problems for UV, they can use Mutex<&mut State>
as they have their own version of resolve
already.
Otherwise, I don’t remember exactly why we did not use
impl IntoIterator
. ... maybe because it was not well supported by the compiler at the time (we had a few decisions related to compiler support I recall)?
impl trait
in trait only recently stabilized. So that is why it did not work when we last tried it. If you like the ergonomics of that API, we should merge this PR as originally proposed.
I think the lib is already too complex to be used without looking at docs. If we want people making useful things with pubgrub they should implement quite a lot themselve already so I don’t think this change would add too much barrier to entry.
I'm having a lot of thoughts, but I'm not sure what they are. I'm going to start writing in the hope that I can figure it out by the time I get to the end. If this ends up being incoherent I'm sorry.
There bunch of things I want the get_dependencies
API to have. But the world is not perfect and were going to have to compromise on some of them.
So let's start by evaluate the existing API: ) -> Result<Dependencies<DP::P, DP::VS, DP::M>, DP::Err>
.
Dependencies
(if we market nonexhaustive), but that's a really awkward way to add a new kind of edge. The new syntax would need to duplicate all the existing functionality, and add more functionality, and then we need to document why you would use one or the other.Next let's look at the original proposal from this PR. ) -> Result<Dependencies<impl IntoIterator<Item = (Self::P, Self::VS)> + Clone, Self::M>, Self::Err,>
.
impl Trait
, and Item =
(I don't even remember the name of that syntax).impl IntoIterator<Item = (Self::P, Self::VS)> + Clone
&
then yes. (.iter().map...
) If you have a Rc
or RefCell
then no.What if we add a new Enum for "future-proof". Basically combining this PR with #124. Something like ) -> Result<Dependencies<impl IntoIterator<Item = Requirement<Self::P, Self::VS>> + Clone, Self::M>, Self::Err,>
where Requirement
is Required | Constrained
.
&
then yes. (.iter().map...
) If you have a Rc
or RefCell
then no.Requirement
without breaking things. As long as what we are adding are "edges" were good.Having two different Enum is really stretching our simplicity budget. Perhaps we can combine them? Something like ) -> Result<impl IntoIterator<Item = Requirement<Self::P, Self::VS, Self::M>> + Clone, Self::Err,>
where Requirement
is Required | Constrained | Unavailable
.
&
then yes. (.iter().map...
) If you have a Rc
or RefCell
then no.Requirement
without breaking things. As long as what we are adding are "edges" were good.A note on the + Clone
, it's only really required for checking for self dependencies. We may be able to avoid it with .inspect
or some such. But I don't think it changes any of the fundamental argument.
Now let's evaluate my harebrained idea from yesterday, add_deps_take_2. dependency_adder: &mut DependencyVisitor<Self::P, Self::VS, Self::M>>) -> Result<(), Self::Err>
DependencyVisitor
without breaking anyone.So I've gotten to the end, and I still don't know what I was trying to say. It seems to have turned into a brainstorming about #148. One take away is that eventually we will probably grow a SimplifiedDependencyProvider
with an api like 0.2
that can be used if the full complexity of DependencyProvider
is not needed.
In the time I spent writing this Matthieu commented that the additional API complexity is not a critical blocker. So we should probably merge this PR with its original configuration. But we should do so recognizing that there are probably other changes that will be needed. This is not a "one-way door", it is totally okay to merge this and then make more changes.
One take away is that eventually we will probably grow a SimplifiedDependencyProvider with an api like 0.2
I’d think the same, so as long as it’s possible to rewrite a v0.2-like api from v0.3 I don’t think your point (2) will be an issue.
What if we add a new Enum for "future-proof". Basically combining this PR with https://github.com/pubgrub-rs/pubgrub/pull/124. Something like ) -> Result<Dependencies<impl IntoIterator<Item = Requirement<Self::P, Self::VS>> + Clone, Self::M>, Self::Err,> where Requirement is Required | Constrained.
I remember we discussed a lot about this. My point of view is that we should base this type of changes on the theory (incompatibilities). I think this type of change, enabling dependencies to provide more fine-grain type of incompatibilities, instead of just the shape {a, not b} should be approached holistically, not iteratively with enums variants appended one after the other.
But also, I’m also not worried about compatibility, because just like it’s possible to make v0.2-like wrappers when v0.3 is released, it can be exactly the same for v0.4 because this is a strict superset of capabilities.
I think this type of change, enabling dependencies to provide more fine-grain type of incompatibilities, instead of just the shape {a, not b} should be approached holistically, not iteratively with enums variants appended one after the other.
I don't quite follow. Trying to figure out what you had in mind. If we want to provide direct access to the solver, that will probably involve public access to "state" which is probably worth doing but its own headache. Even if we did that, we would still want DependencyProvider
to be quite flexible. So I don't think it helps me answer the question what API DependencyProvider
should offer. Taking your comment in different direction, any approach that succeeds at "future-proof" can always add a direct "raw incompatibility" version (either in a non-variant "Raw" or a .add_raw_incompatibility()
depending). Which seems like even more reason to figure out the future proofing story. Even if we provided such a raw API, I would want to have some amount of API that makes it clear what is "a pattern we know works" and what is "feel free to try it and see what happens". Perhaps by having named variance or named callbacks or by having the constructors of incompatibilities for the known patterns. So those are some thoughts I'm having. What did you have in mind by "approached holistically"?
What I mean by "holistically" is that all possibilities should be evaluated and added to the api at the same time. There is a finite number of incompatibility types that can be considered. {a}, {-a}, {a, b}, {a, -b}
etc. For all the incompatibilities with 1 element and two elements it’s important to give appropriate names. And for incompatibilities with more than two, I imagine it we can customize some patterns, like "only one positive" or "only one negative", otherwise fallback to asking people to enter raw incompatibilities.
What I’d like to avoid is saying ok now we enable {a, b} and {a, -b}, let’s see in the future if we need something else. Instead, it would make sense to just verify that with the past two years of optimizations we didn’t break a fundamental property that would prevent pubgrub from working with any incompatibility provided. And if ok, add these apis in a cohesive fashion.
Let me know if I’ve expressed my opinion better this time.
That makes a lot of sense. Let's get a full list of reasonable possibilities and name/document what they might be useful for before adding flexible APIs.
In office hours @konstin convinced me that avoiding the hash map construction is only an important performance optimization when the index data has been prefetched. If the data actually needs to be loaded from somewhere (whether that's network or disk) reading the data will be so much more expensive then constructing the map to make the optimization irrelevant. Ironically, the only place the optimization is visible is while doing benchmarking.
UV also rools its own resolve loop, and does not use dependency provider. so we agreed that the best next step was to change the API for State
and leave the API the same for dependency provider. we will discuss changes to dependency provider in follow-up PR's and after incorporating your holistic approach.
And while I was writing that comment it looks like the new revision is in.
To summarize, in the initial version of this PR we were changing DependencyProvider::get_dependencies
from returning a hashmap to return an impl IntoIterator<_>
to avoid the conversion cost from our dependencies Vec to the pubgrub's hashmap. But in uv, we're not actually using DependencyProvider
, we're driving our own solver loop and call add_incompatibility_from_dependencies
directly. In the merged version, we only change this internal api to take an iterator instead of a hashmap.
Personally, i don't think the current signature of DependencyProvider::get_dependencies
is fine. I'd prioritize releasing 0.3 and the new community engagement that comes with it. While the hashmap clone shows up in benchmarks, in practice you have to be far gone into microoptimizations for this to matter, i'm not even sure if it's noticeable in uv (and if you're that complex and that far, you want to run the solver loop yourself anyway, which is an entirely different interface). It would help a lot to get the changes we already made (associated type, better error handling, both the error type and the custom metadata, the massive speedups, etc.) out so we can get users reporting on how well the apis works for them.
Previously,
add_incompatibility_from_dependencies
was hardcoded to take a hash map. By relaxing this toimpl IntoIterator<_>
, we avoid converting from uv's orderedVec<(_, _)>
to aHashMap<_, _>
and avoid cloning.This intentionally does not change the public api of the
DependencyProvider
, only the internal interface that we are using in uv.