FR: Warn users of empty revsets when using multiple child/parent operators. (EDITED)

KingMob commented 1 month ago

Is your feature request related to a problem? Please describe. I ran into a surprising issue when I tried to log more revisions than existed in a repo. I ran something like jj log -r '@-------------------::@', and jj silently outputted nothing. Conceptually, I was thinking of this as "go back a dozen ancestors from @", and not "locate the dozenth ancestor of @, and give me its descendants down to @". I was expecting it to display "root():@" if I went too far.

Regardless, a silent failure is not ideal.

Describe the solution you'd like jj should issue a warning if the revset is invalid. Even better, if the revset can be properly adjusted to the nearest root/head, that would be great, though I haven't thought through all the cases to know if that's feasible.

Describe alternatives you've considered I mean, "root()::@" worked, but I didn't do that initially, because I assumed the repo history would be too long.

yuja commented 1 month ago

Maybe you know, but just to be clear, an empty expression (such as none()::) isn't invalid.

Anyway, the easiest workaround is to let jj log issue a warning if the resulting set is empty. I'm not sure if we want to see a warning for author(unknown_user), but it wouldn't be that annoying.

We can also add a revset function that asserts the size (e.g. non_empty(@---)), but explicit expression wouldn't help in this case.

Alternatively, we can make some expressions require non-empty set by default, and add an explicit opt-out. For example, none():: could be error by default, and user would have to write none()?:: to accept empty operands. Implementation-wise, this isn't simple, but a harder problem is to decide which operator/function should belong to this category.

KingMob commented 1 month ago

I'm not advocating for a warning if a revset's empty, just a warning if it's invalid (for some agreed-on definition of invalid).

To me, none():: and author(unknown_user) are always valid even if empty, because (1) an empty set has no descendants, and (2) an author search might return no results.

But if @------------------------ doesn't point to a rev, that seems "invalid" when empty. My intention here is usually "not enough history, show me more", and I expect adding another dash to show me more, not quietly exit, if that makes sense. When using @------------------------, I think an empty set should just be treated differently based on user intents.

KingMob commented 1 month ago

Alternatively, an "optimistic" child/parent operator might be nice, one that attempts to extend if possible, but simply stops expanding when it runs out of revs.

That's what I really want when I type something like @------------------------, I don't care about the precise 24th ancestor, I just want to go back a while in history so I can scan for something.

I could also reliably do something like main..@++++ without caring about where I left the @, because my intent is something like "show me the branch and anything immediately downstream of where I'm at". If I have to be careful that @++++ always matches, then I end up running log multiple times to see what I want.

bnjmnt4n commented 1 month ago

I'm not advocating for a warning if a revset's empty, just a warning if it's invalid (for some agreed-on definition of invalid).

I think it's hard to classify what is invalid in this case.

To me, none():: and author(unknown_user) are always valid even if empty, because (1) an empty set has no descendants, and (2) an author search might return no results.

@---... is in some sense similar to none():::

none():: is getting all the descendants of none()
@---... is sequentially getting the parents of each generation of parents before, up to @

If you're willing to accept that an empty set has no descendants, then the empty set should also have no parents/ancestors either.

Anyway, the easiest workaround is to let jj log issue a warning if the resulting set is empty. I'm not sure if we want to see a warning for author(unknown_user), but it wouldn't be that annoying.

I think this is probably the easiest workaround. Perhaps the warning wouldn't be too annoying.

KingMob commented 1 month ago

I think it's hard to classify what is invalid in this case.

Perhaps "invalid" is the wrong word to use here. In set theory terms you're correct, but I think the intents of how people use various operators matters, too.

Maybe I'm the only one, but I suspect many newcomers have similar mental models.

I'll update the title, if that helps.

yuja commented 1 month ago

Alternatively, an "optimistic" child/parent operator might be nice, one that attempts to extend if possible, but simply stops expanding when it runs out of revs.

"runs out" state is a bit fuzzy to define. x-/x+ can resolve to multiple revisions at merge/fork point, and one of them can short early by repeating x--... Tracking them independently wouldn't be worth the effort, but it seems weird if x------- is clamped by root() only when it gets empty.

yuja commented 1 month ago

fwiw, maybe you want to do jj log --limit N -r::@ instead of @------...::@?

KingMob commented 1 month ago

"runs out" state is a bit fuzzy to define. x-/x+ can resolve to multiple revisions at merge/fork point, and one of them can short early by repeating x--...

I'm not sure I see the issue. Can you give me an example?

it seems weird if x------- is clamped by root() only when it gets empty.

I'm not seeing this. By definition, the dash right before it becomes empty will always be root, yes? It's the last valid rev seen. I guess it just doesn't seem that weird to me.

For the children operator, it's different, since branches will be of different lengths. But that could be useful, too. You could say "show me up to 20 revs or less for each branch from this point". (I admit I don't need that, but someone might.)

fwiw, maybe you want to do jj log --limit N -r::@ instead of @------...::@?

That would work for ::@ (as you describe), but not necessarily for my optional main..@++++ example. It depends on the direction limit selects. (And even if it works here, there's a counterexample using the other direction where limit fails.)

Maybe clamping warrants new operators, but I think they'd be useful.

(They'd would break symmetry, but I don't think anyone who uses clamping variants will be surprised. E.g., with clamping, x---+++ might not always equal x, but that's acceptable.)

ilyagr commented 1 month ago

I think I'd be OK with considering root()- to be an exceptional operation that deserves a warning whenever it's evaluated. One can easily expect x----::x to be the same as ancestors(x,4) and be really surprised when they differ. I find it difficult to imagine a situation when somebody would evaluate root()- and mean it.

The warning could say something like:

Warning: the revset ... ended up computing root()- as an intermediate result. root()- resolves to the empty set.

This is not crucial, if people are really opposed to case-by-case solutions, but I think it'd make jj just a bit friendlier. It's also quite possible that this is more difficult to implement than it is worth, I haven't checked.

Having a log print something if the resulting set is empty seems orthogonal. It'd still be surprising why it's empty.

yuja commented 1 month ago

"runs out" state is a bit fuzzy to define. x-/x+ can resolve to multiple revisions at merge/fork point, and one of them can short early by repeating x--...

I'm not sure I see the issue. Can you give me an example?

it seems weird if x------- is clamped by root() only when it gets empty.

I'm not seeing this. By definition, the dash right before it becomes empty will always be root, yes? It's the last valid rev seen. I guess it just doesn't seem that weird to me.

I was wondering why degree == 0 is special (whereas degree > 1 isn't?) Let's consider the following history:

F
|\
| E
| D
B C
|/
A
|
root()

F-: B|E
F--: A|D
F---: root()|C

If each parent is tracked individually and individual emptiness is special cased (or root() had a hidden loop edge),

F----: root()|A

If empty set is special cased,

F----: A

For the children operator, it's different, since branches will be of different lengths. But that could be useful, too. You could say "show me up to 20 revs or less for each branch from this point". (I admit I don't need that, but someone might.)

fwiw, maybe you want to do jj log --limit N -r::@ instead of @------...::@?

That would work for ::@ (as you describe), but not necessarily for my optional main..@++++ example.

Maybe it's the job of latest() and hypothetical earliest() functions? They aren't equivalent to DAG operations, but the use case matches.

ilyagr commented 1 month ago

I was wondering why degree == 0 is special (whereas degree > 1 isn't?)

I didn't follow your conversation entirely, but degree >1 parents do not break the equivalence of x----::x and ancestors(x,4). To me, what seems special and a bit unnatural is the idea that root() has no parents.

You refer to something similar when you mention the possibility of root() having a "hidden loop edge", but loop edges break the invariant that, well, there are no loop edges in the commit graph.

From this perspective, the ideal model would be to have infinitely many root commits, all empty:

  A <- root() <- root1() <- root2() <- .....

(but actually implementing that seems like overkill to me. Maybe we can do it for some April Fools day.)

yuja commented 1 month ago

I think I'd be OK with considering root()- to be an exceptional operation that deserves a warning whenever it's evaluated. One can easily expect x----::x to be the same as ancestors(x,4) and be really surprised when they differ. I find it difficult to imagine a situation when somebody would evaluate root()- and mean it.

In range expression, clamped version might be useful. However, I would be surprised if jj show x---... returned clamped(N)-th parent.

Implementation-wise, adding a "checked" version of range operation (that rejects empty operands) is probably easier than checking if each sub-expression is empty to emit warning.

yuja commented 1 month ago

degree >1 parents do not break the equivalence of x----::x and ancestors(x,4). To me, what seems special and a bit unnatural is the idea that root() has no parents.

Only in range expression, right?

You refer to something similar when you mention the possibility of root() having a "hidden loop edge", but loop edges break the invariant that, well, there are no loop edges in the commit graph.

From this perspective, the ideal model would be to have infinitely many root commits, all empty:
  A <- root() <- root1() <- root2() <- .....

Correct. It's a better concept to model clamped children/parents.

ilyagr commented 1 month ago

Only in range expression, right?

In range expression, clamped version might be useful. However, I would be surprised if jj show x---... returned clamped(N)-th parent.

It's a better concept to model clamped children/parents.

I didn't quite understand what you mean, could you elaborate a bit?

Implementation-wise, adding a "checked" version of range operation (that rejects empty operands) is probably easier than checking if each sub-expression is empty to emit warning.

The idea of a "checked range operation" is interesting.

Just to make the parallel clear, what I suggested is essentially a "checked - operation" that checks if the operand is root(), mainly motivated by https://github.com/martinvonz/jj/issues/3821#issuecomment-2148766967.

... easier than checking if each sub-expression is empty to emit warning.

I don't quite follow the "checking if each sub-expression is empty to emit warning" idea, since empty expressions seem to make a lot of sense in some cases. I think you don't like it either, but let me know if there is something important I missed.

yuja commented 1 month ago

Only in range expression, right?

In range expression, clamped version might be useful. However, I would be surprised if jj show x---... returned clamped(N)-th parent.

Here I mean A|root() is different from root(), though(A|root()):: is equivalent to root():: as you said. And I wouldn't expect jj show @-- returns something if jj show @- is the root.

It's a better concept to model clamped children/parents.

I didn't quite understand what you mean, could you elaborate a bit?

I just mean your explanation is better than "hidden loop edge".

... easier than checking if each sub-expression is empty to emit warning.

I don't quite follow the "checking if each sub-expression is empty to emit warning" idea, since empty expressions seem to make a lot of sense in some cases. I think you don't like it either, but let me know if there is something important I missed.

We'll need to inspect sub expression x- in (x----::y)&.. and/or propagate warning from there. That would require more code than adding checked_dag_range(x, y)&.. that errors out.

arxanas commented 1 month ago

There's an intuitive interpretation that doesn't involve special "clamping" behavior:

fixed_point(f, x, n) =
  if n == 0
  then x
  else fixed_point(f, x | f(x), n - 1)

minus(x, n) = roots(fixed_point(parents, x, n))

Whereas it currently behaves like this:

iter(f, x, n) =
  if n == 0
  then x
  else iter(f, f(x), n - 1)

-(x, n) = iter(parents, x, n)

Observe that both f = minus and f = - satisfy this property:

forall x a b.
  a <= b implies
    ::f(x, a) >= ::f(x, b)

I believe it's some kind of homomorphism. The structure is not violated, which is good, but the operator is >= rather than <=, and the :: is on the "wrong" side, so it doesn't tell us much other than that both operators behave "sanely" in some sense.

Whereas only f = minus satisfies this property:

forall x a b.
  a <= b implies
    f(x, a):: <= f(x, b)::

This is a monotonic property, indicating that not only was the structure "preserved", but no information was "lost". (There must be a duality between homomorphic and monotonic operations that I'm not aware of?) Monotonic operators are quite common in declarative systems like these.

I believe minus is monotonic because it's the composition of two monotonic functions:

roots is monotonic according to both of the descendant relations above.
fixed_point "launders" any function to be monotonic according to the subset relation (more general than the descendant relation).

But - is not monotonic in the same way because parents is not monotonic (because it can "lose" parents eventually).

I think the question is: do people think that present-day x--- is the same as hypothetical minus(x, 3)?

At the very least, the decrement operator on the integers is certainly monotonic 😉. I'm guessing users might be surprised because they expect this operation to be monotonic in a more overt fashion.

It's also worth considering why root() exists to begin with. It ensures that the least upper bound of any two commits always exists, i.e. the commit graph is a "join-semilattice". Two questions:

Is it weird that we ensure that joins always exist but not meets (greatest lower bound)?
- I don't think so. In some sense, jj improves on the state of the art by ensuring that a meet can always be calculated, unlike Git.
- Possibly Pijul has worked it out so that the patch algebra has exactly one canonical merge for any set of patches, in which case they have actually achieved the full lattice structure.
- But perhaps we have to ask why + doesn't always produce meets when - can always produce joins.
- Of course, today, we already restrict all operations to some "visible" subset of the commit graph, so it's not overly weird.
Is it relevant to the monotonic discussion above?
- I think so. The join operation itself is a monotonic operation precisely because joins always exist.
- Furthermore, roots is a monotonic operation.
- So I suspect that there's a natural attempt to assume that many related operations are monotonic in some sense, like minus.

In my experience, many similar declarative systems operate in terms of fixed points and monotonicity, so "more" monotonic is generally better. (forall f g. monotonicity(f) < monotonicity(g) => goodness(f) < goodness(g)? 🤔)

KingMob commented 1 month ago

FWIW, I can confirm that my mental model of - prior to this issue was more akin to @arxanas's minus than the actual - behavior. I envisioned it as growing a set, when possible, and then taking the roots at the end.

yuja commented 1 month ago

I still think it's context dependent whether the "minus" or the current "-" behavior is preferred. It makes some sense that x----..:: grows to root():: (or all()), but I feel it's more natural that ::x-----.. shrinks to ::none(), not to ::root(). It's even weirder if x+++++..:: sticked to the headmost (non-virtual) revision.

KingMob commented 1 month ago

Regardless of what's ultimately decided on, I think the docs should clarify the exact behavior at the heads and root.

ilyagr commented 1 month ago

We'll need to inspect sub expression x- in (x----::y)&.. and/or propagate warning from there. That would require more code than adding checked_dag_range(x, y)&.. that errors out.

This is an important point; it makes sense that implementing a warning is more difficult than a hard error or the current behavior. IMO, in the ideal world, a warning would strike the right balance, but it's probably not worth the implementation difficulty (we already have a prost-based DSL for parsing; this would be a noticeable change to it).

Based on my previous comments, I'd be OK with root()- being a hard error in theory (since our commit graph model is actually a bit broken here; the whole point of adding root() is for every commit to have a parent, but we didn't finish the job), but in practice it would probably inconvenience more people than it helps.

I don't have a clear conclusion in mind about what, if anything, we should do at this moment.

ilyagr commented 1 month ago

@arxanas 's minus operation is curious. I'll illustrate it, since it took me a bit to understand:

C
| \
B |
|/
A
|
root()

would have minus(C,1) = roots(C|C-)=roots(C|B|A) = A, minus(C,2)=minus(minus(C,1),1)=minus(A,1) = roots(A|A-) = root() (the first equality needs proof, but seems true™️ , you can also compute it direction), and minus(C,3) = minus(root(), 1) = roots(root()|root()-) = roots(root()) = root().

What @arxanas called fixed_points(parents, x, n) is ancestors(x,n), so minus(x, n) = roots(ancestors(x,n)).

For better or worse, there seems to be no way to refer to B with this operation.

It's also worth considering why root() exists to begin with. It ensures that the least upper bound of any two commits always exists, i.e. the commit graph is a "join-semilattice".

I might be corrected, but TBH feel like it was a theoretically imperfect engineering decision. Things are easier to reason about if the working copy commit always has a parent. Then, commands jj squash don't need an extra codepath for there not being a parent (though they do need a codepath for the parent being immutable). Adding root() achieves that. root() itself cannot be the working copy, so no there's no need to bother with an infinite chain of root commits (https://github.com/martinvonz/jj/issues/3821#issuecomment-2148766967) for most purposes (our discussion here is an exception).

I'd say that having root() made less sense before we had the notion of an immutable commit other than root(), but I'd guess @martinvonz was planning ahead. He might also be able to share the actual story.

martinvonz commented 1 month ago

The reason for the root commit is so there's always a common ancestor. I copied the idea from mercurial (cut there it's called the null revision). It removes special cases. See e.g. git rebase --root and git checkout --orphan. Also, it pretty much revolves the need for bare repos because you can instead just check out the root commit (mercurial doesn't have bare repos either).

arxanas commented 1 month ago

but I feel it's more natural that ::x-----.. shrinks to ::none(), not to ::root(). It's even weirder if x+++++..:: sticked to the headmost (non-virtual) revision.

Conceptually, I personally don't find the behavior strange in those cases (probably because I think of most graph operations in terms of fixed-point iteration).
Practically, to decide whether to preserve or change the current -/+ behavior, we should find some common workflows that rely on producing empty sets in some cases. Can we list some such workflows?
- When using -/+ in range expressions, I expect that one basically never (?) wants the empty set behavior.
- For example, are people often running jj describe -r @+ with the intention that it should not edit any commit messages (or error out) if you're at a head node already? It would probably be strange to edit @ in those cases instead of editing nothing or erroring.

Also, it pretty much revolves the need for bare repos because you can instead just check out the root commit (mercurial doesn't have bare repos either).

In those cases, is the working-copy commit the root commit, or a new empty commit on top of it?

What @arxanas called fixed_points(parents, x, n) is ancestors(x,n), so minus(x, n) = roots(ancestors(x,n)).

This is correct (and I actually typically mentally model ancestors to be a fixed-point iteration of parents).

I guess one potentially unexpected behavior is when x has multiple connected components, like a::b | e::f, where a::f is a stick, then roots(ancestors(x)) returns only a, rather than a | e. But I actually frequently use roots(ancestors(x)) (and heads(descendants(x))) to the point where I wonder if it should have its own shorthand function/operator.

For better or worse, there seems to be no way to refer to B with this operation.

That's true, but there's also no way to refer to just B using parents(A), either. You need to use some nuanced individual-parent operation for both if you cared about addressing specifically B.

ilyagr commented 1 month ago

For better or worse, there seems to be no way to refer to B with this operation.

That's true, but there's also no way to refer to just B using parents(A), either.

Well, one invariant this breaks is that ::X = X | X- | X-- | ... but is not necessarily equal to X | minus(X,1) | minux(X,2) | ....

I should say, I do find minus interesting to consider, but not to the point where I'd actually start implementing it; roots(ancestors(..) seems to me like an appropriate name for it.

In theory, I'd argue that - is perfectly fine, but the way we completed the graph of actual commits to real_commits + root() is incomplete in that it kinda suggests that you should be able to rely on every commit having a parent (e.g. the working copy always has a parent, even in an empty repo), but doesn't finish the job. In practical terms, I don't know what I'd argue; Yuya's #3838 might be fine for now and we can see how people feel later.

ilyagr commented 1 month ago

. But I actually frequently use roots(ancestors(x)) (and heads(descendants(x))) to the point where I wonder if it should have its own shorthand function/operator.

I don't, but maybe I should? When do you tend to use them?

It's quite easy to make aliases like ra() or hd() for them. I wonder whether I'd use them...

arxanas commented 1 month ago

Well, one invariant this breaks is that ::X = X | X- | X-- | ... but is not necessarily equal to X | minus(X,1) | minux(X,2) | ....

It's worth pointing out for the first case that it's not just an invariant: the typical definition of the "ancestors" function is the fixed point of the "parents" function, so it's more so true by definition rather than being an incidentally-nice property.

It was perhaps a mistake for me to call the above function fixed_point instead of something like fixed_point_iter, which might have been more accurate, since the returned value is (usually) not a fixed point of the provided function.
I don't recall if typical presentations speak of a fixed point of a function "with respect to" a certain value (like the fixed point of parents with respect to x). If not, then to be fully precise for discussion henceforth, I guess I would say that ancestors(x) is defined to be the least fixed-point of parents for which x is a subset (and likewise for descendants and children).

I don't, but maybe I should? When do you tend to use them?

I'll have to make a note when I do it next. Usually, I think it's because I want to reconnect non-adjacent components somehow.

ilyagr commented 1 month ago

It was perhaps a mistake for me to call the above function fixed_point instead of something like fixed_point_iter, which might have been more accurate, since the returned value is (usually) not a fixed point of the provided function.

I would say that ancestors(x) is defined to be the least fixed-point of parents for which x is a subset (and likewise for descendants and children).

I agree, "fixed point" is confusing here. I don't think the operation of taking parents will ever have meaningful fixpoints, those would be loops in the graph.

parents(ancestors(x)) will not include x, so parents(ancestors(x)) != ancestors(x). For me, this closes the book on whether ancestors are a fixed point of parents.

On second thought, if you define f(x) = x | parents(x) (AKA x | x-), then ancestors are a fixed point of f. I wouldn't call that a fixed point of parents, but that's what you set up. I think if you included more discussion of this function f, it would have been clearer. But I still don't see much advantage of defining ancestors that way, it reduces to the normal definition ancestors(x) = x | x- | x-- | ... that is clear enough. You'd have to argue why this function f is important.

Update: I should have said "ancestors(a) are a fixed point of f for any a". f here is a function on sets of commits.

I'll have to make a note when I do it next. Usually, I think it's because I want to reconnect non-adjacent components somehow.

Interesting!

arxanas commented 1 month ago

I had to jog my memory on what exactly the theory behind "ancestors being the least fixed point of parents" was, and I regret to say that ancestors is indeed the least fixed point of parents in a certain typical logic programming way, which I believe here relies on several levels of abstraction:

the partial ordering of nodes that constitutes the commit graph
the partial ordering of sets
the partial ordering of relations
the composition of relations

Relations themselves can be ordered and can be fixed points.

What I had meant was "the is_ancestor relation is the least fixed point of the is_parent relation", which has a direct equivalent in terms of our ancestors and parents revset functions because they're supposed to represent those relations — although functions in general are not partially-ordered or monotonic under composition like relations are.

The is_ancestor relation manages to be a fixed point "of" the is_parent relation despite the fact that is_parent relation is not a fixed point at all, like you more or less noted. It might be better to say that it's the smallest fixed point "greater than or equal to" is_parent, rather than it being a fixed point "of" is_parent. (I double-checked and it seems that "of" is really the standard terminology here.)

There is also a notion of revsets being "fixed points" for revset functions, rather than revset functions themselves somehow inherently being fixed points without respect to any particular value. This is the intuitive notion discussed already in this thread, which works for many other domains too (like normal functions of numbers).

But it's getting complicated, so I'll have to expand in different thread.

ilyagr commented 1 month ago

"the is_ancestor relation is the least fixed point of the is_parent relation",

is_ancestor might be a "transitive completion" of is_parent in the sense that it's the smallest relation such that:

is_parent(a,b) implies is_ancestor(a,b)
If is_ancestor(a,b) and is_ancestor(b,c), then is_ancestor(a,c), AKA "is_ancestor is transitive".

In this sense, is_ancestor is a fixed point of "transitive completion", where "transitive completion" is considered as a function from the set of relations to itself. Any transitive relation would be.

But I'm just guessing now. I'm not used to talking of things in terms of fixed points.

martinvonz / jj

FR: Warn users of empty revsets when using multiple child/parent operators. (EDITED) #3821