martinvonz / jj

A Git-compatible VCS that is both simple and powerful
https://martinvonz.github.io/jj/
Apache License 2.0
8.05k stars, 263 forks

FR: Topics (alternative to branches) #3402

Open noahmayr opened 4 months ago

noahmayr commented 4 months ago

Is your feature request related to a problem? Please describe.

There have been several discussions, both on Discord and spread across GitHub issue and discussion comments (e.g. https://github.com/martinvonz/jj/discussions/2425#discussioncomment-7376935), about a potential jj feature called "topics". They are somewhat related to the same concept in Mercurial, but their exact implementation and behavior in jj have yet to be decided.

According to a post linked by @arxanas below, there's a disconnect between the most dominant mental model of branches (1) and how they actually work in git/jj (3).

Having an implementation that's actually associated with all changes of a ~branch~ topic instead of just the head could help bridge that disconnect.

Describe the solution you'd like

As of now, jj has branches which, while having the same name as git branches, do not behave the same. As jj has no notion of an "active" or "checked out" branch, the head of the branch is not automatically advanced to new commits (see #2338).

The core difference between a topic and a branch is that branches only ever point to the last revision on that branch while topics would be more like a marker/metadata on the revision.

The current consensus is that topics would be "infectious", meaning new revisions descending from a topic's revision automatically become part of that topic as well.

They would also likely be jj's model for integrating with git, while the existing branches could be renamed to bookmarks.
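To make "infectious" concrete, here is a minimal Python model of the idea. It is only a sketch, not jj code: the `Repo` class, its `new` method, and the `topic` / `no_topic` parameters are hypothetical names chosen to mirror the flags proposed under "Potential usage", and it assumes a revision can carry several topics.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    rev_id: str
    parents: list[str]
    topics: frozenset[str] = frozenset()


class Repo:
    """Toy model: topics are metadata stored on every revision."""

    def __init__(self) -> None:
        self.revisions: dict[str, Revision] = {}

    def new(self, rev_id, parents=(), topic=None, no_topic=False):
        """Create a revision; by default it inherits its parents' topics."""
        if topic is not None:
            # Hypothetical equivalent of `jj new --topic my_topic`.
            topics = frozenset({topic})
        elif no_topic:
            # Hypothetical equivalent of `jj new --no-topic`.
            topics = frozenset()
        else:
            # "Infectious": take the union of the topics of all parents.
            topics = frozenset().union(
                *(self.revisions[p].topics for p in parents)
            )
        rev = Revision(rev_id, list(parents), topics)
        self.revisions[rev_id] = rev
        return rev


repo = Repo()
repo.new("a")
repo.new("b", ["a"], topic="my_topic")
repo.new("c", ["b"])                 # inherits my_topic from b
repo.new("d", ["c"], no_topic=True)  # opts out of inheritance
```

In this model a branch-like "head" never has to be advanced by hand: every descendant of `b` carries `my_topic` unless it explicitly opts out.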

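However, there are still open questions:

  • Can revisions belong to more than one topic?
  • Can revisions not belong to any topic, or would they belong to a special unnamed topic?
  • What revisions can be part of the same topic?
  • What workflows can we enable with topics that we would not be able to with branches?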

Potential usage:

  1. Basic CRUD to relate revisions to topics:
    • jj topic set my_topic -r 'trunk()..@' to set the topic my_topic on those revisions (removing all other topics currently associated with them)
    • jj topic add my_topic -r 'trunk()..@' to add the topic my_topic on those revisions (keeping all other topics currently associated with them, this assumes that revisions can have more than one topic)
    • jj topic clear -r '@' clears the revision's topics
    • jj topic remove my_topic -r '@' removes a single topic from a revision (assumes multiple topics per revision)
  2. Integration into existing commands:
    • jj new --topic my_topic to start a new revision that is only associated with my_topic
    • jj new --no-topic to not inherit the ancestors' topics
    • jj git fetch could map remote git branches to topics, starting with trunk() and then marking the remaining revisions trunk()..<branch head> with a topic named after the remote's branch.
    • jj git push --topic to push the topic as a branch. Depending on other flags/configuration, this could either try to create as few branches as possible for code review or create a single branch for every revision (kind of like graphite.dev's stacked reviews). When creating multiple branches, instead of naming them push-<change_id> (like jj git push --change), we would use names based on the topic, <topic>-<change_id> (see #1415)
    • jj log -r 'topics([...])' and jj log -r my_topic to show revisions of that topic
    • jj abandon --topic my_topic drops all revisions related to that topic
  3. Future commands for managing PRs from jj
    • jj github push --topic / jj gitlab push --topic or, my preferred variant, jj submit --topic, which would automatically create the necessary branches and PRs on the forge you use. Ideally, managing branches in jj would not be necessary at all. (see #485)
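The difference between the four proposed CRUD subcommands is easiest to see on a small in-memory model. The following Python sketch is only an illustration of the intended semantics, assuming multiple topics per revision; the `topics_by_rev` store and the function names are hypothetical and not part of jj.

```python
from collections import defaultdict

# Hypothetical store: revision id -> set of topic names.
topics_by_rev = defaultdict(set)


def topic_set(name, revs):
    """Like the proposed `jj topic set`: replace all topics with `name`."""
    for rev in revs:
        topics_by_rev[rev] = {name}


def topic_add(name, revs):
    """Like the proposed `jj topic add`: keep existing topics, add `name`."""
    for rev in revs:
        topics_by_rev[rev].add(name)


def topic_clear(revs):
    """Like the proposed `jj topic clear`: drop every topic."""
    for rev in revs:
        topics_by_rev[rev] = set()


def topic_remove(name, revs):
    """Like the proposed `jj topic remove`: drop only `name`, if present."""
    for rev in revs:
        topics_by_rev[rev].discard(name)


topic_set("my_topic", ["x", "y"])
topic_add("other", ["y"])         # y now has both topics
topic_remove("my_topic", ["y"])   # y keeps only "other"
topic_clear(["x"])                # x has no topics left
```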

Describe alternatives you've considered

For interop with the git world, #2338 would be an alternative for working with branches more effectively, possibly with further changes down the line that make jj branches work more like git branches. However, I think topics could provide a more idiomatic jj approach while still providing great interop with branches.

Additional context

This is meant more as a meta issue tracking progress across several different aspects of how topics would integrate into jj. Based on feedback, additional use cases might be added or the current ones may be refined further. If specific issues are opened for any individual use case, those will also be linked.

Discord discussion

khionu commented 4 months ago

Could we rename the issue to have a little more context? Maybe "alternative branch story: topics"

icxc12 commented 4 months ago

Recently also looked into jj new --branch br and found the discord discussion to be very helpful. Was also thinking revsets mapped nicely onto branches and was happy to be redirected here (thanks Ilya/Noah).

Here are some thoughts on your open questions:

Can revisions belong to more than one topic:

Think this is pretty useful. For example, if you start to work on a topic, then switch to a new topic based on work from the previous topic, probably want the original revision in both topics.

Can revisions not belong to any topic, or would they belong to a special unnamed topic:

~Feel that having a model that diverges from working copy is fine here -- simply because do not always want to be working on/thinking about topics -- only want to use when it is relevant.~ Edit: This attempted to convey what @necauqua says below, but it’s conveyed better there. Just read that instead.

What revisions can be part of the same topic

Would be curious to hear what others think on this. Do think that there is another option that a topic can include a section of the revset dag (i.e. not just flow in one direction). This would fit nicely with Ilya's suggestion (within the current branching workflow) to be able to do jj new --branch br both "up" and "down" for existing commit id prefix br.

What workflows can we enable with topics, that we would not be able to with branches

The big one for me is just not having to keep track of names of prefixes. Also, an added benefit of using revsets is you would get all the benefits of revsets in topics (which you currently do not get in branches). Do think it should not attempt to adhere to Mercurial's topic extensions (for example, basing topics on branches as opposed to viewing topics as a branch alternative) in a way that would compromise git branch interop.

necauqua commented 4 months ago

Would be curious to hear what others think on this.

I am strongly on the side of unconstrained topics. Git interop could be dealt with, but topics just being a list of string 'tags' (not git tags) on each commit in jj metadata is both simple and powerful imo.

Opposed to git branches, which are defined as a pointer to a head from which you manually walk back to root to have an idea what the branch includes - it's hard to quantify, but topics feel like they fit better with the jj model, and the infectiousness fixes the issue with branches not advancing, while just making the branches advance feels like a step back for some reason.

My answers to other questions:

Can revisions belong to more than one topic

Yes 🤷

Can revisions not belong to any topic

Yes 🤷 Actually, this one is simple. Say there is this special unnamed topic. There are two ways it could be done - all revisions have it, or all revisions without any other topics have it. The first one is useless, it's just all(), I only needed to clarify that out of pedantry, the question was about the second one. Whose only purpose can be, I think, to have a way to find revisions that have no other topics - but that could be just a revset function, no need to implement an additional concept that's actually pretty weird if you think about it (a transient pseudotopic that exists when the list of topics is empty, and doesn't when it's not empty).

What workflows can we enable with topics

It's a nice fix to the branches not advancing issue, and topics can be disjointed (if they were limited then I'd not see them as much different from branches, it'd be more of a "rename it so it sounds exciting" thing then). Again, thinking about them as each commit having a list of string tags (not git tags) attached enables arbitrary tagging setups to be invented by people.

matts1 commented 4 months ago

FWIW, I strongly agree with the use case of topics. A while back, I joined a session where some people were curious about jj and explained it to them, and the biggest feedback that I got was "why does jj punish me for attempting to use my git workflows" (WRT there being no active branch).

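However, I think the problem is that different people want different things, and I think we need to acknowledge that no-one is necessarily wrong. One thing we may want to consider is to, rather than prescribing our own opinions upon the user, making topics themselves configurable (but have a reasonable set of defaults). For example:

  • When you create a new commit, does it:

    • Stay on the old commit (on deletion: do nothing)
    • Get copied to the new commit (on deletion: do nothing)
    • Move to the new commit (on deletion: move to parent)
  • Is a topic unique (not valid for the "copied to new commit" mode)

I think that the biggest problem with an approach like the one I just described will be conveying that to the user. With the things above, there are 5 different configurations you could create for a given topic. I can see potential value (with different use cases) for several of them. For example:

  • Unique, don't move: See FR: Convenient names for changes #3482 - This is useful to create aliases for given commits. It's also useful to associate with a gerrit commit, for example (crrev.com/c/123)
  • Non-unique, don't move: Arbitrary tags you could apply to commits. I've seen requests for this so that you could come up with a tag that you can exclude from the default revset, for example
  • Copy: See other people's comments in this PR
  • Move, unique: This is good for anyone who wants to replicate the design of git branches. This is precisely what the people I got feedback from wanted.
  • Move, not unique: Can't think of any use cases off the top of my head.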

I think that even if we don't make topics themselves configurable, we should at the very least make it configurable on the backend level, so that when someone wants another one of these things, the work is then trivial.

PhilipMetzger commented 4 months ago

I very much agree with @necauqua's assessment of topics and consider them pretty much additional metadata on a commit.

However, I think the problem is that different people want different things, and I think we need to acknowledge that no-one is necessarily wrong. One thing we may want to consider is to, rather than prescribing our own opinions upon the user, making topics themselves configurable (but have a reasonable set of defaults). For example:

  • When you create a new commit, does it:

    • Stay on the old commit (on deletion: do nothing)
    • Get copied to the new commit (on deletion: do nothing)
    • Move to the new commit (on deletion: move to parent)
  • Is a topic unique (not valid for the "copied to new commit" mode)

I think that the biggest problem with an approach like the one I just described will be conveying that to the user. With the things above, there are 5 different configurations you could create for a given topic. I can see potential value (with different use cases) for several of them. For example:

  • Unique, don't move: See FR: Convenient names for changes #3482 - This is useful to create aliases for given commits. It's also useful to associate with a gerrit commit, for example (crrev.com/c/123)
  • Non-unique, don't move: Arbitrary tags you could apply to commits. I've seen requests for this so that you could come up with a tag that you can exclude from the default revset, for example
  • Copy: See other people's comments in this PR
  • Move, unique: This is good for anyone who wants to replicate the design of git branches. This is precisely what the people I got feedback from wanted.
  • Move, not unique: Can't think of any use cases off the top of my head.
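The count of 5 configurations follows from the two axes in the quoted list: three behaviors on commit creation times two uniqueness settings, minus the one combination that is ruled out. A small Python sketch of that enumeration, with hypothetical enum and function names:

```python
from enum import Enum
from itertools import product


class OnNewCommit(Enum):
    STAY = "stay on the old commit"
    COPY = "get copied to the new commit"
    MOVE = "move to the new commit"


def is_valid(mode: OnNewCommit, unique: bool) -> bool:
    # A unique topic marks a single commit, so copying it to a child
    # would immediately violate uniqueness.
    return not (mode is OnNewCommit.COPY and unique)


configs = [
    (mode, unique)
    for mode, unique in product(OnNewCommit, (True, False))
    if is_valid(mode, unique)
]

for mode, unique in configs:
    print(f"{mode.name:4} unique={unique}")
```

In this model, "Move, unique" corresponds to git-style branches, and "Copy" (necessarily non-unique) corresponds to the infectious topics described in the issue.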

So supporting these use-cases should be trivial if we allow arbitrary metadata on commits, which probably should be a separate feature from topics, which use a subset of the metadata to create "virtual branches".

icxc12 commented 4 months ago

if we allow arbitrary metadata on commits

Was thinking about this as well because branches are currently a HashMap<String, RefTarget>, where RefTarget is effectively a CommitId. Have you given any thought as to where you might want to keep metadata (the commit struct in backend seems like an option, but saying this as someone who is still very new here)?

necauqua commented 4 months ago

We have jj-only commit metadata storage for change ids and a list of predecessors, maybe other things I'm not remembering - seems obvious to just chuck a topics: Vec<String> field there


Also by the way operation objects actually do contain tags: HashMap<String, String> for arbitrary metadata. Currently those are only used to store command args to be shown in the oplog.

Although when I used them to mark snapshot operations Martin refactored that into a separate field - so I guess generic tags thing is not even needed as you could always just add a field directly.
Ok forget that, I think those could be useful for custom backends to do custom stuff without changing the upstream storage format.

Anyway my point is that actually implementing the "list of string (non git) tags on every commit" metadata thing is like super easy actually. And then have commands to CRUD them, revset functions to query them, and maybe something about indexing that I never looked into for "querying them" to be fast (that last part prooobably the hardest? 🙃).

The harder part is arguing about the design here :) Like I actually think a world where jj has no branches but topics (which are truly a jj concept as we've described above) map to one/multiple git branches with some rules is very interesting.
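As a rough illustration of the "list of strings on every commit" idea plus the indexing concern, here is a Python sketch. It is not jj's storage format (jj is written in Rust); `CommitMeta`, its fields, and `build_topic_index` are assumed names, and the index is a plain inverted map from topic to commit ids.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CommitMeta:
    """Hypothetical jj-only metadata stored next to each commit."""
    change_id: str
    predecessors: list[str] = field(default_factory=list)
    topics: list[str] = field(default_factory=list)  # the proposed field


def build_topic_index(commits: dict[str, CommitMeta]) -> dict[str, set[str]]:
    """Invert commit -> topics into topic -> commit ids for fast lookup."""
    index: dict[str, set[str]] = defaultdict(set)
    for commit_id, meta in commits.items():
        for topic in meta.topics:
            index[topic].add(commit_id)
    return dict(index)


commits = {
    "c1": CommitMeta("chg1", topics=["my_topic"]),
    "c2": CommitMeta("chg2", topics=["my_topic", "review"]),
    "c3": CommitMeta("chg3"),
}
index = build_topic_index(commits)
```

A revset function for topics could then answer a query with one dictionary lookup instead of scanning every commit, at the cost of keeping the index up to date whenever topics change.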

icxc12 commented 4 months ago

Anyway my point is that actually implementing the "list of string (non git) tags on every commit" metadata thing is like super easy actually.

Thanks this is helpful. Also provides incentive to look into operations more thoroughly.

map to one/multiple git branches with some rules is very interesting

This is the part that still confuses me. Can you explain a bit more at the design level how git interop should work with ~topics~ “unconstrained” topics?

Edit: was specifically interested in interop with “unconstrained” topics.

necauqua commented 4 months ago

The simplest thing would be to only export those topics that do follow the constraints, and for others log hints based on some heuristics or something.

If there's a config switch to flip those hints into hard errors - well that just made topics constrained :)

Another approach is this - given a set of commits that are marked by some topic, export every head (that is, a commit that's not a parent of any other commit in the set) as a separate branch. For topics that follow the rules this means a single commit will be marked with a branch, and for various disjointed/non-standard ones we could log hints and export multiple branches with some name pattern. Or, again, a config switch that just makes it so that if there's multiple heads we don't export anything or get a hard error - basically turning this into option 1/constrained.

Both of these approaches have been mentioned in discussions here/on discord.
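The second approach (export every head of a topic as a branch) can be sketched in a few lines of Python. This is only a model under stated assumptions: `parents` maps a commit id to its parent ids, and the `<topic>-<head>` naming for the multi-head case is a made-up pattern, since the thread leaves the name pattern open.

```python
def topic_heads(parents, topic_commits):
    """Commits in the topic that are not a parent of any other topic commit."""
    members = set(topic_commits)
    parents_within = {p for c in members for p in parents.get(c, ())}
    return members - parents_within


def export_branches(topic, parents, topic_commits):
    """Map a topic to git branch names; one branch per head."""
    heads = sorted(topic_heads(parents, topic_commits))
    if len(heads) == 1:
        # Well-behaved topic: a single branch named after the topic.
        return {topic: heads[0]}
    # Disjointed topic: hypothetical "<topic>-<head>" naming.
    return {f"{topic}-{head}": head for head in heads}


# a - b - c
#  \
#   x - y
parents = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"], "y": ["x"]}

linear = export_branches("feat", parents, {"b", "c"})
forked = export_branches("feat", parents, {"c", "y"})
```

Because the definition only looks at direct parents, a topic with a gap in it (for example `a` and `c` without `b`) also yields two heads, which is the kind of case where logging a hint would make sense.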


One thing the above does not mention is importing - say git has some branches (e.g. fetched from a remote) and we want branchless-jj-with-topics to see those as topics.

There are two similar approaches I see here:

Or maybe we can mark a single commit that the branch points to. That actually does work too: with the above export method (exporting the heads specifically) it's kind of idempotent? And then when you jj new that commit, the topic gets expanded to the child, effectively advancing the branch, which was the point. And say some commits were added to the git branch on the remote and you fetch - if the topic already existed, I guess you can mark all the commits "between" those that were already marked and the newly pointed-to one.
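One way to picture the import side is to mark everything in `trunk()..<branch head>` with the branch's name, as suggested for jj git fetch in the issue description. The Python sketch below is an assumption-laden model, not jj behavior: `parents` maps commit ids to parent ids, `topics` maps commit ids to topic sets, and `import_branch` is a hypothetical helper.

```python
def ancestors(parents, start):
    """Return `start` and all of its ancestors."""
    seen, stack = set(), [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


def import_branch(parents, topics, branch, head, trunk):
    """Mark every commit in trunk..head with a topic named after the branch."""
    for commit in ancestors(parents, head) - ancestors(parents, trunk):
        topics.setdefault(commit, set()).add(branch)


# t1 - t2 (trunk) - f1 - f2
parents = {"t1": [], "t2": ["t1"], "f1": ["t2"], "f2": ["f1"]}
topics = {}

import_branch(parents, topics, "feature", head="f1", trunk="t2")
after_first_fetch = {c: set(t) for c, t in topics.items()}

# The remote branch advanced to f2; importing again only adds the new commit.
import_branch(parents, topics, "feature", head="f2", trunk="t2")
```

Re-running the import with an unchanged head changes nothing, which matches the "kind of idempotent" intuition above.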

joyously commented 4 months ago

a potential jj feature called "topics". They are somewhat related to same concept in mercurial, but their exact implementation and behavior in jj have yet to be decided.

This is supposed to explain the problem to solve, but it doesn't. Can you expand on the problem definition without referring to a VCS?

cdmistman commented 4 months ago

i'm personally interested in topics, and think that there can be some really neat tooling that's compatible with stacked diffs via this behavior specifically:

Non-unique, don't move: Arbitrary tags you could apply to commits. I've seen requests for this so that you could come up with a tag that you can exclude from the default revset, for example

imagine a command jj github pr create <topic>. this command creates a git branch of the same name as the topic, duplicates each commit in the topic, and rebases/merges them to be on top of each other within the git branch. jj github pr update <topic> might then perform a 'restack' (inserting/updating any git commits in the branch as necessary) before resubmitting the branch to the remote.

maybe this is better as 3rd party tooling, but nonetheless such behavior is unblocked by topics - just sharing my 2 cents on how this might improve my own workflow

arxanas commented 4 months ago

Pasting my comment from https://github.com/martinvonz/jj/issues/3505#issuecomment-2067768197 as I think it's also relevant to this discussion (particularly that I think Git branches satisfy multiple disparate workflows — we should consider how topics address those workflows):


We could consider this from the perspective of how topics intuitively work (/should work), and port the behavior to branches somehow (or change the jj model, use topics natively, and import/export branches somehow).

The confusing cases from the implementation perspective are when multiple branches point to the same commit, which doesn't exactly have a topic analogue.

I would say those cases are the exception. In such cases, branches don't implement the "feature branching" model — they implement something else that we should consider entirely separately. I think there are two main cases:

When you consider the sliding behavior for the feature branch workflow only, it's clear that it doesn't really add value by itself; it's a hack to work around the lack of principled feature branch tracking available in Git.

arxanas commented 4 months ago

To motivate "topics" more, as @joyously points out that there's not much detail in the thread, here's a poll (@jvns 2024-01-06):

poll: how do you think about git branches? (I'll put an image in a reply with pictures for the 3 options)

as with all of these polls obviously all 3 are valid, I'm curious which one feels the most true to you

  • (59%) 1. just the commits that "branch" off
  • (22%) 2. the history of every previous commit
  • (16%) 3. just the commit at the end ("branch = pointer")
  • (3%) other / show results

· 1,966 people · Closed

Notably, a majority of people don't think of branches in terms of how they're actually implemented. This leads to impedance mismatches in some workflows when users try to rely on Git to infer the commits that "belong" to a branch, when it turns out that the concept is not always usefully defined.

For example, there is no way in stock Git to rebase only the commits in a single branch in a stack: with git rebase, you have to either explicitly define the start of the range to rebase (i.e. look up the "parent" branch manually and provide that) or use the implicit default (calculate the merge-base and use that as the start of the range).

Topics are a possible solution that actually matches the typical user's mental model and workflows.
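The three poll options correspond to three different sets of commits, which a tiny Python model makes explicit. This is only an illustration: the graph and names are invented, and option 1 is computed relative to an assumed base branch `main`, which is exactly the piece of information Git does not record.

```python
def ancestors(parents, start):
    """Return `start` and all of its ancestors."""
    seen, stack = set(), [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


# m1 - m2 - m3   (main points at m3)
#        \
#         f1 - f2   (feature points at f2)
parents = {"m1": [], "m2": ["m1"], "m3": ["m2"], "f1": ["m2"], "f2": ["f1"]}
main, feature = "m3", "f2"

# Option 3: the branch is just the commit it points to.
model_3 = {feature}
# Option 2: the branch is the whole history reachable from that commit.
model_2 = ancestors(parents, feature)
# Option 1: only the commits that "branch off" from the assumed base.
model_1 = model_2 - ancestors(parents, main)
```

With topics stored on each revision, the set in option 1 would be recorded directly rather than derived from a guess about which branch is the base.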

I'll also suggest that having a "currently-checked-out branch" is one more piece of global contextual state that the user has to keep in mind. It might be that there's a pleasant solution to reduce that complexity (but I'm not sure if topics provide it or not).

joyously commented 3 months ago

Notably, a majority of people don't think of branches in terms of how they're actually implemented.

I think this is a good thing since implementation details can change.

I'll also suggest that having a "currently-checked-out branch" is one more piece of global contextual state that the user has to keep in mind.

Very good point. I'm all for reducing the cognitive load on the human. What I was asking for earlier, though, is more like what drives the fundamental goal of jj: thinking in terms of "changes", not commits. What is the actual problem?

If you define what is needed, without regard to the storage backend, then you can move on to how to implement it. From what I've read here, it sounds like everyone wants to label commits. Can you please explain why and the workflow that the label helps, in the context of "changes"?

joyously commented 3 months ago

Someone just posted a link to Dendron (elsewhere). When I looked at it, I wondered if the concept is right for jj. One of the testimonials says:

"Hierarchies [are] such a game changer. I just couldn't organize myself with a flat structure and ended up spreading notes out. The key thing that changed my perspective was the ease of restructuring hierarchies, it gives you the confidence to just write, safe in the knowledge that you can just restructure it later"