Open martinvonz opened 2 years ago
I have been thinking about this a bit. One problem is how to handle working-copy commits.
We could create a new working-copy commit in each submodule when you check out a commit in the superproject. However, there are at least two problems with that. One is that it would make all submodules seem dirty because they point to a new commit, unless we somehow skip an empty working-copy commit when we report to the superproject what the submodule's commit is. Another problem is that we would have to scan the whole tree for submodules. That's particularly bad if you have a monorepo with millions of directories to walk; we don't want to do that every time you check out a commit.
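To make the "skip an empty working-copy commit" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `Commit`, `is_empty`, and `commit_reported_to_superproject` are hypothetical names, not jj APIs, and a commit is treated as empty when its tree matches its parent's tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """Hypothetical, simplified commit model (not jj's real data structure)."""
    commit_id: str
    tree_id: str                      # identifies the file contents of the commit
    parent: Optional["Commit"] = None


def is_empty(commit: Commit) -> bool:
    """A commit is empty when its tree is identical to its parent's tree."""
    return commit.parent is not None and commit.tree_id == commit.parent.tree_id


def commit_reported_to_superproject(working_copy: Commit) -> Commit:
    """Skip an empty working-copy commit so the submodule does not look dirty."""
    if is_empty(working_copy):
        return working_copy.parent
    return working_copy


pinned = Commit("aaa", tree_id="t1")                     # commit the superproject points at
fresh_wc = Commit("bbb", tree_id="t1", parent=pinned)    # nothing modified yet
edited_wc = Commit("ccc", tree_id="t2", parent=pinned)   # a file was changed
```

With this rule, `fresh_wc` is reported as `pinned`, so the submodule looks unchanged, while `edited_wc` is reported as itself. It does not address the second problem, the cost of scanning the tree for submodules.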
Note that this problem is similar to what the Git project wants to do when you check out a branch. I think their plan is to check out the same branch in each submodule. The problem is smaller there because they presumably only need to do that for submodules that are within the sparse checkout, so that limits the number of directories to walk.
Another option is to somehow lazily create the new working-copy commits in the submodules only if you modify any files. Maybe we would also want to do it if you run jj log inside of them? If we do, then we'd probably still want to ignore those commits when we report the commit to the superproject so it doesn't seem like the submodule has been updated just because you ran jj log inside it.
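A rough sketch of the lazy variant, again in Python and purely illustrative: `SubmoduleState` and its methods are made-up names, the commit ID format is invented, and the sketch assumes that a superproject checkout simply discards any previous submodule working-copy state.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SubmoduleState:
    """Hypothetical per-submodule state; not part of jj."""
    pinned_commit: str                          # commit recorded by the superproject
    working_copy_commit: Optional[str] = None   # created lazily, on first modification
    modified_files: Dict[str, str] = field(default_factory=dict)

    def check_out(self, commit_id: str) -> None:
        """Superproject checkout: record the target, create no new commit."""
        self.pinned_commit = commit_id
        self.working_copy_commit = None
        self.modified_files.clear()   # simplification: pending changes are dropped

    def write_file(self, path: str, contents: str) -> None:
        """The first modification triggers creation of the working-copy commit."""
        if self.working_copy_commit is None:
            self.working_copy_commit = f"wc-of-{self.pinned_commit}"  # invented ID scheme
        self.modified_files[path] = contents

    def reported_commit(self) -> str:
        """What the superproject sees as the submodule's commit."""
        return self.working_copy_commit or self.pinned_commit
```

In this model a checkout costs nothing per submodule, and an untouched submodule keeps reporting its pinned commit. Whether running jj log inside a submodule should also trigger creation is left open, as in the comment above.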
I'm happy to hear if anyone has any thoughts (@chooglen?).
We could create a new working-copy commit in each submodule when you check out a commit in the superproject.
If we're thinking of the superproject and submodules as separate repos that can be interacted with mostly independent of each other (aka the Git model), this seems like the simplest, most intuitive behavior.
But in Git, that extra flexibility in the submodule ends up introducing a lot of unexpected behavior because it lets you meddle with the submodule without knowledge of the superproject and unintentionally break things, e.g. how would you prevent git -C submodule gc from GC-ing commits that the superproject needs? I'd be in favor of drastically limiting what you can do in a submodule in exchange for safety and simplicity.
One is that it would make all submodules seem dirty because they point to a new commit, unless we somehow skip an empty working-copy commit when we report to the superproject what the submodule's commit is.
If the superproject has a working-copy commit checked out, this doesn't seem like a problem to me since the new submodule commits won't leave the working-copy commit until a jj amend (or similar) in the superproject. I'm somewhat immune to churn in a working-copy commit because it's all meant to be temporary anyway.
I'm more worried about jj edit in the superproject combined with working-copy commits in the submodule, since that makes it pretty easy to modify submodules in commits that aren't temporary.
Another option is to somehow lazily create the new working-copy commits in the submodules only if you modify any files.
I think I like this solution best. I'm thinking the submodule repos wouldn't have any attached working copies (they would be "bare" in Git-speak). Their working-copy files would be tracked in the parent project's working-copy state ("index" in Git-speak). That means that jj log in a submodule would not show where the working-copy is. It might be best to not even allow it. I think it's a little confusing if jj log shows different history if you're inside a submodule anyway.
How does that sound?
FWIW, I run git log in a submodule to sanity check where I'm at and whether recent changes are present after a submodule update. In that mode, I'm interested in the repo/commit history rather than the history of individual files.
I'd be fine with jj log not showing the submodule history in the working copy (honestly, I think it is confusing for the history to change based on the cwd in Git anyway!) but it would be great if there was a way to explicitly review the history for a submodule when needed, e.g. jj log --submodule foo or whatever would fit in consistently with the jj UX.
This topic has turned out to be quite big, so I intend to break this up into smaller bits that we can discuss separately. There are a lot of things we'll have to figure out, but assuming we can't make big changes to the way Git implements submodules, I think these are the big questions:
If these make sense, I'll open those issues (discussions?) separately. Any other big questions I've missed?
If these make sense, I'll open those issues (discussions?) separately.
Sounds good. (I'd make them issues, but discussions would also work.)
What is the on-disk submodules format?
Do you mean for the working copy or the commit storage? Or both? I'm asking now only so you can decide if you want to create one or two separate issues for it (I'm fine with either).
Sounds good. (I'd make them issues, but discussions would also work.)
I think that is what "Projects" is for: you can make a project which groups all relevant issues, and see whether they are done, in progress or not started.
Do you mean for the working copy or the commit storage? Or both?
I think both, but since the two are linked (e.g. Git expects the .git file to point to the gitdir storing the commits), I'll open one discussion for that.
I think that is what "Projects" is for: you can make a project which groups all relevant issues, and see whether they are done, in progress or not started.
That sounds good. Though AFAICT GitHub Projects don't live on a repo (you create a project on your own user account and then link it to one or many repos), and administering both the project and the repo sounds like a pain... I'll fiddle around with it a bit, and if it makes sense, I'll ask @martinvonz to create the project so that at least the repo and project have the same owner.
Description
Submodules are used frequently enough that I think we'll just have to support them.