martinvonz / jj

A Git-compatible VCS that is both simple and powerful
https://martinvonz.github.io/jj/
Apache License 2.0
8.23k stars 274 forks source link

git submodules in colocated repos create weird phantom file creations/deletions #4349

Open dbarnett opened 2 weeks ago

dbarnett commented 2 weeks ago

Description

I found that if I use git submodules in a colocated repo, jj will get confused as I move the working copy around and introduce phantom file creations/deletions.

Steps to Reproduce the Problem

  1. jj git clone git@github.com:insanum/gcalcli.git --colocate
  2. cd gcalcli
  3. git submodule update --init
  4. jj new zrv

Possible repro also involves a .gitkeep file there, or some other peculiarity in that repo.

Expected Behavior

jj shows the new change I just created as "(empty)" with no diff.

Actual Behavior

New change is non-empty and includes a bunch of spurious edits under the submodule paths:

Working copy changes:
A tests/cli/bats/.codespellrc
A tests/cli/bats/.devcontainer/Dockerfile
(...)
A tests/cli/test_helper/bats-support/test/cat
A tests/cli/test_helper/bats-support/test/test_helper.bash

If I try to jj abandon, I keep getting more new changes with spurious edits.

If I then move the working copy back past main, I see similar results with file deletion operations:

$ jj new main
Working copy now at: qonswokm 6fd66464 (empty) (no description set)
Parent commit      : kntwpymn 9b65c9a5 main | (empty) Merge pull request #716 from insanum/metadata
Added 8 files, modified 10 files, removed 415 files
Warning: 3 of those updates were skipped because there were conflicting changes in the working copy.
Hint: Inspect the changes compared to the intended target with `jj diff --from 6fd66464a519`.
Discard the conflicting changes with `jj restore --from 6fd66464a519`.

$ jj status
Working copy changes:
D tests/cli/bats
A tests/cli/bats/test/.bats/run-logs/.gitkeep
D tests/cli/test_helper/bats-assert
D tests/cli/test_helper/bats-support

Specifications

martinvonz commented 2 weeks ago

New change is non-empty and includes a bunch of spurious edits under the submodule paths:

That's working as intended for now. git submodule update --init added a bunch of files that jj didn't expect to see there.

If I try to jj abandon, I keep getting more new changes with spurious edits.

I got one:

$ jj st
Working copy changes:
A tests/cli/bats/test/.bats/run-logs/.gitkeep
Working copy : l 8c0d29a3 (no description set)
Parent commit: z 006c8158 (empty) Merge pull request #712 from insanum/respect_uid

Abandoning again got rid of it.

dbarnett commented 2 weeks ago

I got one ... Abandoning again got rid of it.

Oh interesting, yes abandoning a 2nd time seems to get rid of everything.

But then jj new main after that still always includes deletion operations that I can't abandon.

dbarnett commented 2 weeks ago

Went diving through code to see if I could figure out where to get started fixing this. test_local_working_copy::test_gitsubmodule looks like it might be close to the repro case we'd need, maybe just not quite covering the right case to trigger the bug.

martinvonz commented 2 weeks ago

Indeed, that looks very close. And it does look like we try to ignore submodule directories completely when snapshotting (unlike what I told @arxanas on Discord).

dbarnett commented 2 weeks ago

Yup, it even prints out some messages about "ignoring submodules" somewhere in there. I suspect what's happening is that when I move the working copy back in time, the files would need to both (a) stop being considered as submodules and (b) disappear from the working copy, but something about the sequence of those operations in a colocated repo is stepping on itself.

BTW you mentioned this bug would only exist in colocated repos... is there a way to invoke git submodules in a non-colocated repo? Or were you just saying it's only a bug in colocated repos because that's the only setup where you can invoke submodules at all?

martinvonz commented 2 weeks ago

Or were you just saying it's only a bug in colocated repos because that's the only setup where you can invoke submodules at all?

Yes, that :)

yuja commented 2 weeks ago

I think this is basically the same as un-ignored files problem. jj works as if submodules were .gitignore-d.

btw, is it intentional that snapshot() captures contents in nested repositories? I'm not quite sure about git, but hg ignores nested repos.

dbarnett commented 2 weeks ago

Ok, it makes a little more sense to me now why it does work that way in practice, after spending some time tinkering with submodule cases. In vanilla git, moving the working copy backwards does leave untracked (but unignored) paths in the working copy, so it makes sense why jj would automatically add those like any other file.

What I don't understand is why when I move the working copy back forwards again, jj is showing the directories as explicitly "deleted". It looks like if I jj abandon twice after step 4, jj thinks the working copy is clean but the submodule dirs still exist with tiny .git files (with contents like gitdir: ../../../.git/modules/tests/cli/bats), and things get weird from there if I don't manually delete those files. It's also very confusing that jj shows them as "deleted" directories because of the presence of certain paths...

Anyway, IIUC the the long term plan would be for jj to have its own first-class support for submodules managed by jj vs. git, and it'd just be a nice-to-have for it to not get too confused by git doing its thing in colocated repos?

yuja commented 2 weeks ago

What I don't understand is why when I move the working copy back forwards again, jj is showing the directories as explicitly "deleted".

Perhaps, it's a bug of update() or snapshot(). (I don't know which one to blame.) Because ".git" file isn't snapshotted, the file remains on disk. And the parent directory can't be "replaced" with a submodule entry because there are unmanaged files.