newren / git-filter-repo

Quickly rewrite git repository history (filter-branch replacement)

Unable to create 'myrepo.git/./packed-refs.lock': File exists. Confirmed that ubuntu volume is case sensitive #590

Open harshita-gupta opened 3 months ago

harshita-gupta commented 3 months ago

trying to run git-filter-repo on a large repo:

ubuntu@ip-xxx:/data3/myrepo.git$ git filter-repo --strip-blobs-bigger-than 1000K
Processed 3985545 blob sizes
Parsed 2024214 commits
New history written in 13118.31 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
fatal: Unable to create '/data3/myrepo.git/./packed-refs.lock': File exists.

Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
fatal: failed to run pack-refs
Completely finished after 13153.73 seconds.

I verified that the volume is case sensitive via:

ubuntu@ip-xxx:/data3/myrepo.git$ touch helloworld
ubuntu@ip-xxx:/data3/myrepo.git$ less HellOWORLD
HellOWORLD: No such file or directory

newren commented 3 months ago

Ooh, this is interesting; I haven't seen one of these before. We've had lots of reports of failure to lock refs, but always with branch/tag names, never the packed-refs file. Also, the ./ within the paths is really weird. So, a few questions:

harshita-gupta commented 3 months ago
  • Are any git processes running on your machine right now? If so, can you kill them all and ensure no git processes are running?

ps auxr | grep -i git shows no results

  • Are there any (leftover) .lock files within your myrepo.git directory? (Find them with find myrepo.git -name "*.lock")

find myrepo.git -name "*.lock" output is empty

  • What are all the refs in this repository (i.e. output from git show-ref)?

there's a LOT of refs, so I won't copy-paste the output here. git show-ref | wc -l yields 2324620. This is a really large repo that I'm trying to clean up, for obvious reasons :) I created a clone from github using --mirror, and am running git-filter-repo on it

newren commented 3 months ago
  • What are all the refs in this repository (i.e. output from git show-ref)?

there's a LOT of refs, so I won't copy-paste the output here. git show-ref | wc -l yields 2324620. This is a really large repo that I'm trying to clean up, for obvious reasons :) I created a clone from github using --mirror, and am running git-filter-repo on it

Yeah, 2 million refs is quite a few. How about git show-ref | grep -i packed-refs? Does that show any hits, by chance?

harshita-gupta commented 3 months ago

How about git show-ref | grep -i packed-refs? Does that show any hits, by chance?

Unfortunately nothing :(

harshita-gupta commented 3 months ago
  • What are all the refs in this repository (i.e. output from git show-ref)?

there's a LOT of refs, so I won't copy-paste the output here. git show-ref | wc -l yields 2324620. This is a really large repo that I'm trying to clean up, for obvious reasons :) I created a clone from github using --mirror, and am running git-filter-repo on it

Yeah, 2 million refs is quite a few.

and I'm realizing it's worth calling out: this is the result of wc -l on git show-ref after the git-filter-repo operation fails. Notably, this is after git-filter-repo has logged "New history written in 13118.31 seconds; now repacking/cleaning...". I'm guessing that "Repacking your repo and cleaning out old unneeded objects" failed, which is what yields that awful 2M refs number

When I run git show-ref | wc -l on a --mirror clone of my repo that I have not run git-filter-repo on, I get 308708

harshita-gupta commented 3 months ago

I'm interested in both a short-term and a long-term fix here. I'd love to see this issue resolved, but I'm also curious if there's a set of 'vanilla git' operations I can run here that are equivalent to the failed steps in git-filter-repo? It looks like a lot of the rewriting did succeed, but something in a cleanup/repack step went wrong? If I can manually execute those pieces after the failure, it'll help me continue with evaluating/testing git-filter-repo while we figure out what the durable fix is

newren commented 3 months ago

So, after a bit of digging:

fatal: failed to run pack-refs

This comes from the gc step:

$ git grep 'failed to run %s'     
builtin/gc.c:#define FAILED_RUN "failed to run %s"
$ git grep 'FAILED_RUN.*pack-refs'
builtin/gc.c:           die(FAILED_RUN, "pack-refs");

As such, it did complete the import and was just "doing cleanup". The cleanup commands consist of running git reflog expire --expire=now --all (and it ran to completion already for you), git gc --prune=now (which hit the error when it was trying to run git pack-refs --all --prune as a sub-command), and git reset --hard. If it had successfully run all three commands (as opposed to just the first), literally the next thing was printing "Completely finished after {:.2f} seconds". And a gc can be repeated with no problem, so to pick up where it left off, just run:

git gc --prune=now
git reset --hard

However, as far as long term solutions, from every angle I look at this the only thing I can imagine is an ill-timed simultaneous git process of some sort running in the repo while it was trying to repack. That could certainly trigger this problem. The best way to tell would be trying to redo the filter-repo in another fresh clone, and be careful to not allow other git processes (anything not started by the git filter-repo process or its child processes) to run at the same time, and see if you can duplicate. I know this takes over three hours even after you've got the clone made, but I'm really curious if this is reproducible.
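
For anyone wanting to replay the cleanup by hand, the three steps described above can be sketched as a small shell function. This is only a sketch: finish_filter_repo_cleanup is a hypothetical name, not something git-filter-repo provides, and it assumes a git new enough to support -C. Repeating the reflog expiry is harmless even though it already completed, and the reset is skipped in bare repositories because it needs a work tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of git-filter-repo): replay the cleanup
# steps that the thread says run after the history rewrite.
finish_filter_repo_cleanup() {
    repo=${1:-.}

    # Step 1: expire all reflog entries (idempotent, safe to repeat).
    git -C "$repo" reflog expire --expire=now --all || return 1

    # Step 2: the step that failed in the report; gc can simply be re-run.
    git -C "$repo" gc --prune=now || return 1

    # Step 3: reset --hard needs a work tree, so only run it in non-bare repos.
    if [ "$(git -C "$repo" rev-parse --is-bare-repository)" = "false" ]; then
        git -C "$repo" reset --hard
    fi
}
```

Usage would be something like finish_filter_repo_cleanup /data3/myrepo.git, run only once you are sure no other git process is touching the repository.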

harshita-gupta commented 3 months ago

thank you, this is very helpful!

an ill-timed simultaneous git process of some sort running in the repo while it was trying to repack

what might lead to this? I know that there's no user-triggered git processes running in this repo; are you imagining it's some background process like git maintenance?

harshita-gupta commented 3 months ago

ooh, fascinating. I tried to re-run git gc and got:

/data3/myrepo.git$ git gc --prune=now

fatal: unable to write new packed-refs: unable to create file /data3/myrepo.git/./packed-refs.new: File exists
fatal: failed to run pack-refs

running rm /data3/myrepo.git/./packed-refs.new seems to fix it.
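
Since the earlier search only looked for "*.lock" and the leftover here turned out to be packed-refs.new, a slightly wider search may be worth having. A minimal sketch, assuming a git recent enough to support rev-parse --absolute-git-dir; list_stale_ref_files is a hypothetical helper name. It only lists candidates and deliberately does not delete anything, since removing such files is only safe when no git process is running.

```shell
#!/bin/sh
# Hypothetical helper: list leftover *.lock and *.new files inside a
# repository's git directory (for example packed-refs.lock, packed-refs.new).
list_stale_ref_files() {
    repo=${1:-.}
    # Resolve the git directory; works for both bare and non-bare repos.
    git_dir=$(git -C "$repo" rev-parse --absolute-git-dir) || return 1
    find "$git_dir" -type f \( -name '*.lock' -o -name '*.new' \) -print
}
```

Empty output means no leftover files of either kind were found.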

I was able to successfully run git gc:

/data3/myrepo.git$ git gc --prune=now
Enumerating objects: 18307177, done.
Counting objects: 100% (18307177/18307177), done.
Delta compression using up to 32 threads
Compressing objects: 100% (3339889/3339889), done.
Writing objects: 100% (18307177/18307177), done.
Selecting bitmap commits: 2013396, done.
Building bitmaps: 100% (707/707), done.
Total 18307177 (delta 14460569), reused 18280847 (delta 14435468), pack-reused 0
Collecting referenced commits: 2011185, done.

but since this is a mirror clone, git reset --hard fails with fatal: this operation must be run in a work tree

at the end of this, git-sizer and git show-ref | wc -l show that there's still 2M refs, which is much higher than the original number of refs in the repo. Any idea why git-filter-repo would balloon the number of refs in this way?

newren commented 3 months ago

an ill-timed simultaneous git process of some sort running in the repo while it was trying to repack

what might lead to this? I know that there's no user-triggered git processes running in this repo; are you imagining it's some background process like git maintenance?

If there's no user-triggered git processes running, then yeah I'd be expecting a system-triggered git process to do it. git maintenance, if you have it configured, is definitely a possibility. Any other git commands that might be run via cron (particularly, git-pack-refs, git-repack, and git-gc related ones) might also factor in. If you think this is unlikely, trying to redo the filtering in a new clone of this repo would be a good way to check; if it doesn't hit the same error, then I think this was likely the cause. If it does hit the same error, either you got unlucky with e.g. git maintenance timing again, or, more likely, there is something weird in git-filter-repo or its subprocesses that somehow is trying to lock the packed-refs file twice. But, if something really did try to lock the packed-refs file twice, I would think I'd have another bug report about it by now, and yours is the first. But maybe it depends on repository shape in some way? Personally, I think it more likely that you have/had some other git process running simultaneously on your system not started by git-filter-repo, but I don't know your system or what's running on it, so I need you to sanity check that hypothesis.
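
One rough way to sanity check that hypothesis is sketched below. It is not exhaustive: check_background_git is a hypothetical name, it only looks at the process list, the gc/maintenance config keys visible to the repo, and the current user's crontab, and scheduled maintenance may be set up through other mechanisms on your system. Note also that ps auxr, as used earlier in the thread, only lists processes in the running state, so a sleeping git process could be missed; the sketch lists all processes instead.

```shell
#!/bin/sh
# Hypothetical helper: look for signs of git activity not started by
# git-filter-repo. Each check prints a fallback message when nothing matches.
check_background_git() {
    repo=${1:-.}

    # All processes regardless of state; '[g]it' keeps grep from matching itself.
    ps -eo pid,args | grep -i '[g]it' || echo "no git processes found"

    # Any gc.* or maintenance.* settings that apply to this repository.
    git -C "$repo" config --get-regexp '^(gc|maintenance)\.' ||
        echo "no gc/maintenance config set"

    # Git-related entries in the current user's crontab, if there is one.
    crontab -l 2>/dev/null | grep -i git ||
        echo "no git entries in user crontab"
}
```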

but since this is a mirror clone, git reset --hard fails with fatal: this operation must be run in a work tree

Sorry, I forgot you mentioned this was a mirror clone. The git reset --hard is conditional, and only runs in non-bare repos, but mirror clones are bare. So, the git gc --prune=now was the only other thing that needed to run, and you've now run it, so you should be good to go.

at the end of this, git-sizer and git show-ref | wc -l show that there's still 2M refs, which is much higher than the original number of refs in the repo. Any idea why git-filter-repo would balloon the number of refs in this way?

Yes, older versions of git-filter-repo by default would create a replace ref for each rewritten commit, which provided a way for you to feed old (unabbreviated) commit IDs to the git-cli and have git translate that to the new commit IDs. These replace refs aren't pushed by default (because only refs/heads/ and refs/tags/ are pushed by default, not refs/replace/), and existing git forges ignore them, so they are only useful locally. I found most folks ended up not using them. Anyway, if you nuke refs/replace/, you should see the expected number of refs.
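
A minimal sketch of how that could look with plain git, assuming a git with for-each-ref and update-ref --stdin. The two function names are hypothetical. The first shows where the refs are concentrated (it should show refs/replace dominating), and the second deletes everything under refs/replace/ in a single update-ref invocation rather than one process per ref, which matters with a couple of million refs.

```shell
#!/bin/sh
# Hypothetical helper: count refs per top-level namespace
# (refs/heads, refs/tags, refs/replace, ...).
count_refs_by_namespace() {
    git -C "${1:-.}" for-each-ref --format='%(refname)' |
        cut -d/ -f1-2 | sort | uniq -c
}

# Hypothetical helper: delete every ref under refs/replace/ by feeding
# "delete <refname>" lines to a single update-ref process.
delete_replace_refs() {
    repo=${1:-.}
    git -C "$repo" for-each-ref --format='delete %(refname)' refs/replace/ |
        git -C "$repo" update-ref --stdin
}
```

Running count_refs_by_namespace before and after delete_replace_refs should show the refs/replace line disappear while the other namespaces stay unchanged.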