sourcegraph / sourcegraph-public-snapshot


git corruption from running `git upload-pack` in a repo while `p4-fusion` is running #57023

Open peterguy opened 1 year ago

peterguy commented 1 year ago

Overview

Similar to #56978: if `git upload-pack` runs in a repo while `p4-fusion` is actively cloning it, the repo can become corrupted.

Executors call `git upload-pack` when they need to clone a repo to run a job. The executor runs `git fetch`, which goes through its git proxy and turns into `info/refs` and `upload-pack` requests. `git upload-pack` spawns a `git pack-objects` process, which disrupts the refs so that `p4-fusion` can't continue.

================

From @bfurtonmw:

gitserver itself has an HTTP endpoint like this: http://gitserver-0.gitserver/{repo/path}/git-upload-pack

From the source, this endpoint runs git upload-pack, which calls git pack-objects under the hood.

When git gc runs, it also calls git pack-objects under the hood via git repack.

Here's a cleaned-up ps sample showing git gc and git upload-pack running together, albeit in different folders:

PID   PPID  COMMAND
    1     0 /sbin/tini -- /gitserver run
    7     1 /gitserver run
24557     7 git -c uploadpack.allowFilter=true -c uploadpack.allowAnySHA1InWant=true -c pack.windowMemory=100m upload-pack --stateless-rpc --strict /data/repos/p4/Repo_A/.git
24558 24557 /usr/libexec/git-core/git --shallow-file  pack-objects --revs --thin --stdout --shallow --delta-base-offset
25030     7 git -c gc.auto=1 -c gc.autoDetach=false gc --auto
25033 25030 /usr/libexec/git-core/git repack -d -l --no-write-bitmap-index
25034 25033 /usr/libexec/git-core/git pack-objects --local --delta-base-offset /data/repos/p4/Repo_B/.git/objects/pack/.tmp-25033-pack --keep-true-parents --honor-pack-keep --non-empty --all --reflog --indexed-objects --unpacked --incremental

In theory, the three running together in the same folder could conflict with each other.
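
To spot the overlap described above, a `ps` sample in that PID/PPID/COMMAND layout could be scanned for repos that have more than one top-level git process working in them at once. The following is a minimal diagnostic sketch, not part of gitserver: the function names are hypothetical, and it assumes that a repo path ending in `/.git` shows up somewhere in the command line of each git process tree, as it does in the sample.

```python
import re
from collections import defaultdict

# Assumption: repo directories appear in command lines as paths ending in "/.git".
REPO_RE = re.compile(r"(/\S+?/\.git)(?=/|\s|$)")


def parse_ps(text):
    """Parse 'PID PPID COMMAND' lines into {pid: (ppid, command)}."""
    procs = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue  # header or malformed line
        procs[int(parts[0])] = (int(parts[1]), parts[2])
    return procs


def is_git(cmd):
    """True if the executable is git (either 'git' or '/path/to/git')."""
    return cmd.split()[0].rsplit("/", 1)[-1] == "git"


def top_git_ancestor(pid, procs):
    """Walk up the process tree while the parent is also a git process."""
    top = pid
    while True:
        ppid = procs[top][0]
        if ppid in procs and is_git(procs[ppid][1]):
            top = ppid
        else:
            return top


def repos_with_concurrent_git(text):
    """Return {repo_path: [top-level git PIDs]} for repos with more than one."""
    procs = parse_ps(text)
    repo_of_group = {}
    for pid, (_, cmd) in procs.items():
        if not is_git(cmd):
            continue
        match = REPO_RE.search(cmd)
        if match:
            # Attribute the repo to the whole tree (e.g. gc -> repack -> pack-objects).
            repo_of_group[top_git_ancestor(pid, procs)] = match.group(1)
    by_repo = defaultdict(list)
    for top, repo in repo_of_group.items():
        by_repo[repo].append(top)
    return {repo: sorted(tops) for repo, tops in by_repo.items() if len(tops) > 1}
```

Fed the sample above, this returns an empty dict because the two process trees touch different repos; it would only report a repo if, for example, the `git gc` tree and the `git upload-pack` tree both pointed at `Repo_A`.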

================

@bfurtonmw also has steps to clean up the corruption manually, after which p4-fusion can be run again:

> cat .git/packed-refs
# pack-refs with: peeled fully-peeled sorted
6adcbcd19a2eaaf99bd1ab4583371345afd4a96f refs/heads/master
> echo 6adcbcd19a2eaaf99bd1ab4583371345afd4a96f > .git/refs/heads/master
> git --work-tree=${PWD} reset --hard master
> git --work-tree=${PWD} gc
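
The first two manual steps above (read the SHA from `packed-refs`, write it back as a loose ref) could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, it only covers the ref-restoring part and not the `reset --hard` or `gc` steps, and it should only be run while nothing else is writing to the repo.

```python
from pathlib import Path


def parse_packed_refs(text):
    """Parse packed-refs content into {ref_name: sha}.

    Lines starting with '#' are headers and lines starting with '^' are
    peeled tag targets; both are skipped here.
    """
    refs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("^"):
            continue
        sha, _, name = line.partition(" ")
        refs[name] = sha
    return refs


def restore_loose_ref(git_dir, ref_name="refs/heads/master"):
    """Rewrite the loose ref file from the value recorded in packed-refs."""
    git_dir = Path(git_dir)
    refs = parse_packed_refs((git_dir / "packed-refs").read_text())
    sha = refs[ref_name]  # raises KeyError if the ref is not packed
    loose = git_dir / ref_name
    loose.parent.mkdir(parents=True, exist_ok=True)
    loose.write_text(sha + "\n")
    return sha
```

For the repo in the example, `restore_loose_ref(".git")` would write `6adcbcd19a2eaaf99bd1ab4583371345afd4a96f` to `.git/refs/heads/master`, mirroring the `echo` command.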

Steps to reproduce:

Reproducing this behavior currently depends on jobs happening to line up, so it takes some trial and error (a way to reproduce it manually is in the works).

  1. Set up a Perforce code host with some depots in it. Use `fusionClient` with settings that cause frequent small updates instead of infrequent large ones:
    "fusionClient": {
      "enabled": true,
      "fsyncEnable": true,
      "maxChanges": 32,
      "lookAhead": 16,
      "includeBinaries": false,
      "networkThreads": 16,
      "networkThreadsFetch": 16,
      "retries": 10,
      "printBatch": 10
    }
  2. When the depot is synced, start browsing the repo. Code intelligence will eventually launch a job that runs `git upload-pack`.
  3. Keep an eye on the processes in gitserver - when you see both processes running, check for repo corruption.
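
For the "check for repo corruption" part of step 3, one option is to run `git fsck` against the repo and look at its exit status. The issue does not say which check was used, so treat this as an assumption; the helper name and the injectable `run` parameter are hypothetical, and `git fsck` may not catch every symptom of the ref disruption described here.

```python
import subprocess


def repo_is_healthy(git_dir, run=subprocess.run):
    """Run `git fsck` on git_dir and return (ok, stderr).

    `run` is injectable so the function can be exercised without git installed.
    """
    result = run(
        ["git", f"--git-dir={git_dir}", "fsck"],
        capture_output=True,
        text=True,
    )
    # git fsck exits non-zero when it finds errors.
    return result.returncode == 0, result.stderr
```

A call such as `repo_is_healthy("/data/repos/p4/Repo_A/.git")` could be made each time both processes are seen running in the same repo.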

Expected behavior:

No issues with the repo.

Actual behavior:

The repo gets corrupted.

/cc @sourcegraph/source

github-actions[bot] commented 1 year ago

Hey, @sourcegraph/code-search - Batch Changes has been mentioned. Let's take a look.