moonrepo / moon

A build system and monorepo management tool for the web ecosystem, written in Rust.
https://moonrepo.dev/moon
MIT License

[bug] Git complains `ambiguous argument 'main': unknown revision` in Gitlab CI pipelines #1038

Closed cherrot closed 1 year ago

cherrot commented 1 year ago

Describe the bug

I'm setting up a GitLab CI pipeline for my repo, with my gitlab-runner registered as a shell executor. When the pipeline runs moon commands, they fail with a git error:

Error:   × Process git failed with a 128 exit code.
  │ 
  │ fatal: ambiguous argument 'main': unknown revision or path not in the
  │ working tree.
  │ Use '--' to separate paths from revisions, like this:
  │ 'git <command> [<revision>...] -- [<file>...]'

If I manually run the same command in the same directory of gitlab-runner's workspace, it passes as expected.

My guess at the cause

moon detects that it's running in a CI job (I found a related PR !420), then tries to list the affected files relative to the target branch by running:

git --no-pager diff --name-status --no-color --relative -z main 

Unfortunately, gitlab-runner doesn't fetch the whole repo with a ref for the target branch (main in my case). Instead, it makes a shallow clone of the source branch and checks out the commit as a detached HEAD:

$ git status
HEAD detached at 085c82d
nothing to commit, working tree clean

Changing the git strategy for gitlab-runner according to its docs (1, 2) doesn't change this behavior.
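To confirm what the runner actually checked out, a throwaway job along these lines might help. This is only a sketch: the job name `inspect-checkout` is hypothetical, and it assumes the same `shell` tag and `test` stage as the configuration further down.

```yaml
inspect-checkout:          # hypothetical job name, for debugging only
  stage: test
  tags: [shell]
  script:
    # Prints "true" when the checkout is a shallow clone
    - git rev-parse --is-shallow-repository
    # Lists every local and remote-tracking branch the runner fetched
    - git branch --all
    # Exits quietly with a non-zero status when "main" cannot be resolved
    - git rev-parse --verify --quiet main || echo "main is not available in this checkout"
```

If the first command prints `true` and the last one reports that main is missing, the diff against main cannot succeed in that workspace.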

My questions

  1. Is there a best practice for getting moon to work with GitLab CI?
  2. Could I somehow disable the CI detection as a workaround?

Environment


```
System:
  OS: Linux 5.10 Ubuntu 22.04.3 LTS 22.04.3 LTS (Jammy Jellyfish)
  CPU: (17) x64 Intel(R) Xeon(R) Gold 6330 CPU @ 2.00GHz
  Memory: 15.96 GB / 17.92 GB
  Container: Yes
  Shell: 5.1.16 - /bin/bash
Binaries:
  Node: 20.5.1 - /opt/node/bin/node
  npm: 9.8.0 - /opt/node/bin/npm
  pnpm: 8.6.7 - /opt/node/bin/pnpm
Managers:
  Apt: 2.4.10 - /usr/bin/apt
  pip3: 23.1.2 - /usr/local/bin/pip3
  RubyGems: 3.3.5 - /usr/bin/gem
Utilities:
  CMake: 3.22.1 - /usr/bin/cmake
  Make: 4.3 - /usr/bin/make
  GCC: 11.4.0 - /usr/bin/gcc
  Git: 2.34.1 - /usr/bin/git
  Curl: 7.81.0 - /usr/bin/curl
Servers:
  Nginx: 1.18.0 - /usr/sbin/nginx
Virtualization:
  Docker: 24.0.5 - /usr/bin/docker
IDEs:
  Vim: 8.2 - /usr/bin/vim
Languages:
  Bash: 5.1.16 - /usr/bin/bash
  Go: 1.21.0 - /opt/go/bin/go
  Perl: 5.34.0 - /usr/bin/perl
  Protoc: 3.12.4 - /usr/bin/protoc
  Python3: 3.10.12 - /usr/bin/python3
  Ruby: 3.0.2 - /usr/bin/ruby
```

Additional context

My .gitlab-ci.yml configuration:

default:
  tags: [shell]
  before_script:
    - export PATH=$PWD/node_modules/.bin:$PATH
    - command -v moon || pnpm install
  cache:
    paths:
      - .moon/cache/

stages:
  - test

lint-and-unittest:
  stage: test
  script:
    - MOON_DEBUG_PROCESS_ENV=true MOON_DEBUG_PROCESS_INPUT=true moon run :lint :format :typecheck :test --log trace --updateCache

Complete pipeline log

```
Running with gitlab-runner 16.3.0 (8ec04662)
on shell runner in workspace -**, system ID: **
Preparing the "shell" executor
Using Shell (bash) executor...
Preparing environment
Running on xxxx...
Getting source from Git repository
Fetching changes with git depth set to 20...
Initialized empty Git repository in xxx/.git/
Created fresh repository.
Checking out 34cd9c7f as detached HEAD (ref is refs/merge-requests/138/head)...
Skipping Git submodules setup
Restoring cache
Checking cache for default-4-non_protected...
Runtime platform arch=amd64 os=linux pid=375880 revision=8ec04662 version=16.3.0
No URL provided, cache will not be downloaded from shared cache server. Instead a local version of cache will be extracted.
Successfully extracted cache
Executing "step_script" stage of the job script
$ export PATH=$PWD/node_modules/.bin:$PATH
$ command -v moon || pnpm install
Scope: all 6 workspace projects
Lockfile is up to date, resolution step is skipped
Progress: resolved 1, reused 0, downloaded 0, added 0
Packages: +1198
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Packages are hard linked from the content-addressable store to the virtual store.
Content-addressable store is at: /data/.pnpm-store/v3
Virtual store is at: node_modules/.pnpm
Progress: resolved 1198, reused 175, downloaded 0, added 167
Progress: resolved 1198, reused 887, downloaded 0, added 885
Progress: resolved 1198, reused 1191, downloaded 0, added 1198, done
devDependencies:
+ @moonrepo/cli 1.12.0
+ @typescript-eslint/eslint-plugin 6.4.1
+ @typescript-eslint/parser 6.4.1
+ eslint 8.48.0
+ eslint-plugin-workspaces 0.9.0
+ git-conventional-commits 2.6.5
+ prettier 3.0.2
+ prettier-plugin-tailwindcss 0.5.3
+ simple-git-hooks 2.9.0
+ tsconfig-moon 1.3.0
+ typescript 5.2.2
. prepare$ simple-git-hooks && ./scripts/install.sh
. prepare: [INFO] Successfully set the pre-commit with command: pnpm moon run :lint :format :typecheck :test --affected --status=staged
. prepare: [INFO] Successfully set the commit-msg with command: pnpm git-conventional-commits commit-msg-hook -c .commitlintrc.yaml "$1"
. prepare: [INFO] Successfully set all git hooks
. prepare: Installing connectrpc.com/connect/cmd/protoc-gen-connect-go to ../node_modules/.bin/protoc-gen-connect-go
. prepare: Installing github.com/golangci/golangci-lint/cmd/golangci-lint to ../node_modules/.bin/golangci-lint
. prepare: Installing github.com/onsi/ginkgo/v2/ginkgo to ../node_modules/.bin/ginkgo
. prepare: Installing golang.org/x/tools/cmd/goimports to ../node_modules/.bin/goimports
. prepare: Installing google.golang.org/protobuf/cmd/protoc-gen-go to ../node_modules/.bin/protoc-gen-go
. prepare: Installing mvdan.cc/gofumpt to ../node_modules/.bin/gofumpt
. prepare: Done
Done in 12.8s
$ MOON_DEBUG_PROCESS_ENV=true MOON_DEBUG_PROCESS_INPUT=true moon run :lint :format :typecheck :test --log trace --updateCache
[DEBUG 2023-09-05 13:57:10 log Running moon v1.12.0 (with xxx/node_modules/.pnpm/@moonrepo+cli@1.12.0/node_modules/@moonrepo/cli/moon) log.target="moon" log.module_path="moon_cli" log.file="crates/cli/src/lib.rs" log.line=64
... omit irrelevant trace logs ...
[DEBUG 13:57:10] moon_process::command_inspector Running command git hash-object --stdin-paths - .moon/tasks/go.yml .moon/tasks/node.yml .moon/tasks/protobuf.yml .moon/tasks/tag-docker.yml .moon/tasks/tag-helm.yml .moon/tasks/tag-next.yml .moon/toolchain.yml .moon/workspace.yml apps/auth-proxy/moon.yml apps/charbench-api/moon.yml apps/oasis-agent/moon.yml apps/oasis-api/moon.yml apps/oasis-mobile/moon.yml apps/oasis-web/moon.yml apps/overtake-api/moon.yml apps/overtake-web/moon.yml deploy/helm-oasis/moon.yml packages/builder/moon.yml packages/common/moon.yml packages/proto/moon.yml env_vars={} working_dir="/xxx"
[TRACE 13:57:10] moon_hash::hasher Created new content hasher label="Project graph"
[TRACE 13:57:10] moon_hash::hasher Adding content to hasher label="Project graph"
[DEBUG 13:57:10] moon_hash::hasher Generated content hash label="Project graph" hash="004cfe064c1bdf99fb88e7664fd5b9966d59ba1cc4dc64acd92d4d112748a291"
[DEBUG 13:57:10] moon_hash::hash_engine Saving hash manifest label="Project graph" manifest="xxx/.moon/cache/hashes/004cfe064c1bdf99fb88e7664fd5b9966d59ba1cc4dc64acd92d4d112748a291.json"
[DEBUG 13:57:10] moon_project_graph::project_graph_builder Generated hash for project graph hash="004cfe064c1bdf99fb88e7664fd5b9966d59ba1cc4dc64acd92d4d112748a291"
[DEBUG 13:57:10] moon_cache_item::cache_item Cache hit, reading item cache="xxx/.moon/cache/states/projects.json"
[DEBUG 13:57:10] moon_project_graph::project_graph_builder Loading project graph with 13 projects from cache cache="xxx/.moon/cache/states/partialProjectGraph.json"
[DEBUG 13:57:10] moon_project_graph::project_graph_builder Enforcing project constraints
[DEBUG 13:57:10] moon_project_graph::project_graph Creating project graph
[DEBUG 13:57:10] log Querying for touched files log.target="moon:query:touched-files" log.module_path="moon_cli::queries::touched_files" log.file="crates/cli/src/queries/touched_files.rs" log.line=37
[DEBUG 13:57:10] moon_process::command_inspector Running command git --version env_vars={} working_dir="xxx"
[DEBUG 13:57:10] moon_process::command_inspector Running command git branch --show-current env_vars={} working_dir="xxx"
[TRACE 13:57:10] log Against remote using base "main" with head "HEAD" log.target="moon:query:touched-files" log.module_path="moon_cli::queries::touched_files" log.file="crates/cli/src/queries/touched_files.rs" log.line=62
[DEBUG 13:57:10] moon_process::command_inspector Running command git merge-base main HEAD env_vars={} working_dir="xxx"
[DEBUG 13:57:10] moon_process::command_inspector Running command git merge-base origin/main HEAD env_vars={} working_dir="xxx"
[DEBUG 13:57:10] moon_process::command_inspector Running command git merge-base upstream/main HEAD env_vars={} working_dir="xxx"
[DEBUG 13:57:10] moon_process::command_inspector Running command git merge-base HEAD env_vars={} working_dir="xxx"
[DEBUG 13:57:10] moon_process::command_inspector Running command git --no-pager diff --name-status --no-color --relative -z main env_vars={} working_dir="xxx"
Error:   × Process git failed with a 128 exit code.
  │
  │ fatal: ambiguous argument 'main': unknown revision or path not in the
  │ working tree.
  │ Use '--' to separate paths from revisions, like this:
  │ 'git <command> [<revision>...] -- [<file>...]'
Cleaning up project directory and file based variables
ERROR: Job failed: exit status 1
```

sybernatus commented 1 year ago

This is more about GitLab's behavior. By default, the runner does not fetch all branches or the full history. You can change this behavior by setting the following parameters in your pipeline:

You can also update them in your GitLab project settings from the UI (Settings > CI/CD > General pipelines).
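For reference, a sketch of how such parameters could be set in `.gitlab-ci.yml`. It assumes the two variables named in the reply below (`GIT_STRATEGY` and `GIT_DEPTH`); the exact values originally suggested are not shown here, so treat these as an illustration only.

```yaml
variables:
  # Assumption: "fetch" reuses the existing workspace; "clone" starts from scratch each time
  GIT_STRATEGY: fetch
  # A depth of 0 turns off shallow cloning, so the full history is fetched
  GIT_DEPTH: "0"
```

Placed at the top level of the file, these apply to every job in the pipeline.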

cherrot commented 1 year ago

Thanks. Currently I'm using GIT_DEPTH: 0 (leaving GIT_STRATEGY at its default value, fetch) to bypass this issue. But I'm wondering whether this is good practice. According to GitLab's doc Optimize GitLab for large repositories:

  • Always fetch incrementally. Do not clone in a way that results in recreating all of the worktree.
  • Always use shallow clone to reduce data transfer. Be aware that this puts more burden on GitLab instance due to higher CPU impact.

GitLab and GitLab Runner perform a shallow clone by default.

Ideally, you should always use GIT_DEPTH with a small number like 10. This instructs GitLab Runner to perform shallow clones. Shallow clones make Git request only the latest set of changes for a given branch, up to desired number of commits as defined by the GIT_DEPTH variable.

This significantly speeds up fetching of changes from Git repositories, especially if the repository has a very long backlog consisting of number of big files as we effectively reduce amount of data transfer.

The following example makes the runner shallow clone to fetch only a given branch; it does not fetch any other branches nor tags.

variables:
  GIT_DEPTH: 10

test:
  script:
    - ls -al

milesj commented 1 year ago

For moon to work correctly in CI (or any build system/task runner), it requires the entire git history. The reason is that it needs to compare across branches, and especially to diff between HEAD and main/master. This isn't possible with a shallow clone, and using a small depth number isn't ideal either, because the difference could be 11 commits, or 100, or even 1000. It's impossible to know in advance, so grabbing the entire history is easiest.

The only alternative is to use moon run instead of moon ci and do the diffing manually yourself, assuming you know all the branch constraints.
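One way to reconcile this with the shallow-clone advice quoted earlier might be to keep a shallow default and request the full history only for the job that runs moon. A minimal sketch, assuming that job-level `variables` override the global ones and reusing the job name from the configuration in the original report:

```yaml
variables:
  GIT_DEPTH: "10"          # shallow clone stays the default for all other jobs

lint-and-unittest:
  stage: test
  variables:
    GIT_DEPTH: "0"         # full history only where moon has to diff against main
  script:
    - moon run :lint :format :typecheck :test
```

This limits the extra fetch cost to a single job, at the price of a slower checkout for that job on large repositories.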

cherrot commented 1 year ago

Actually I am using moon run in my CI job:

moon run :lint :format :typecheck :test

If I run this command manually in the shallow-cloned workspace of the GitLab CI job, it passes with no errors.

milesj commented 1 year ago

You would need to use a combination of --affected and --remote for this to work correctly: https://moonrepo.dev/docs/commands/run#affected

Otherwise this will always run, or never run, depending on the state of the checkout.
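As a sketch of what that could look like in the job from the original report (this assumes the target branch and enough history are available in the checkout, which a shallow clone does not guarantee):

```yaml
lint-and-unittest:
  stage: test
  script:
    # --affected restricts the run to targets affected by changed files;
    # --remote determines those changes by comparing HEAD against the base branch
    - moon run :lint :format :typecheck :test --affected --remote
```

Without `--remote`, the affected check is based on local changes, which are normally empty in a fresh CI checkout.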

cherrot commented 1 year ago

Thanks. Currently I'm using GIT_DEPTH: 0 (leaving GIT_STRATEGY at its default value, fetch) to bypass this issue.

In case anyone else is interested in this topic, I'd recommend just disabling the shallow clone feature :)