kubernetes / git-sync

A sidecar app which clones a git repo and keeps it in sync with the upstream.
Apache License 2.0
2.14k stars 409 forks source link

Clone the repo to a subdirectory under --root #821

Closed nan-yu closed 8 months ago

nan-yu commented 9 months ago

The git repo is currently cloned into the --root directory. If any git command fails, an error file will be generated under --root. This is problematic because it removes all contents under --root if git clone fails, including the error file. The presence of the error file indicates a prior failure. If git clone is very slow and fails again, the fact that the error file is deleted incorrectly suggests the previous clone was successful.

We can't simply skip removing the error file because git clone requires the destination directory to be empty. This commit creates a subdirectory under --root with the repo name, and clones the repo to the subdirectory. The error file is peer to the subdirectory. With this approach, if git clone fails, it just deletes the subdirectory, with the error file is retained. The symlink still points to the worktree, with an extra subdirectory in the path.

An alternative is to make the error file an absolute path, instead of relative to --root. However, it is possible that the absolute path is under --root. Then it requires additional validation or error handling. Therefore, this approach is not chosen.

k8s-ci-robot commented 9 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nan-yu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/kubernetes/git-sync/blob/release-3.x/OWNERS)~~ [nan-yu] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment
nan-yu commented 9 months ago

If the error file is in the --root, we can never call git clone again - it won't run unless the directory is empty.

Am I missing something?

If your e2e had a second git-sync which is expected to pass, I think it will fail.

Oh, right, git clone requires the destination to be empty. 😞 I think the new test didn't fully cover the scenario.

I updated the PR to add a subdirectory under --root, so we can delete the subdirectory and keep the error file if git clone fails. I also refined the e2e test to check the error file changes with continuous git-sync loops.

nan-yu commented 9 months ago

The simpler answer would be to not write the error into --root, wouldn't it?

I considered that solution. If --error-file is not under --root, then it needs to be an absolute path, instead of relative to --root. Given that the full path is provided by users, it is likely the absolute path is still under --root. We'll need to add validation and error handling for --error-file.

nan-yu commented 9 months ago

Before considering making such a change to v3, can we evaluate what v4 (master branch) would look like? It doesn't use git-clone anymore - it uses git-fetch, which I think will not have this problem.

I just noticed the first official v4 release was out last month. Sure, we can definitely evaluate the possibility to use v4.

nan-yu commented 8 months ago

Abandon the change given that v3 is in maintenance mode.