ingydotnet / git-subrepo

Sparse checkout and .gitignore not applying #624

Open JWCS opened 2 weeks ago

JWCS commented 2 weeks ago
  o "Put remote subrepo content into '$subdir/'."
  RUN git read-tree --prefix="$subdir" -u "$subrepo_commit_ref"

https://github.com/ingydotnet/git-subrepo/blob/73a01294bf1bef2e746d6c4c3054decf79dafa8f/lib/git-subrepo#L872

For the record, what I want is some form of partial/sparse cloning of subrepos. I had thought this was working (see #551), but I belatedly discovered that the entire original repo is still included in the commit, and will subsequently show up on the machines of others who have not added these sparse-checkout lines. (It only seemed to work locally.)

The read-tree line quoted above, specifically, is the issue for both ways of ignoring/excluding files (.gitignore and sparse-checkout).

With -u, the check for ignored files (via .gitignore) is bypassed, and the whole tree is loaded into the subdir. This can be worked around; for example:

git read-tree --prefix="$subdir" -u "$subrepo_commit_ref"
git restore --staged "$subdir" && git add "$subdir"
git commit ...
git clean -dxf "$subdir"
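To make the workaround above easier to verify, here is a minimal throwaway reproduction. It is only a sketch: the repository names (`upstream`, `parent`), the subdir `sub`, and the files `keep.txt` and `debug.log` are all hypothetical, and it assumes a git new enough to have `git restore` (2.23+). It is not how git-subrepo itself does things.

```shell
#!/bin/sh
# Throwaway reproduction; every path and file name here is hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# "Remote" repository holding one wanted file and one unwanted file.
git init -q upstream
(
  cd upstream
  git config user.name demo
  git config user.email demo@example.com
  echo data > keep.txt
  echo noise > debug.log
  # -f in case a global excludes file already ignores *.log
  git add -f keep.txt debug.log
  git commit -q -m "upstream content"
)

# Parent repository whose .gitignore excludes *.log.
git init -q parent
cd parent
git config user.name demo
git config user.email demo@example.com
echo '*.log' > .gitignore
git add .gitignore
git commit -q -m "add .gitignore"
git fetch -q ../upstream HEAD

# read-tree stages the whole tree; .gitignore is not consulted.
git read-tree --prefix=sub -u FETCH_HEAD
staged_before=$(git ls-files sub)

# Unstage the subdir, then re-add it so that .gitignore is honoured.
git restore --staged sub
git add sub
staged_after=$(git ls-files sub)

printf 'before:\n%s\nafter:\n%s\n' "$staged_before" "$staged_after"
```

If the behaviour described above holds, the "before" list should contain both files and the "after" list only `sub/keep.txt`. The ignored file is still on disk at that point, which is what the final `git clean -dxf "$subdir"` step removes.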

The problem with .gitignore is that a higher-up .gitignore or other rule can easily hide files that are actually wanted, so it's not a great default setting. But if obeying .gitignore were made an optional setting, that choice would have to be remembered for future pulls... which is a bigger can of worms.

For sparse-checkout, with a sparse pattern like the one mentioned in #551, after read-tree -u the only files that show up in ls -a "$subdir" are those matching the sparse pattern, but unfortunately all the files are added to the index and are therefore invisibly committed. When I instead run read-tree without -u (perhaps I've messed up the rules), all the subdir files end up in the index while showing as deleted (unstaged) in the working tree, and there is no clear way to add them such that the original intent is maintained.
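One way to spot the "invisibly committed" files is to list index entries that have no file on disk. The helper below is a hypothetical sketch (the name `index_only_paths` is made up for illustration); it assumes it is run from the top of the working tree and that paths contain no newlines or characters that `git ls-files` would quote.

```shell
# Hypothetical helper: print paths under "$1" that are in the index
# but have no corresponding file in the working tree.
index_only_paths() {
  git ls-files -- "$1" | while IFS= read -r path; do
    [ -e "$path" ] || printf '%s\n' "$path"
  done
}

# Usage (run from the repository root):
#   index_only_paths "$subdir"
```

As a cross-check, `git ls-files -t "$subdir"` prefixes each entry with a status tag, and entries hidden by sparse-checkout carry the `S` (skip-worktree) tag.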

In hindsight, sparse-checkout doesn't seem to be the tool for preventing files from being added in the first place; that's .gitignore. And as finicky as it may be, there doesn't seem to be any other clean solution. As a temporary stop-gap, this is what I'm using to get filtered commits (although, for our team/deps, it is wrapped up in some machinery):

git subrepo clone REMOTE SUBDIR --branch SHA
git ls-files SUBDIR | xargs git check-ignore --no-index | xargs git rm ; git commit --amend --no-edit
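The one-liner above splits paths on whitespace and, with GNU xargs, still runs `git rm` with no arguments when nothing is ignored. A slightly more defensive variant might look like the sketch below. The function name `prune_ignored` is hypothetical, and it assumes an `xargs` that supports `-0` (GNU and BSD both do, though it is not strictly POSIX).

```shell
# Hypothetical helper: remove tracked files under "$1" that match the
# ignore rules, from both the index and the working tree.
prune_ignored() {
  subdir=$1
  ignored=$(mktemp)
  # --no-index makes check-ignore report paths even though they are tracked;
  # -z keeps paths containing spaces intact. check-ignore exits 1 when
  # nothing matches, which is not an error here.
  git ls-files -z -- "$subdir" |
    git check-ignore --no-index --stdin -z > "$ignored" || true
  # Only call git rm when at least one ignored path was found.
  if [ -s "$ignored" ]; then
    xargs -0 git rm -q -- < "$ignored"
  fi
  rm -f "$ignored"
}

# Usage, mirroring the stop-gap above:
#   git subrepo clone REMOTE SUBDIR --branch SHA
#   prune_ignored SUBDIR && git commit --amend --no-edit
```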