rycus86 / githooks

Githooks: per-repo and global Git hooks with version control
MIT License
382 stars, 20 forks

Argument list too long #166

Open mjk-gh opened 1 month ago

mjk-gh commented 1 month ago

When adding a large number of files and directories to an existing (nearly empty) git repository, "git commit -a" leads to githooks complaining about "Argument list too long".

$ git add .
$ git commit -a -m "Import V7" 
/home/mjk/.githooks/release/base-template.sh: 856: git: Argument list too long
/home/mjk/.githooks/release/base-template.sh: 250: git: Argument list too long
/home/mjk/.githooks/release/base-template.sh: 255: git: Argument list too long
/home/mjk/.githooks/release/base-template.sh: 856: git: Argument list too long
/home/mjk/.githooks/release/base-template.sh: 250: git: Argument list too long
/home/mjk/.githooks/release/base-template.sh: 255: git: Argument list too long
/home/mjk/.githooks/release/base-template.sh: 856: git: Argument list too long
/home/mjk/.githooks/release/base-template.sh: 250: git: Argument list too long
/home/mjk/.githooks/release/base-template.sh: 255: git: Argument list too long
[master 213c872] Import V7
 4196 files changed, 3620500 insertions(+)
[...]
$ fd -t d -E .git | wc -l  # number of directories (exclude ".git")
443
$ fd -t f -E .git | wc -l  # number of files (exclude ".git")
4154
$ getconf ARG_MAX
2097152
$ git hooks version

Githooks - https://github.com/rycus86/githooks
----------------------------------------------

Version: 2303.282335-a988d0
Commit: a988d08 (Only print skipping disabled hooks once a day :sparkles:, 2023-03-28)

I am not sure whether that error message comes from git or from the shell, but maybe it can be prevented by patching githooks: either by using something like xargs, or by storing the argument list in a temporary file. Or maybe this Stack Overflow answer points in the right direction:

------------ snip ------------ The "right answer", though, is probably to have the software itself be smarter. There's no need to invoke git add this particular way, with every file listed as one big argv vector. Indeed, instead of invoking git add directly, a program should probably be invoking git update-index instead. A Python program should probably be using update-index with the --stdin and -z flags (and any other flags as appropriate for this particular function's intended usage). ------------ snip ------------
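In shell, the approach from the quoted answer might look roughly like the sketch below: paths travel over a pipe as NUL-separated records instead of being placed on a command line. The function name stage_untracked_via_stdin is made up for illustration, and this only covers untracked, non-ignored files; it is not taken from githooks.

```shell
#!/bin/sh
# Hypothetical helper: stage every untracked, non-ignored file of the
# current repository without passing any path as a command-line argument.
stage_untracked_via_stdin() {
    # ls-files -z prints NUL-terminated paths; update-index reads them
    # from stdin (--stdin) in the same NUL-terminated format (-z).
    git ls-files -z --others --exclude-standard |
        git update-index --add -z --stdin
}

# Usage (inside a work tree):
#   stage_untracked_via_stdin
```

Because the path list never becomes part of an argv vector, its total length does not count against the argument limits of the exec call.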

rycus86 commented 1 month ago

I've seen this log as well; it might be interesting to handle this edge case better. It's interesting which lines the messages refer to, though, as those seem to be doing:

SHARED_HOOKS=$(git config --local --get-all githooks.shared)
SHARED_HOOKS=$(git config --global --get-all githooks.shared)
LAST_UPDATE=$(git config --global --get githooks.autoupdate.lastrun)

These seem unrelated to the number of files we're about to process. 🤔 At any rate, we should be able to craft a test case that adds too many files then assert that we can commit them while running githooks without seeing those error messages.

Is this something you might want to attempt?

mjk-gh commented 1 month ago

I am also puzzled as to why the error messages occur in these lines ... prepending "echo " to the "git config ..." calls moves the "Argument list too long" to different lines, namely to lines 650, 550, and 551:

git config --get-all githooks.sharedHooksUpdateTriggers | grep -q "$HOOK_NAME" && RUN_UPDATE="true"

.

> At any rate, we should be able to craft a test case that adds too many files then assert that we can commit them while running githooks without seeing those error messages.
>
> Is this something you might want to attempt?

Low on resources (as always :-), but I might try to at least assist.

On my system (Devuan, 64 bit, zsh), the error message occurs when the combined length of the filenames and (presumably) the number of space characters between them in (whatever) argument list is greater than or equal to

255*511+223 + 511 (spaces) = 131039

No idea why it is 33 bytes short of 131072 (2^17), maybe just coincidence?
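For reference, the arithmetic above can be re-checked with plain shell arithmetic. The variable names are only illustrative.

```shell
#!/bin/sh
# 511 file names of 255 characters each, plus one name of 223 characters.
names=$((255 * 511 + 223))
# One separator between each pair of the 512 names.
separators=511
total=$((names + separators))

echo "combined length: $total"
echo "distance to 2^17: $((131072 - total))"
```

This prints a combined length of 131039 and a distance of 33, matching the figures above.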

Example:

$ git init repo
$ cd repo
$ git hooks install
[...]
$ printf '%0255d ' {1..511} | xargs touch
$ touch $(printf '%0223d' 512)
$ git add .
$ git ci -m init |& head -n 1
/home/test/.githooks/release/base-template.sh: 856: git: Argument list too long
$

Of course, for test purposes, just creating a long enough argument list with a single line

$ printf '%0255d ' {1..512} | xargs touch

should do.
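For a test case that does not depend on brace expansion (which plain sh lacks), the same files could be created with a small loop. This is only a sketch: make_long_names and its optional width parameter are hypothetical names, not part of githooks or its test suite.

```shell
#!/bin/sh
# Hypothetical helper: create COUNT empty files in DIR whose names are
# zero-padded numbers of WIDTH characters (default 255, as in the example).
make_long_names() {
    dir=$1
    count=$2
    width=${3:-255}
    i=1
    while [ "$i" -le "$count" ]; do
        # ": >" creates an empty file using only shell builtins.
        : > "$dir/$(printf "%0${width}d" "$i")"
        i=$((i + 1))
    done
}

# Possible use in a repository with githooks installed:
#   make_long_names . 512
#   git add .
#   git commit -m init 2>&1 | grep -c 'Argument list too long'
```

A test could then assert that the final grep counts zero matching lines.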

Replacing the appropriate git config calls by calls to a wrapper script that saves stdin reveals that stdin is empty. Putting tee /tmp/gh.stdin at the beginning of base-template.sh gets me an empty file, so no stdin, either.
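With stdin ruled out, the environment may be worth a look. On Linux, a single argument or environment string is capped at 32 pages (131072 bytes with 4 KiB pages), separately from the total reported by getconf ARG_MAX, and an exported variable above that size makes every later exec fail with this same message. Whether githooks exports such a variable is an assumption; nothing in this thread confirms it. The sketch below only probes the limit itself; exec_ok_with_env_size and PROBE are made-up names.

```shell
#!/bin/sh
# Hypothetical probe: return 0 if an external command can still be executed
# while a variable holding SIZE bytes is exported, non-zero otherwise.
exec_ok_with_env_size() {
    size=$1
    (
        # Build the value with the printf builtin, so no exec is needed here.
        PROBE=$(printf "%0${size}d" 0)
        export PROBE
        # "env" is an external program; starting it fails if PROBE is too large.
        env true >/dev/null 2>&1
    )
}

# Probe a few sizes around 2^17; results depend on the operating system.
for n in 1000 131000 131072; do
    if exec_ok_with_env_size "$n"; then
        echo "$n bytes: exec ok"
    else
        echo "$n bytes: exec failed"
    fi
done
```

If the larger sizes fail here while getconf ARG_MAX reports a much bigger total, that would fit the errors showing up on commands whose own arguments are short.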

🤔️