GitGuardian / ggshield

Find and fix 400+ types of hardcoded secrets and 70+ types of infrastructure-as-code misconfigurations.
https://gitguardian.com
MIT License
1.68k stars 150 forks source link

ggshield crashes when "git merge"ing files with spaces #991

Closed mherzberg closed 1 week ago

mherzberg commented 2 weeks ago

Environment

Describe the bug

ggshield crashes when the user performs a git merge that involves file names with spaces. Some line splitting code will output more results than expected, leading to a crash.

Steps to reproduce:

mkdir mergetest; cd mergetest
git init

git checkout -b b1
echo "" > "test file.txt"
git add "test file.txt"; git commit -m "Test"

git checkout -b b2
echo "foo" > "test file.txt"
git add "test file.txt"; git commit -m "Test"

git checkout b1
echo "bar" > "test file.txt"
git add "test file.txt"; git commit -m "Test"

git merge b2

Actual result:

Now you can run the following command to see the verbose output of the crash:

ggshield secret scan pre-commit --verbose

Error: too many values to unpack (expected 4)

Traceback (most recent call last):
    File "ggshield\cmd\utils\common_decorators.py", line 20, in wrapper
    File "ggshield\cmd\secret\scan\precommit.py", line 85, in precommit_cmd
    File "ggshield\core\scan\commit.py", line 88, in from_merge
    File "ggshield\core\scan\commit_utils.py", line 372, in get_file_sha_in_ref
ValueError: too many values to unpack (expected 4)

Users hit the error when they have the pre-commit hook installed and continue their merge:

git add "test file.txt"
git merge --continue

The last command fails with ValueError: too many values to unpack (expected 4).

Expected result:

The pre-commit hook should scan as normal.

Potential fix:

There are a few lines with the same problem, but to stay with line 372 of this particular error:

This line:

_, _, sha, path = line.split()

should be replaced with something like:

_, _, sha, path = line.split(maxsplit=3)

This will stop Python from splitting on spaces etc. in the path.