Open andreineculau opened 9 months ago
Hi, I have done some testing and the problem is triggered by the presence of quote characters (") in filenames: specifically, how these quotes are output by Git commands and then fed into follow-on commands within Transcrypt.
At multiple places in the Transcrypt script we use a command like this to find encrypted files (tracked files to which the crypt filter is applied):
git -c core.quotePath=false ls-files | git -c core.quotePath=false check-attr --stdin filter | awk 'BEGIN { FS = ":" }; /crypt$/{ print $1 }'
These commands try to avoid problems with special/unusual characters in filenames by disabling Git's core.quotePath config setting, but unfortunately, per the documentation:
Double-quotes, backslash and control characters are always escaped regardless of the setting of this variable.
Running the above command on a repo containing files with quote characters produces output like this:
"ismdou - Know the identifiers of the table \"FOO\".py.secret"
sensitive_file
It is because the ismdou… file gets quoted and quote-escaped in this way that follow-up commands fail.
Specifically for your reported issue, the pre-commit hook is failing because it passes the quoted file path/name to the command git show :"${secret_file}" to determine whether the file is properly encrypted.
This gets interpreted as the following, which doesn't work: git show :'"ismdou - Know the identifiers of the table \"FOO\".py.secret"'
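The failure can be reproduced in a throwaway repository. The sketch below is illustrative only: the file name is a shortened stand-in for the one in the report, and demo_repo, quoted_name, and the two status variables are hypothetical names, not anything from Transcrypt. It assumes bash and a local git binary.

```shell
#!/usr/bin/env bash
# Throwaway repository so nothing here touches a real working tree.
demo_repo=$(mktemp -d)
git init --quiet "$demo_repo"
cd "$demo_repo" || exit 1

# Shortened stand-in for the reported file name (assumption, not the real file).
secret_file='table "FOO".py.secret'
printf 'plaintext\n' > "$secret_file"
git add -- "$secret_file"

# Even with core.quotePath=false, Git wraps the name in quotes
# and backslash-escapes the inner double quotes.
quoted_name=$(git -c core.quotePath=false ls-files)
printf 'ls-files prints: %s\n' "$quoted_name"

# The verbatim name resolves against the index ...
if git show :"$secret_file" >/dev/null 2>&1; then
  verbatim_status=ok
else
  verbatim_status=failed
fi

# ... but the quoted form is taken as a literal path, which does not exist.
if git show :"$quoted_name" >/dev/null 2>&1; then
  quoted_status=ok
else
  quoted_status=failed
fi

printf 'verbatim: %s, quoted: %s\n' "$verbatim_status" "$quoted_status"
```

The point of the comparison is that git show itself copes with quotes in a path; the lookup only fails when it is handed the already-escaped form that ls-files printed.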
Although the pre-commit hook is the culprit in this case, the same problem manifests for the --list command (shows over-quoted output) and the --show-raw command (will not work if you name the file, and also fails if you use the wildcard --show-raw=* option):
./transcrypt --show-raw=*
==> "ismdou - Know the identifiers of the table \"FOO\".py.secret" <==
fatal: path '"ismdou - Know the identifiers of the table \"FOO\".py.secret"' does not exist in 'HEAD'
So I can tell what the problem is, but how to fix it is less clear. We need to do one of:
git show
commands. I doubt there is any sensible way to do this in bash.
I have experimented with using the -z option instead of -c core.quotePath=false to identify encrypted files via the ls-files and check-attr Git commands. The -z option uses NUL to delimit filenames instead of newlines and, importantly, avoids quoting file names at all (emphasis mine):
Without the -z option, pathnames with "unusual" characters are quoted as explained for the configuration variable core.quotePath (see git-config[1]). Using -z the filename is output verbatim and the line is terminated by a NUL byte.
The trick will be figuring out how to string together the commands to work with NUL-delimited outputs, especially given that the command sequence relies on adding, then removing, the suffix : filter: crypt on the same line as the original filename of encrypted files.
So far I've gotten as far as the following, which avoids the unwanted filename quoting by using -z with the first ls-files command, but then unfortunately adds it back in with the following check-attr command:
git ls-files -z | awk 'BEGIN { RS = "\0" }; { print $0 }' | git check-attr --stdin filter
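One way the re-quoting might be avoided is to pass -z to check-attr as well, which the git-check-attr documentation describes as making its output machine-parsable and, with --stdin, reading NUL-separated paths. This is a minimal sketch, assuming bash and a Git version whose check-attr supports -z. The helper name list_crypt_files_z and the demo file names are hypothetical, and this is not necessarily the approach taken in Transcrypt.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print tracked files whose filter attribute is "crypt".
list_crypt_files_z() {
  local path attr value
  # With -z, check-attr emits three NUL-terminated fields per file:
  # the verbatim path, the attribute name, and the attribute value.
  git ls-files -z | git check-attr --stdin -z filter |
    while IFS= read -r -d '' path &&
          IFS= read -r -d '' attr &&
          IFS= read -r -d '' value; do
      if [ "$value" = "crypt" ]; then
        printf '%s\n' "$path"   # newline-separated for display only
      fi
    done
}

# Demo in a throwaway repository (all names below are made up).
demo_repo=$(mktemp -d)
git init --quiet "$demo_repo"
cd "$demo_repo" || exit 1
printf '*.secret filter=crypt\n' > .gitattributes
touch 'table "FOO".py.secret' 'key:value.secret' notes.txt
git add .

crypt_files=$(list_crypt_files_z)
printf '%s\n' "$crypt_files"
```

Because no field is split on a colon, this sketch also copes with file names that contain one. Names containing newlines would still need a NUL-aware consumer downstream, since the helper prints one name per line.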
I suspect a proper fix will require replacing the single-line piped commands with a function that iterates over unquoted filenames (thanks to the -z option), runs the check-attr command on each one individually to identify encrypted files, then adds the unquoted filenames of just the encrypted files to an output string.
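The per-file idea described above could look roughly like the following. This is a sketch under stated assumptions, not the code from the PR: list_crypt_files_each is a hypothetical name, the demo repository and its file names are made up, and bash is assumed for read -d and process substitution.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check each tracked file individually.
list_crypt_files_each() {
  local path
  while IFS= read -r -d '' path; do
    # check-attr prints "<path>: filter: <value>"; only the suffix is inspected,
    # so any quoting Git applies to the path part of that line is irrelevant.
    case "$(git check-attr filter -- "$path")" in
      *': filter: crypt') printf '%s\n' "$path" ;;
    esac
  done < <(git ls-files -z)
}

# Demo in a throwaway repository (all names below are made up).
demo_repo=$(mktemp -d)
git init --quiet "$demo_repo"
cd "$demo_repo" || exit 1
printf '*.secret filter=crypt\n' > .gitattributes
touch 'table "FOO".py.secret' 'key:value.secret' notes.txt
git add .

each_files=$(list_crypt_files_each)
printf '%s\n' "$each_files"
```

The trade-off is one git process per tracked file, which may be noticeably slower on large repositories than a single piped command.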
I've started working towards a fix for this in PR #174.
In my manual testing the PR works for most situations, but tests are failing for a few use-cases that have been broken by my changes.
The potential fix on branch 173-handle-quotes-in-filenames is now passing all unit tests and works in my manual testing of file names containing double-quotes. Can you try it and see if it works for you, @andreineculau?
Somewhere we don't escape filenames properly, ending up with errors like
The first fatal is due to double quotes in the filename. The rest are due to a colon in the filename (not visible, but it comes right after the path in the error message; the filename is truncated in the error message).
PS: The worst of it is that the commit goes through, so you end up with a commit that should have encrypted files, but instead files are in plain text.