Closed louy2 closed 4 months ago
Thanks, but trouble with parsing special filenames would probably be better avoided by passing the "-z" option to check-ignore. If we do that, we would also need to separate input paths with NUL characters rather than newlines, and split the output on NUL characters rather than newlines. Do you want to give that a shot?
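For illustration, the NUL-delimited approach could look roughly like the sketch below. This is not the code from the actual commit: the helper names (encode_paths, parse_ignored, check_ignored) and the repo_dir parameter are made up for this example. The only git behaviour it relies on is that `git check-ignore -z --stdin` reads NUL-terminated paths and writes NUL-terminated, unquoted paths.

```python
import subprocess
from typing import Iterable, List


def encode_paths(paths: Iterable[bytes]) -> bytes:
    """Terminate each path with NUL, as `check-ignore -z --stdin` expects."""
    return b"".join(p + b"\0" for p in paths)


def parse_ignored(output: bytes) -> List[bytes]:
    """Split NUL-terminated output; drop the empty piece after the last NUL."""
    return [p for p in output.split(b"\0") if p]


def check_ignored(paths: Iterable[bytes], repo_dir: str = ".") -> List[bytes]:
    """Return the subset of `paths` that git considers ignored (hypothetical helper)."""
    proc = subprocess.run(
        ["git", "check-ignore", "-z", "--stdin"],
        input=encode_paths(paths),
        stdout=subprocess.PIPE,
        cwd=repo_dir,
    )
    # Exit status 0: at least one path is ignored; 1: none are; anything else: error.
    if proc.returncode not in (0, 1):
        raise RuntimeError("git check-ignore failed with status %d" % proc.returncode)
    return parse_ignored(proc.stdout)
```

Because the paths stay as raw bytes and are never quoted, newlines and non-ASCII characters inside a filename need no special handling.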
I implemented the alternative using the -z flag to check-ignore in commit 2800bcc1007e (clean-ignore: support utf-8 filenames found in .gitignore, 2024-07-02).
The filter-repo callback passes a unicode filename as utf_8 bytes, but git check-ignore prints a unicode filename as quoted, octal-escaped utf_8 bytes, so the name != pathname check fails on CJK filenames. Calling .decode('unicode_escape') treats its input as latin-1 bytes containing escape sequences, so it decodes the escaped bytes, but into a latin-1 str. Calling .encode('latin_1') on that str therefore recovers the original bytes, which are utf_8 and comparable to the filename passed by the filter-repo callback.
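A minimal sketch of that round trip, assuming the quoted form is a double-quoted string with one \ooo octal escape per non-ASCII byte. The function name unquote_git_path and the quote-stripping step are illustrative only and are not taken from the patch.

```python
def unquote_git_path(quoted: bytes) -> bytes:
    """Turn git's quoted, octal-escaped path back into raw bytes (hypothetical helper)."""
    # Unquoted output (plain ASCII names) is returned unchanged.
    if not (len(quoted) >= 2 and quoted.startswith(b'"') and quoted.endswith(b'"')):
        return quoted
    inner = quoted[1:-1]
    # unicode_escape reads the input as latin-1 and resolves the escapes,
    # giving a str whose code points are all in the range 0..255.
    as_latin1_str = inner.decode("unicode_escape")
    # Encoding as latin_1 maps each code point back to the byte of the same value.
    return as_latin1_str.encode("latin_1")


# U+6587 is E6 96 87 in utf_8, which git would print as \346\226\207.
expected = "\u6587.txt".encode("utf-8")
assert unquote_git_path(b'"\\346\\226\\207.txt"') == expected
```

The result is a bytes object, so it can be compared directly against the bytes filename from the callback without decoding either side as utf_8.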