codespell-project / actions-codespell

gets to analyze .git/ files and "fail" on object files #56

Open yarikoptic opened 1 year ago

yarikoptic commented 1 year ago

I guess it finds some uncompressed objects that it identifies as text files and starts checking them

this basic workflow https://github.com/datalad/tutorials/pull/20/files#diff-ce84a1b2c9eb4ab3ea22f610cad7111cb9a2f66365c3b24679901376a2a73ab2 led to

Resulting CLI options  --skip *.svg
Warning: WARNING: Decoding file using encoding=utf-8 failed: ./.git/objects/1e/a4a450299271c0a5e69bc79e1cb4a8771b34da
Error: ./.git/objects/1e/a4a450299271c0a5e69bc79e1cb4a8771b34da:755: ot ==> to, of, or, not
WARNING: Trying next encoding iso-8859-1
Warning: WARNING: Decoding file using encoding=utf-8 failed: ./.git/objects/78/f14c2775affc5ed4fcd819b66ff5bfd1592806
WARNING: Trying next encoding iso-8859-1
Warning: WARNING: Decoding file using encoding=utf-8 failed: ./.git/objects/fb/52afd8fb6f45744569797c8ef1d1b7b1dbace0
WARNING: Trying next encoding iso-8859-1
3
Error: ./.git/objects/1e/a4a450299271c0a5e69bc79e1cb4a8771b34da:838: Te ==> The, Be, We, To
Error: ./.git/objects/1e/a4a450299271c0a5e69bc79e1cb4a8771b34da:911: nd ==> and, 2nd
Codespell found one or more problems

it doesn't happen for me locally... but strace shows that codespell does indeed go under .git/ to "sniff around":

❯ strace -f -o /tmp/codespell.strace codespell --skip '*.svg' >/dev/null
❯ grep '\.git/objects' /tmp/codespell.strace
1918302 newfstatat(AT_FDCWD, "./.git/objects", {st_mode=S_IFDIR|0700, st_size=16, ...}, AT_SYMLINK_NOFOLLOW) = 0
1918302 openat(AT_FDCWD, "./.git/objects", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
1918302 newfstatat(AT_FDCWD, "./.git/objects/pack", {st_mode=S_IFDIR|0700, st_size=198, ...}, AT_SYMLINK_NOFOLLOW) = 0
1918302 openat(AT_FDCWD, "./.git/objects/pack", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
1918302 newfstatat(AT_FDCWD, "./.git/objects/pack/pack-5d46e9abf8de8270def5b70004db30dcaa17bda3.pack", {st_mode=S_IFREG|0400, st_size=97786, ...}, 0) = 0
1918302 openat(AT_FDCWD, "./.git/objects/pack/pack-5d46e9abf8de8270def5b70004db30dcaa17bda3.pack", O_RDONLY|O_CLOEXEC) = 3
1918302 newfstatat(AT_FDCWD, "./.git/objects/pack/pack-5d46e9abf8de8270def5b70004db30dcaa17bda3.idx", {st_mode=S_IFREG|0400, st_size=13756, ...}, 0) = 0
1918302 openat(AT_FDCWD, "./.git/objects/pack/pack-5d46e9abf8de8270def5b70004db30dcaa17bda3.idx", O_RDONLY|O_CLOEXEC) = 3
1918302 newfstatat(AT_FDCWD, "./.git/objects/info", {st_mode=S_IFDIR|0700, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
1918302 openat(AT_FDCWD, "./.git/objects/info", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3

ideally this should be fixed in codespell itself (so that it skips at least .git/objects; some scripts might live under .git/), but I think it is safe to assume that the codespell action for GitHub should not check anything in .git/, since nothing there is committed and there is nothing to fix there.

edit: workaround -- add .git to --skip i.e.

❯ strace -f -o /tmp/codespell.strace codespell --skip '*.svg,.git' >/dev/null
❯ grep '\.git/' /tmp/codespell.strace

comes out empty-handed
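
To illustrate why adding .git to --skip keeps the tool out of .git/objects, here is a minimal sketch of glob-based pruning during a directory walk. It is not codespell's actual implementation: the function name `iter_checked_files`, the `demo` helper, and the choice to match each file and directory name with `fnmatch` are assumptions made for illustration only.

```python
import fnmatch
import os
import tempfile


def iter_checked_files(root, skip_globs):
    """Yield files under root, skipping names that match any glob in skip_globs.

    Hypothetical helper: it only mimics the idea of a comma-separated
    --skip list and is not taken from codespell.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [
            d for d in dirnames
            if not any(fnmatch.fnmatch(d, g) for g in skip_globs)
        ]
        for name in filenames:
            if any(fnmatch.fnmatch(name, g) for g in skip_globs):
                continue
            yield os.path.join(dirpath, name)


def demo():
    """Build a throwaway tree with a fake .git/ and compare two skip lists."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, ".git", "objects", "1e"))
        fake_object = os.path.join(".git", "objects", "1e", "a4a4")
        for rel in (fake_object, "README.md", "logo.svg"):
            with open(os.path.join(root, rel), "w") as f:
                f.write("x")

        def listing(globs):
            return sorted(
                os.path.relpath(p, root) for p in iter_checked_files(root, globs)
            )

        # First list mirrors --skip '*.svg', second mirrors --skip '*.svg,.git'.
        return listing(["*.svg"]), listing(["*.svg", ".git"])


if __name__ == "__main__":
    without_git_skip, with_git_skip = demo()
    print("skip *.svg      :", without_git_skip)
    print("skip *.svg,.git :", with_git_skip)
```

With only `*.svg` in the skip list the fake object file under .git/objects is still visited; once `.git` is added, the walk never enters that directory, which matches the empty grep result above.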

yarikoptic commented 1 year ago

ha -- and it should probably also ignore .github/workflows/codespell.yml, or whatever the workflow file that contains ignore_words_list is named ;)