llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.09k stars 12k forks source link

[clang-format] .clang-format-ignore does not match `/` with `*` wildcard #110160

Open ameerj opened 1 month ago

ameerj commented 1 month ago

When using a .clang-format-ignore file, a * wildcard is expected to match a directory separator. For example, a line containing *g_* should match any file in the current directory and all subdirectories that match the pattern *g_*

Currently, the * does not match the path separator. So a *g_* in the .clang-format-ignore file will only match this pattern for the current working directory. There does not seem to be a way to match the pattern for the current dir and subdirs.

Looking into the source code, it seems that this behavior is from the matchFilePath function, which is based on the POSIX fnmatch() function https://github.com/llvm/llvm-project/blob/d4d38bcc3931d28d8b97d055258e27772119d0fe/clang/lib/Format/MatchFilePath.cpp#L10-L11

All of the following tests added to MatchFilePathTest.cpp fail:

  EXPECT_TRUE(match("some/nested/dir/file.cpp", "*nested*"));
  EXPECT_TRUE(match("some/nested/dir/file.cpp", "*/nested/*"));
  EXPECT_TRUE(match("some/nested/dir/g_file.cpp", "*g_*"));
  EXPECT_TRUE(match("some/nested/dir/g_file.cpp", "*/g_*"));

The fnmatch() function matches these cases: https://godbolt.org/z/6qxYW41Yv

@owenca is this expected behavior?

P.S. I am seeing this behavior on Windows/MSVC, not sure if this affects all environments or is Windows specific.

owenca commented 1 month ago

This is expected behavior and works as documented.

Looking into the source code, it seems that this behavior is from the matchFilePath function, which is based on the POSIX fnmatch() function

https://github.com/llvm/llvm-project/blob/d4d38bcc3931d28d8b97d055258e27772119d0fe/clang/lib/Format/MatchFilePath.cpp#L10-L11

See the comments after that: https://github.com/llvm/llvm-project/blob/d4d38bcc3931d28d8b97d055258e27772119d0fe/clang/lib/Format/MatchFilePath.cpp#L22-L23

The fnmatch() function matches these cases: https://godbolt.org/z/6qxYW41Yv

Not if FNM_PATHNAME is set: https://godbolt.org/z/Gc8z3KceE

ameerj commented 1 month ago

Would you be open to supporting the double asterisk ** to match directory separators?

As used in gitignore

owenca commented 2 weeks ago

Would you be open to supporting the double asterisk ** to match directory separators?

As used in gitignore

See https://github.com/llvm/llvm-project/pull/110560#discussion_r1816084375.