Open alvinhochun opened 3 months ago
Even when I patched compile_commands.json manually to include the full path to the toolchain, clangd still seems very confused:
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/include/c++
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/include/c++/x86_64-w64-mingw32
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/include/c++/backward
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/include/c++/
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/include/c++//x86_64-w64-mingw32
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/include/c++//backward
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/include/c++/
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/include/c++//x86_64-w64-mingw32
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/include/c++//backward
-internal-isystem include/c++
-internal-isystem include/c++/x86_64-w64-mingw32
-internal-isystem include/c++/backward
-internal-isystem include/g++-v0.0.0
-internal-isystem include/g++-v0.0.0/x86_64-w64-mingw32
-internal-isystem include/g++-v0.0.0/backward
-internal-isystem include/g++-v0.0
-internal-isystem include/g++-v0.0/x86_64-w64-mingw32
-internal-isystem include/g++-v0.0/backward
-internal-isystem include/g++-v0
-internal-isystem include/g++-v0/x86_64-w64-mingw32
-internal-isystem include/g++-v0/backward
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/lib/clang/18/include
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/include
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/usr/include
None of the paths include the c++/v1 subdirectory, so it still cannot find a lot of headers, for example cstdint.
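A quick way to check a flag list like the one above is to pull out the -internal-isystem directories and look for a c++/v1 entry and for relative paths. The Python sketch below does that; the function names are illustrative and it assumes the flags are available as whitespace-separated text with no spaces inside paths.

```python
from __future__ import annotations

import posixpath


def internal_isystem_dirs(cc1_args: str) -> list[str]:
    """Return the directory that follows each -internal-isystem flag."""
    tokens = cc1_args.split()
    return [
        tokens[i + 1]
        for i, tok in enumerate(tokens[:-1])
        if tok == "-internal-isystem"
    ]


def diagnose(dirs: list[str]) -> dict:
    """Report whether a libc++ header directory is present and which
    entries are relative (and so depend on the working directory)."""
    return {
        "has_libcxx_v1": any(
            posixpath.normpath(d).endswith("/c++/v1") for d in dirs
        ),
        "relative": [d for d in dirs if not posixpath.isabs(d)],
    }
```

For the list above this would report no c++/v1 directory and flag the include/c++ and include/g++-v0 entries as relative.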
I'm not entirely sure what all is happening here, but I have a few observations at least.
When we invoke <arch>-w64-mingw32-clang, the wrapper script/executable invokes the clang binary with a couple of extra parameters: -target <arch>-w64-mingw32, but also -stdlib=libc++, which may be relevant here.
When clangd sees an invocation of <arch>-w64-mingw32-clang, it seems like it is able to deduce that this in general implies -target <arch>-w64-mingw32. (Instead of a wrapper, one could also just make a symlink to the clang executable, and have it imply the target that way.) But clangd can't know that the wrapper also implies -stdlib=libc++. Adding that to the command in compile_commands.json might help (though that's not a proper fix).
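As a stopgap, the two manual changes discussed in this thread (an absolute compiler path and an explicit -stdlib=libc++) can be applied to every entry with a short script. This is a minimal Python sketch, not part of llvm-mingw; patch_entry, patch_database and the toolchain_bin parameter are hypothetical names, and it assumes the database was generated on a POSIX host.

```python
import json
import shlex
from pathlib import Path


def patch_entry(entry: dict, toolchain_bin: str) -> dict:
    """Return a copy of one compilation-database entry with an absolute
    compiler path and an explicit -stdlib=libc++."""
    patched = dict(entry)
    # An entry carries either an "arguments" list or one "command" string.
    if "arguments" in entry:
        args = list(entry["arguments"])
    else:
        args = shlex.split(entry["command"])
    # Only rewrite bare names such as "x86_64-w64-mingw32-clang++".
    if "/" not in args[0]:
        args[0] = f"{toolchain_bin.rstrip('/')}/{args[0]}"
    # Spell out what the wrapper would otherwise add silently.
    if not any(a.startswith("-stdlib=") for a in args):
        args.insert(1, "-stdlib=libc++")
    patched.pop("command", None)
    patched["arguments"] = args
    return patched


def patch_database(path: str, toolchain_bin: str) -> None:
    """Rewrite compile_commands.json in place."""
    db_path = Path(path)
    entries = json.loads(db_path.read_text())
    patched = [patch_entry(e, toolchain_bin) for e in entries]
    db_path.write_text(json.dumps(patched, indent=2))
```

Calling patch_database("compile_commands.json", "/opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/bin") would rewrite the file; the bin subdirectory is an assumption about the toolchain layout.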
This explains the difference to the Windows build of llvm-mingw; in the Windows build, we have such parameters hardcoded in the clang binary (see https://github.com/mstorsjo/llvm-mingw/blob/master/build-llvm.sh#L242-L251). However I'm not sure if this aspect explains why it thinks that things are based in /usr.
I'm not very familiar with the clangd code - I assume that it should run the target specific logic for setting up include paths in clang/lib/Driver/ToolChains/MinGW.cpp just like it otherwise does, but perhaps there's some difference in how it decides what the base directory is?
See https://github.com/mstorsjo/llvm-mingw/issues/429#issuecomment-2154595788 for a discussion of similar issues with clang-tidy, discussions on a new wrapper for clang-scan-deps in https://github.com/mstorsjo/llvm-mingw/pull/425, and https://github.com/mstorsjo/llvm-mingw/pull/430 which reveals such hidden parameters if using CMake (but goes against the way this toolchain mostly is used).
Another way around this would be to use config files for setting most of these parameters. I made a PoC for that many years ago, perhaps I should revisit it and see how well it would work, compared to our wrappers, at this point.
Your guess is right, adding -stdlib=libc++ gets me this list:
-internal-isystem /usr/x86_64-w64-mingw32/include/c++/v1
-internal-isystem /usr/include/c++/v1
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/lib/clang/18/include
-internal-isystem /usr/x86_64-w64-mingw32/include
-internal-isystem /usr/x86_64-w64-mingw32/usr/include
And adding the full path to the compiler executables does produce fully working system includes:
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/include/c++/v1
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/include/c++/v1
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/lib/clang/18/include
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/include
-internal-isystem /opt/llvm-mingw-20240619-ucrt-ubuntu-20.04-x86_64/x86_64-w64-mingw32/usr/include
I'm not very familiar with the clangd code - I assume that it should run the target specific logic for setting up include paths in clang/lib/Driver/ToolChains/MinGW.cpp just like it otherwise does, but perhaps there's some difference in how it decides what the base directory is?
The documentation does say it uses the toolchains in clang/lib/Driver/ToolChains/ to do the search.
It also says there is a --query-driver mode where it can ask the compiler executable for the flags, which I suppose might fix the issue if the executables use full paths. But this requires manual setup so it is an inconvenient workaround.
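For reference, the manual setup mentioned here amounts to passing clangd a glob that matches the compiler path. The Python sketch below only assembles such a command line; clangd_check_argv is a hypothetical helper, the glob suffix is an assumption about how the llvm-mingw wrappers are named, and the compile_commands.json entries would need to name a compiler path that the glob matches.

```python
from __future__ import annotations


def clangd_check_argv(clangd: str, source_file: str, toolchain_bin: str) -> list[str]:
    """Build a clangd invocation that checks one file and is allowed to
    query the cross compiler for its built-in include paths."""
    # Assumed wrapper naming: <arch>-w64-mingw32-clang, -clang++, and so on.
    driver_glob = f"{toolchain_bin.rstrip('/')}/*-w64-mingw32-clang*"
    return [
        clangd,
        f"--check={source_file}",
        f"--query-driver={driver_glob}",
    ]
```

The resulting list can be passed to subprocess.run, or the same flags can be added to the clangd arguments in the editor settings.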
I think it would be nice if clangd had logic to automatically detect the compiler path in the same directory, plus some mingw-specific logic to detect that the toolchain only ships with libc++ and automatically use that. Though I don't know if it can be added to clangd without affecting clang. Or perhaps clang could also benefit from this logic?
Hmm, so to get the full picture - when you generate your compile_commands.json, you have the llvm-mingw tools added to $PATH, and your build system, which generates the json file, just encodes the commands with the plain executable names, without an absolute path - expecting to find them in $PATH. And when you run clangd elsewhere, they don't exist in the $PATH of clangd?
I think this bit mostly is operating as expected actually.
I think it's correct design that clangd doesn't assume that the base installation directory of clangd would be related in any way to the base installation directory of the compilers used. In many cases, clangd would be supplied entirely separately (e.g. installed as part of a VSCode plugin) from the actual toolchains that you may use. So I think it's actually an intentional design, that clangd doesn't try to look for compiler executables in the directory it is installed in. I.e., if the tools aren't generally found in the $PATH of clangd, then the build system should generate compile_commands.json with absolute paths. (E.g. CMake in general always uses absolute paths to the compilers.)
But it appears to work on Windows without the path. Why would there be a discrepancy?
That's indeed odd. Perhaps the last-call default fallbacks differ - on Unix assuming /usr/bin if nothing else is found, but assuming the current executable directory on Windows?
Just to be clear, Clangd relies on heuristics on some level to find stuff (it's stated in its documentation), so it seems to me that, if the compiler driver is not found on PATH (which usually includes /usr/bin), the same directory itself should be a reasonable next best guess. After all, this will only have an effect if the copy of Clangd is included with a standalone toolchain, and it seems unlikely that a user will just randomly decide to use a Clangd supplied with a cross toolchain (if normal cross toolchains even come with Clangd). Or perhaps it can be added as an option when compiling Clangd.
But I guess this ultimately needs to be discussed with Clangd developers.
Yeah I guess that could be reasonable. But it would be interesting to dig into the code and see what it actually does and why the behaviour differs (or why it seems to find it on Windows - did you happen to have that path added when you started clangd there?)
did you happen to have that path added when you started clangd there?
Nope, I never add toolchains on PATH except in the terminals I actively use to compile things.
Another theory could be if it finds a clang binary in /usr/bin and uses that as reference, but I'm not quite sure why it would look for that, when the actual executable mentioned in the json file is <triple>-clang.
Yes, this actually seems to be the case.
See https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0-rc1/clang-tools-extra/clangd/CompileCommands.cpp#L88-L118 and https://github.com/llvm/llvm-project/blob/llvmorg-19.1.0-rc1/clang-tools-extra/clangd/CompileCommands.cpp#L154-L172
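To make the theory concrete, here is a simplified Python model of what the two linked regions of CompileCommands.cpp appear to do, paraphrased from reading them rather than copied. The function names and the injectable which parameter are illustrative, and symlink resolution and the macOS xcrun lookup are left out, so treat it as a sketch of the heuristic and not as clangd's actual implementation.

```python
import posixpath
import shutil

GENERIC_DRIVERS = {"clang", "clang++", "gcc", "g++", "cc", "c++"}


def detect_clang_path(clangd_exe, which=shutil.which):
    """Guess a plausible compiler path: the first generic compiler found on
    PATH, else a (possibly nonexistent) clang next to clangd itself."""
    for name in ("clang", "gcc", "cc"):
        found = which(name)
        if found:
            return found
    return posixpath.join(posixpath.dirname(clangd_exe), "clang")


def resolve_driver(driver, clang_path, which=shutil.which):
    """Turn the compiler named in compile_commands.json into a path."""

    def sibling_of(path):
        # Keep the driver's basename, borrow the other path's directory.
        return posixpath.join(posixpath.dirname(path), posixpath.basename(driver))

    if posixpath.isabs(driver):
        return driver
    if "/" in driver:
        # Working-directory-relative paths such as bin/clang are left alone.
        return driver
    if driver in GENERIC_DRIVERS:
        return sibling_of(clang_path)
    found = which(driver)
    if found and posixpath.isabs(found):
        return found
    # Not on PATH: assume it lives beside the detected compiler.
    return sibling_of(clang_path)
```

Under this model, a Linux machine with gcc in /usr/bin and no llvm-mingw on PATH yields /usr/bin/x86_64-w64-mingw32-clang++, which matches the nonexistent path reported in this issue. With no generic compiler on PATH at all, the guess lands beside clangd, which would be consistent with the Windows behaviour.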
As far as my experience goes, GNU(-like) compilers will assume they should use host system paths unless you add --sysroot, at which point the paths will be re-rooted there (among other things).
clangd will only look at the arguments, and not inspect the compiler AFAIK. If those are hidden then it can only use blind guesses.
This is a general issue with a lot of clang tooling that uses compile_commands.json (e.g. CodeChecker too).
If you use CMake as the build system, using toolchain files (#430) will ensure everything works.
The clangd binary that comes with llvm-mingw seems to not know where to look for include files. I tried with a Godot build tree that I cross-compiled using llvm-mingw, with the compile_commands.json generated by the build system available, using the Clangd extension on vscode with the path to clangd set to the one inside llvm-mingw. It isn't able to find windows.h.
Running clangd check manually:
First of all, it seems clangd thinks the compiler executable is /usr/bin/x86_64-w64-mingw32-clang++, which does not exist. In the compile_commands.json it is only listed as x86_64-w64-mingw32-clang (I only add llvm-mingw to path while building). This may not really be an issue, but I don't really know. Regardless, I was expecting clangd to be smart enough to see that these executables are right next to itself, but apparently not.
Then, if we look at the list of system includes:
So it is looking for headers under /usr first. I was expecting it to be smart enough to use its built-in headers first, but I guess not.
More surprising is the list of "relative paths" that seem broken. I wonder why they would be included...
Running the build command manually shows the correct system includes:
Listed:
So it seems to be a clangd issue.
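The comparison against the real compiler can be scripted. The Python sketch below runs a compiler in verbose preprocessing mode and extracts the angle-bracket include search list that clang and gcc print; the helper names are illustrative, and it assumes the compiler writes its -v output to stderr in the usual "search starts here" format.

```python
from __future__ import annotations

import subprocess


def parse_search_dirs(verbose_output: str) -> list[str]:
    """Extract the <...> include search list from `-v` output."""
    dirs = []
    in_list = False
    for line in verbose_output.splitlines():
        if line.startswith("#include <...> search starts here:"):
            in_list = True
        elif line.startswith("End of search list."):
            in_list = False
        elif in_list:
            dirs.append(line.strip())
    return dirs


def query_search_dirs(compiler: str) -> list[str]:
    """Ask the compiler itself which system include directories it uses."""
    result = subprocess.run(
        [compiler, "-E", "-x", "c++", "-", "-v"],
        input="",  # preprocess an empty translation unit
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_search_dirs(result.stderr)
```

Calling query_search_dirs with the full path to x86_64-w64-mingw32-clang++ gives a list that can be compared with the directories clangd reports.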
This does not happen on Windows when the compile_commands.json also doesn't include an absolute path to the compiler:
The list of system includes: (D:\ gets changed to C:\d_stor due to path canonicalization, which is another thing that bothers me but let's ignore that now):
If you want to reproduce this, it may be simpler to craft a compile_commands.json file manually, but the build command I used on Linux is scons -j2 dev_build=yes platform=windows use_mingw=yes use_llvm=yes target=editor.