Make Checked C's clangd stable enough to be useful

secure-sw-dev-bot commented 2 years ago

This issue was copied from https://github.com/microsoft/checkedc-clang/issues/1066

As you may know, the LLVM monorepo includes clangd, a C/C++ language server that provides rich functionality such as cross-references and live error checking to IDEs. Since clangd is based on the shared C parsing and semantic analysis code in the repository, and that code in the checkedc-clang repository has been modified to support Checked C, one would hope that the version of clangd in the checkedc-clang repository can be dropped into a compatible IDE (such as Visual Studio Code) to provide analogous rich functionality for editing Checked C code. And in our testing, it mostly worked, once we worked around two bugs that caused clangd to crash too frequently to be useful.

I found clangd to be tremendously helpful for writing Checked C code as part of CCI's porting work, and I think clangd would be of great value to Checked C users in general in the long term, so I'm asking you to consider putting some effort into maintaining it. Even if you only work with us to get our clangd-related PRs accepted, that would be a start: if you sometimes make changes to the shared code that break clangd, we can continue using the revision of clangd prior to the breakage, with the caveat that bugs that have later been fixed in the compiler may still affect our copy of clangd. If we can identify some set of clangd-related tests to add to your standard checks to avoid breaking clangd, that would be better. If you're willing to put some work into making clangd work well with Checked-C-specific features, that would be even better. If you can suggest a better place to continue the discussion about Checked C clangd in general, I'd be happy to go there.

Anyway, the current bugs:

Function return annotation reinjection

Opening the following Checked C code in Visual Studio Code configured to use Checked C clangd:

_Array_ptr<int> foo(void) : count(2);

crashes clangd with the following message:

ok Token(`foo`, identifier, length = 3)
ok Token(`(`, l_paren, length = 1)
ok Token(`void`, void, length = 4)
ok Token(`)`, r_paren, length = 1)
ok Token(`:`, colon, length = 1)
ok Token(`count`, identifier, length = 5)
ok Token(`(`, l_paren, length = 1)
ok Token(`2`, numeric_constant, length = 1)
ok Token(`)`, r_paren, length = 1)
ok Token(`;`, semi, length = 1)
!! Token(`(`, l_paren, length = 1)
   Token(`2`, numeric_constant, length = 1)
   Token(`)`, r_paren, length = 1)
   Token(``, eof, length = 0)
   Token(`;`, semi, length = 1)
Couldn't map expanded token to spelled tokens!
UNREACHABLE executed at /home/matt/3c/clang/lib/Tooling/Syntax/Tokens.cpp:765!

With the following Checked C code, the problem does not occur:

_Array_ptr<int> foo(void);

So the problem is triggered by the : count(2). It looks like the parser is doing something tricky: buffering these tokens and then going back and parsing them later. (I imagine you know much more about what the parser is doing and why than I do.) This causes a problem for clangd for reasons that I don't fully understand. But Clang has an API to mark the tokens as "reinjected", which avoids the problem and is already used in other similar cases. This is a one-line fix (#1067), although I'm unfamiliar with how to write an automated test for it.

Uninitialized `SourceLocation` in `TypeVariableTypeLoc`

The parser doesn't initialize the SourceLocation of a TypeVariableTypeLoc, so it contains a garbage value from uninitialized memory, which could cause an assertion failure in clangd with TypeLocReader::VisitTypeVariableTypeLoc on the call stack. See the comment in the regression test in #1081 for some additional information.

secure-sw-dev-bot commented 2 years ago

Comment from @sulekhark:

We discussed this internally: We do agree that clangd will be useful to Checked C users in the long run. However, there are many higher priority tasks in the compiler that we need to attend to. Therefore, here's what we will do: 1) We will work with you to get your clangd PRs for fixes and tests accepted. 2) We will integrate the clangd tests into our regular testing when possible (this is not a very high priority task for us) so that none of the compiler features and bug fixes will break clangd.

secure-sw-dev-bot commented 2 years ago

Comment from @mattmccutchen-cci:

Thanks Sulekha for your offer of help with clangd in general and for merging #1067.

I've now gone back and reproduced the second problem we saw (with TypeVariableTypeLoc) and added the missing information about it to the initial comment above. I also have a proposed fix in #1081, although I'm still planning to run more of the existing tests on that PR.

secure-sw-dev-bot commented 2 years ago

Comment from @mattmccutchen-cci:

With #1081 merged, I think the goal of adequate clangd stability has been reached as of now, so I'll close this issue and file separate issues as appropriate if we run into other severe problems with clangd. As a bonus, it looks like the clangd tests may have been added to the standard set of tests for this repository.

checkedc / checkedc-llvm-project