Closed ivafanas closed 3 years ago
Thank you for submitting this!
From a quick glimpse this looks OK (and familiar!) This is a fairly deep part of Clang, so I would like to double-check. Let me get back to you over the weekend.
In the meantime, I updated my Github Actions to run on PR. I'm not sure how to trigger it for this branch. Perhaps you need to rebase?
In general, this change makes sense to me. However, I'm not sure why it affects CodeStyleCheckerMacro.cpp
? For CodeStyleCheckerMacro.cpp
, the presumed and spelling locations should lead to the same FileID
, shouldn't they? So it shouldn't matter which location is used. Any ideas?
I belive that reason is in Preprocessor::CreateString
function which is called during macro expansion.
https://clang.llvm.org/doxygen/Preprocessor_8cpp_source.html#l00494
It uses in-memory buffer to generate source code after macro expansion here (line 491):
SourceLocation Loc = ScratchBuf->getToken(Str.data(), Str.size(), DestPtr);
Implementation of ScratchBuf
creation is here:
https://clang.llvm.org/doxygen/ScratchBuffer_8cpp_source.html#l00067
Variable Loc
is used as SpellingLoc
argument for the final token location (Preprocessor.cpp lines 494-495):
Loc = SourceMgr.createExpansionLoc(Loc, ExpansionLocStart,
ExpansionLocEnd, Str.size());
See SourceManager::createExpansionLoc
interface:
SourceLocation
SourceManager::createExpansionLoc(SourceLocation SpellingLoc,
SourceLocation ExpansionLocStart,
SourceLocation ExpansionLocEnd,
unsigned TokLength,
bool ExpansionIsTokenRange,
int LoadedID,
unsigned LoadedOffset)
So, token spelling location is an in-memory "scratch space" and expansion location is in the main translation unit.
I'm a novice in clang and have no idea is this behavior correct or not. Going to compile clang from sources to re-check this investigation on practice.
Grep by "scratch space" over clang codebase and implementation of SourceManager::isInMainFile
(specifically SourceManager::getDecomposedExpansionLocSlowCase
) shows that it is well-known behavior for clang engineers.
Thank you for chasing this! And I really appreciate your detailed and comprehensive explanation. This is all very interesting stuff.
I was aware of the difference between the spelling and the expansion location, but didn't realize that there is also PresumedLoc
. Initially, I thought that
expansion location == presumed location
Only after you replied and I started scanning Clang, I realized these are completely different. There is an incredibly helpful explanation in Create a working compiler with the LLVM framework, Part 2 (search for presumed location).
IIUC, PresumedLoc
corresponds to the GNU line
directive. If there's no line
directive then indeed
presumed location == expansion location
And as you explained above, a SourceLocation
that corresponds to a macro spelling location lives in a scratch buffer. SM.getFileID(Decl->getLocation())
returns a FileID
that corresponds to that buffer rather than the actual source file. That's why CodeStyleCheckerMacro.cpp
was effectively failing.
Is this consistent with your understanding? Your patch basically makes sure that for macros the plugin uses presumed rather than spelling locations.
Would you mind adding a comment that explains why isInMainFile
is used? Be as generous or sparse as you see fit - I just think our future selves will be thankful for any indication that it's there for a good reason.
Hi,
I've re-read explanation from IBM paper several times and think that this explanation is a simplification for reader understanding:
/// The expansion location is the line in the source code where the macro
/// was expanded (the return statement), the spelling location is the
/// location in the source where the macro was originally defined,
/// and the presumed location is where the line directive states that
/// the line is 17, or any other line.
CodeStyleCheckerMacro.cpp
test doesn't have line
gcc directive so presumed location should not come into play.I suppose that comment is a simplification and preprocessor behavior is a well-known issue for compiler engineers.
Comment on SourceManager::isInMainFile
is added to CodeStyleChecker.h
.
Ivan.
Seems like expansion location ("presumed" location) should be used instead of spelling location for the check against main translation unit in CodeStyleChecker for macro expansion scenario.
I'm not 100% sure, could you please confirm.