llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.13k stars 12.02k forks source link

[Crash][UNREACHABLE] Didn't find this decl on its identifier's chain! #54279

Open Sameeranjoshi opened 2 years ago

Sameeranjoshi commented 2 years ago

While compiling the application dealii[1] with trunk, I am getting an error in the semantic phase of Clang.

TL;DR version

Didn't find this decl on its identifier's chain!
UNREACHABLE executed at ${HOME}/llvm-project/clang/lib/Sema/IdentifierResolver.cpp:82!

Attaching the bug report, script, and preprocessed sources as

Not sure if GitHub is not allowing to upload the files, but if needed I can attach or paste text. [1] https://github.com/dealii/dealii

llvmbot commented 2 years ago

@llvm/issue-subscribers-clang-frontend

Sameeranjoshi commented 2 years ago

Can there be some traction on solving this issue ? Do I need to assign this to someone ? I am probably missing out some information after the migration from bugzilla.

steakhal commented 2 years ago

I'm also interested in a similar crash. For my case, here is a reproducer: https://godbolt.org/z/x7b93YfG8

template <typename T> void f(const T &a, int c = 0);
template <> void f(const int &unused, int) {
  f(42);
}

My crash seems to be introduced by a5569f089844209dbea2e3241460173d7b6b1420 back in Jul. 9 2020 by Richard Smith (Push parameters into the local instantiation scope before instantiating a default argument).

@AaronBallman Do you know anyone who might be interested in this?

AaronBallman commented 2 years ago

The issue has to do with the default parameter (removing it or making it mandatory causes the issue to disappear). From my debugging of this, we do:

Relating to the primary template declaration of f: ActOnTypeParameter calls IdResolver.AddDecl for "T" ActOnParamDeclarator calls IdResolver.AddDecl for "a" ActOnParamDeclarator calls IdResolver.AddDecl for "c" ActOnPopScope (coming from ParseDirectDeclarator) calls IdResolver.RemoveDecl for "a" ActOnPopScope (coming from ParseDirectDeclarator) calls IdResolver.RemoveDecl for "c" PushOnScopeChains (coming from HandleDeclarator) calls IdResolver.AddDecl for "f" ActOnPopScope (coming from MultiParseScope::Exit which itself comes from ParseTemplateDeclarationOrSpecialization) calls IdResolver.RemoveDecl for "T"

Relating to the template specialization of f: ActOnParamDeclarator calls IdResolver.AddDecl for "unused" ActOnPopScope (coming from ParseDirectDeclarator) calls IdResolver.RemoveDecl for "unused" PushOnScopeChains (coming from ActOnStartOfFunctionDecl) calls IdResolver.AddDecl for "unused" ActOnPopScope (coming from ParseFunctionStatementBody) calls IdResolver.RemoveDecl for "a" <--- the WTF is here

AaronBallman commented 2 years ago

FWIW, this looks suspicious to me: https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDecl.cpp#L14488

Every other call to IdResolver.AddDecl is paired with a call to AddDecl on the scope object as well, except for this case. However, when I pair them up together, the assertion moves around rather than goes away. It may be an orthogonal issue, or it may be related.

Either way, it seems like we're somehow getting the declaration for a in the wrong scope and then trying to remove it from the IdResolver explodes.

AaronBallman commented 2 years ago

Either way, it seems like we're somehow getting the declaration for a in the wrong scope and then trying to remove it from the IdResolver explodes.

It may be more deeply strange than that. I put breakpoints in Scope::AddDecl, IdentifierResolver::AddDecl, and IdentifierResolver::RemoveDecl (Scope::RemoveDecl also had a breakpoint was never called):

Scope::AddDecl for unused 0x00000259e63be820
IdentifierResolver::AddDecl for unused 0x00000259e63be820

Scope::AddDecl for <Empty> 0x00000259e63be8a0
IdentifierResolver::AddDecl not called (might be questionable behavior)

IdentifierResolver::RemoveDecl for unused 0x00000259e63be820

Scope::AddDecl for unused 0x00000259e63be820
IdentifierResolver::AddDecl for unused 0x00000259e63be820

IdentifierResolver::RemoveDecl for a 0x00000259e63be820

Note how the name changed but the pointer value remained the same for the last three entries.

steakhal commented 2 years ago

Interesting. Thanks @AaronBallman for going the extra mile! I'll see if I can make progress on this. No promises though.

steakhal commented 2 years ago

Either way, it seems like we're somehow getting the declaration for a in the wrong scope and then trying to remove it from the IdResolver explodes.

It may be more deeply strange than that. I put breakpoints in Scope::AddDecl, IdentifierResolver::AddDecl, and IdentifierResolver::RemoveDecl (Scope::RemoveDecl also had a breakpoint was never called):

Scope::AddDecl for unused 0x00000259e63be820
IdentifierResolver::AddDecl for unused 0x00000259e63be820

Scope::AddDecl for <Empty> 0x00000259e63be8a0
IdentifierResolver::AddDecl not called (might be questionable behavior)

IdentifierResolver::RemoveDecl for unused 0x00000259e63be820

Scope::AddDecl for unused 0x00000259e63be820
IdentifierResolver::AddDecl for unused 0x00000259e63be820

IdentifierResolver::RemoveDecl for a 0x00000259e63be820

Note how the name changed but the pointer value remained the same for the last three entries.

Interesting. By debugging I can see that when the parser reaches the call to f() within the body of the function template specialization, it gathers the call arguments. As such, it finds that it needs to instantiate a default argument there. The important call-stack is: 1) BuildCXXDefaultArgExpr() 2) CheckCXXDefaultArgExpr() /// Instantiate or parse a C++ default argument expression as necessary. 3) InstantiateDefaultArgument() 4) SubstDefaultArgument() /// Substitute the given template arguments into the default argument. 5) addInstantiatedParametersToScope()

This last function (5) will eventually call FunctionParam->setDeclName(PatternParam->getDeclName()); which will overwrite the name of the parameter from unused to a. This will cause a discrepancy when we get to leave the function template specialization's body and look for a decl with the name a for removal.

BTW by debugging this, here is an even shorter reproducer :D

template <typename T> void f(const T &a = 0);
template <> void f(const int &b)
{ // AddDecl(ParmVarDecl 'b')
  f<int>(); // addInstantiatedParametersToScope() overwrites the name of the ParmVarDecl to 'a'.
} // RemoveDecl(ParamVarDecl 'a'), crash!

I struggle to see why addInstantiatedParametersToScope() modifies the ParmVarDecl. Maybe that shouldn't be called? IDK. Maybe @cor3ntin or someone else has some idea.

erichkeane commented 2 years ago

The idea of addInstantiatedParametersToScope is so that we can instantiate the body of the function, and refer to the parameters based on the 'old' var decls. That is, if the body of 'f' contained a reference to the version in the primary template, it would get the correct parameter, post instantiation (there is a Decl to Decl association container).

The goofy part there is either the explicit instantiation (I didn't think we needed to do addInstantiatedParametersToScope when explicitly instantiating?!), or the recursive instantiation there. Typically we'd notice that f is already being instantiated and skip it I think?

Its ALSO strange that we're renaming the variable, even in cases where it already has a name? It might be interesting to see what happens if we only do that if it is unnamed? ALSO-ALSO strange that our association is by name, which we are typically pretty smart about being mutable.

EDIT: AH, wait, the storing by name isn't weird, when we consider that is an Identifier Resolver, intending to get from the identifer to the variable. So something else is amiss here.

shafik commented 1 year ago

CC @zygoloid your commit was mentioned above: https://github.com/llvm/llvm-project/commit/a5569f089844209dbea2e3241460173d7b6b1420

ChrisZRJ commented 4 days ago

The idea of addInstantiatedParametersToScope is so that we can instantiate the body of the function, and refer to the parameters based on the 'old' var decls. That is, if the body of 'f' contained a reference to the version in the primary template, it would get the correct parameter, post instantiation (there is a Decl to Decl association container).

The goofy part there is either the explicit instantiation (I didn't think we needed to do addInstantiatedParametersToScope when explicitly instantiating?!), or the recursive instantiation there. Typically we'd notice that f is already being instantiated and skip it I think?

Its ALSO strange that we're renaming the variable, even in cases where it already has a name? It might be interesting to see what happens if we only do that if it is unnamed? ALSO-ALSO strange that our association is by name, which we are typically pretty smart about being mutable.

EDIT: AH, wait, the storing by name isn't weird, when we consider that is an Identifier Resolver, intending to get from the identifer to the variable. So something else is amiss here.

I had the same problem on dealii. The original commit https://github.com/llvm/llvm-project/commit/a5569f089844209dbea2e3241460173d7b6b1420 seems to fix the scenario of "Default arguments refer to parameters of the same function". In this case, the result of Expr *PatternExpr = Param->getUninstantiatedDefaultArg(); is the kind of CXXUnresolvedConstructExpr, and the above case is not.

So i add a conditon isa<CXXUnresolvedConstructExpr>(PatternExpr) berfore calling the addInstantiatedParametersToScope, it seems to work.

I am wondering is this enough to make addInstantiatedParametersToScope work only under the idea of https://github.com/llvm/llvm-project/commit/a5569f089844209dbea2e3241460173d7b6b1420.