Ericsson / clang

Cross Translation Unit analysis capability for Clang Static Analyzer. (Fork of official clang at http://llvm.org/git/clang)
http://clang.llvm.org/
Other
15 stars 10 forks source link

Import implicit methods of CXXRecordDecl. #409

Closed balazske closed 6 years ago

balazske commented 6 years ago

For some (unknown) reason at import of a record definition there may be an existing definition of the same record that does not contain every method. It is possible that automatically generated operators or destructors are missing, for example. The correction does an import of everything if an existing record is found and the existing and to-be-imported is a definition. (Hopefully the import works in every case for the already existing items without duplicating these.)

balazske commented 6 years ago

There are two variants of a struct that is to be imported: One with automatic function definitions, one without these. The source location is now the same for both (same header, same location). Source code is only a simple struct with field members, nothing else.

The first already existing one (before first import) has declarations only for copy constructor, move constructor, move operator =.

The other one (imported for first time) contains the same as above, and these declarations are duplicated here (with a Prev pointing to the other one). And there is a destructor, default constructor, copy operator= that have an inline body with definition (only one instance).

After the import of the latter into first, result is the same. More imports follow, sometimes with the smaller (without automatic definitions) variant. After the imports the chains of duplicated declarations for copy, move constructor and move operator= become longer, up to 4 instances at the end. Destructor, default constructor, copy operator= with inline body are not duplicated.

This means that for methods without body redundant redeclaration chains are generated at import. This works for normal member functions too.

martong commented 6 years ago

So you say that by merging the redecl chains will grow? I think that is ok. Or will there be redundant redecl chains for the same function? That is not ok. Could you please add more asserts/expects about the redecl chain(s)?

balazske commented 6 years ago

The AST looks like this, after imports:

CXXRecordDecl 0x1dfc270 <Xxx.h:121:3, line:128:3> line:121:10 referenced struct Xxx definition
|-DefinitionData standard_layout can_const_default_init
| |-DefaultConstructor exists non_trivial user_provided
| |-CopyConstructor simple non_trivial has_const_param needs_overload_resolution implicit_has_const_param
| |-MoveConstructor exists simple non_trivial needs_overload_resolution
| |-CopyAssignment non_trivial has_const_param implicit_has_const_param
| |-MoveAssignment exists simple non_trivial needs_overload_resolution
| `-Destructor simple non_trivial
|-CXXConstructorDecl 0x1e26430 <col:10> col:10 implicit Xxx 'void (const Xxx &)' inline default noexcept-unevaluated 0x1e26430
| `-ParmVarDecl 0x1e26560 <col:10> col:10 'const Xxx &'
|-CXXConstructorDecl 0x1e26d80 <col:10> col:10 implicit Xxx 'void (Xxx &&)' inline default noexcept-unevaluated 0x1e26d80
| `-ParmVarDecl 0x1e26eb0 <col:10> col:10 'Xxx &&'
|-CXXMethodDecl 0x1e26f48 <col:10, <invalid sloc>> col:10 implicit operator= 'Xxx &(Xxx &&)' inline default noexcept-unevaluated 0x1e26f48
| `-ParmVarDecl 0x1e27070 <col:10> col:10 'Xxx &&'
|-CXXConstructorDecl 0x35ce6e8 prev 0x1e26430 <col:10> col:10 implicit Xxx 'void (const Xxx &)' inline noexcept-unevaluated 0x1e26430
| `-ParmVarDecl 0x35ce688 <col:10> col:10 'const Xxx &'
|-CXXConstructorDecl 0x35ce800 prev 0x1e26d80 <col:10> col:10 implicit Xxx 'void (Xxx &&)' inline noexcept-unevaluated 0x1e26d80
| `-ParmVarDecl 0x35ce7a0 <col:10> col:10 'Xxx &&'
|-CXXMethodDecl 0x35ce918 prev 0x1e26f48 <col:10, <invalid sloc>> col:10 implicit operator= 'Xxx &(Xxx &&)' inline noexcept-unevaluated 0x1e26f48
| `-ParmVarDecl 0x35ce8b8 <col:10> col:10 'Xxx &&'
|-CXXDestructorDecl 0x35ce9c0 <col:10> col:10 implicit used ~Xxx 'void () noexcept' inline
| `-CompoundStmt 0x35cea70 <col:10>
|-CXXConstructorDecl 0x35cea80 <col:10> col:10 implicit used Xxx 'void () noexcept' inline
| |-CXXCtorInitializer Field 0x1e25588 'fff2' 'std::shared_ptr<Yyy>':'std::shared_ptr<Yyy>'
| | `-CXXConstructExpr 0x4053168 <col:10> 'std::shared_ptr<Yyy>':'std::shared_ptr<Yyy>' 'void () noexcept'
| `-CompoundStmt 0x40531c8 <col:10>
|-CXXMethodDecl 0x40532b8 <col:10> col:10 implicit used operator= 'Xxx &(const Xxx &) noexcept' inline
| |-ParmVarDecl 0x4053258 <col:10> col:10 used 'const Xxx &'
| `-CompoundStmt 0x4054350 <col:10>
|   |-BinaryOperator 0x4053590 <col:10> 'uint8_t':'unsigned char' lvalue '='
|   | |-MemberExpr 0x40534e0 <col:10> 'uint8_t':'unsigned char' lvalue ->fff 0x1dfc498
|   | | `-CXXThisExpr 0x40534c8 <col:10> 'Xxx *' this
|   | `-ImplicitCastExpr 0x4053578 <col:10> 'uint8_t':'unsigned char' <LValueToRValue>
|   |   `-MemberExpr 0x4053540 <col:10> 'const uint8_t':'const unsigned char' lvalue .fff 0x1dfc498
|   |     `-DeclRefExpr 0x4053518 <col:10> 'const Xxx' lvalue ParmVar 0x4053258 '' 'const Xxx &'
|   |-CXXMemberCallExpr 0x40542d0 <col:10> 'std::shared_ptr<Yyy>' lvalue
|   | |-MemberExpr 0x4054220 <col:10> '<bound member function type>' .operator= 0x40536a8
|   | | `-MemberExpr 0x40535d0 <col:10> 'std::shared_ptr<Yyy>':'std::shared_ptr<Yyy>' lvalue ->fff2 0x1e25588
|   | |   `-CXXThisExpr 0x40535b8 <col:10> 'Xxx *' this
|   | `-MemberExpr 0x4054298 <col:10> 'const std::shared_ptr<Yyy>':'const std::shared_ptr<Yyy>' lvalue .fff2 0x1e25588
|   |   `-DeclRefExpr 0x4054270 <col:10> 'const Xxx' lvalue ParmVar 0x4053258 '' 'const Xxx &'
|   `-ReturnStmt 0x4054338 <col:10>
|     `-UnaryOperator 0x4054318 <col:10> 'Xxx' lvalue prefix '*'
|       `-CXXThisExpr 0x4054300 <col:10> 'Xxx *' this
|-CXXConstructorDecl 0x5209de0 prev 0x35ce6e8 <col:10> col:10 implicit Xxx 'void (const Xxx &)' inline noexcept-unevaluated 0x1e26430
| `-ParmVarDecl 0x5209d80 <col:10> col:10 'const Xxx &'
|-CXXConstructorDecl 0x5209ef8 prev 0x35ce800 <col:10> col:10 implicit Xxx 'void (Xxx &&)' inline noexcept-unevaluated 0x1e26d80
| `-ParmVarDecl 0x5209e98 <col:10> col:10 'Xxx &&'
|-CXXMethodDecl 0x520a010 prev 0x35ce918 <col:10, <invalid sloc>> col:10 implicit operator= 'Xxx &(Xxx &&)' inline noexcept-unevaluated 0x1e26f48
| `-ParmVarDecl 0x5209fb0 <col:10> col:10 'Xxx &&'
|-CXXRecordDecl 0x1dfc398 <col:3, col:10> col:10 implicit struct Xxx
|-FieldDecl 0x1dfc498 <line:124:5, col:13> col:13 fff 'uint8_t':'unsigned char'
|-FieldDecl 0x1e25588 <line:125:5, col:33> col:33 fff2 'std::shared_ptr<Yyy>':'std::shared_ptr<Yyy>'
|-CXXConstructorDecl 0x604bfa0 prev 0x5209de0 <line:121:10> col:10 implicit Xxx 'void (const Xxx &)' inline noexcept-unevaluated 0x1e26430
| `-ParmVarDecl 0x604bf40 <col:10> col:10 'const Xxx &'
|-CXXConstructorDecl 0x604c0b8 prev 0x5209ef8 <col:10> col:10 implicit Xxx 'void (Xxx &&)' inline noexcept-unevaluated 0x1e26d80
| `-ParmVarDecl 0x604c058 <col:10> col:10 'Xxx &&'
`-CXXMethodDecl 0x604c1d0 prev 0x520a010 <col:10, <invalid sloc>> col:10 implicit operator= 'Xxx &(Xxx &&)' inline noexcept-unevaluated 0x1e26f48
  `-ParmVarDecl 0x604c170 <col:10> col:10 'Xxx &&'

If there is an other member function declaration (no body) the dump fails with "unsortable locations found" after this function is printed. The test on other projects produced failures too.

martong commented 6 years ago

This is how it looks before or after your fix?

balazske commented 6 years ago

After the fix. Before the fix this is present (and after the import this remains because the already existing definition is used):

CXXRecordDecl 0x1dfc270 <Xxx.h:121:3, line:128:3> line:121:10 referenced struct Xxx definition
|-DefinitionData standard_layout can_const_default_init
| |-DefaultConstructor exists non_trivial user_provided
| |-CopyConstructor simple non_trivial has_const_param needs_overload_resolution implicit_has_const_param
| |-MoveConstructor exists simple non_trivial needs_overload_resolution
| |-CopyAssignment non_trivial has_const_param implicit_has_const_param
| |-MoveAssignment exists simple non_trivial needs_overload_resolution
| `-Destructor simple non_trivial
|-CXXConstructorDecl 0x1e26430 <col:10> col:10 implicit Xxx 'void (const Xxx &)' inline default noexcept-unevaluated 0x1e26430
| `-ParmVarDecl 0x1e26560 <col:10> col:10 'const Xxx &'
|-CXXConstructorDecl 0x1e26d80 <col:10> col:10 implicit Xxx 'void (Xxx &&)' inline default noexcept-unevaluated 0x1e26d80
| `-ParmVarDecl 0x1e26eb0 <col:10> col:10 'Xxx &&'
|-CXXMethodDecl 0x1e26f48 <col:10, <invalid sloc>> col:10 implicit operator= 'Xxx &(Xxx &&)' inline default noexcept-unevaluated 0x1e26f48
| `-ParmVarDecl 0x1e27070 <col:10> col:10 'Xxx &&'
|-CXXRecordDecl 0x1dfc398 <col:3, col:10> col:10 implicit struct Xxx
|-FieldDecl 0x1dfc498 <line:124:5, col:13> col:13 fff 'uint8_t':'unsigned char'
|-FieldDecl 0x1e25588 <line:125:5, col:33> col:33 fff2 'std::shared_ptr<Yyy>':'std::shared_ptr<Yyy>'
martong commented 6 years ago

Ok, now I see what is happening.

I understand that technically merging the two definitions is possible and the redecl chain of each method will be handled properly (because we handle the redecl chain in case of any functions properly).

But, this kind of merging could result a very obfuscated AST for classes and methods. This is why I think first we should understand how is that possible that one definition of a class contains different methods than an other (because that is the reason which forces us into this fix).

Once we know that then we can consider this as a fix or we could come up with other solutions. Also, in case of free functions we build up a similar long redecl chain, but there we don't have any other options. If it turns out that this merge is the only solution then we should implement the squashing of redundant function declarations as a next step.

balazske commented 6 years ago

Code was now updated, now we get no redecl chain. Still improvements are needed for friend and friend-template cases. Some of the tests fails because functionality is changed:

ImportFunctions.ImportPrototypeThenProtoAndDefinition
ImportFunctions.ImportPrototypeThenPrototype
ImportFunctions.ImportPrototypes
ImportFunctions.PrototypeAfterPrototype
martong commented 6 years ago

Building the whole redecl chain in case of free functions was deliberate. We know that once we should squash the redundant prototypes. But that didn't look very important and I think that should be in a totally separate pull request with several number of different tests to verify and document what exactly we do with redundant prototypes. Now with this PR you are trying to coalesce two improvements and I am still not convinced that squashing the prototypes is the fix for the missing destructor. As I mentioned previously I consider finding the cause of the different class definitions more important than having a quick fix.

martong commented 6 years ago

For some (unknown) reason at import of a record definition there may be an existing definition of the same record that does not contain every method.

I think we should find that reason.

balazske commented 6 years ago

The problem is that the default functions are lazy-created. If an existing class do not contain and the imported version of it does contain these, they will not be imported and not created. To create these, a Sema instance is needed that is not available in ASTImporter.

balazske commented 6 years ago

Code update: Now the "implicit" functions are imported only, when importing a record definition into an existing definition.

balazske commented 6 years ago

FIXME updated, tests updated for last ctu-clang6.

balazske commented 6 years ago

Upstream revision: https://reviews.llvm.org/D49245

balazske commented 6 years ago

Backport from upstream: #438