llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.66k stars 11.38k forks source link

[clang-tidy] Create check for inconsistent/missed typedef/using alias type names #86714

Open FruitClover opened 5 months ago

FruitClover commented 5 months ago

Consider the following example: https://godbolt.org/z/jE8Yc7sz6

using Type = int;

struct Base {
  virtual Type fooType(Type x);
  virtual int fooInt(int x);
};

struct Derived : Base {
  int fooType(int x);
  Type fooInt(Type x);
};

Here we misuse/miss consistent alias names in derived methods (compared to base class signatures), which could lead to tricky bugs (after using Type = ... changes) and this also complicates plain text search.

The idea is that we could detect such using/typedef inconsistency in some cases with clang-tidy.

To start up, we could focus on:

To illustrate which cases are compatible and which are not - check the following table:

legend - `Base Type`: return/argument type for base class methods. - `Derived Type`: return/argument type for derived class methods. - `Warning/FixItHint`: if we should emit warning/hint (we use `int` as some non-aliased type, and `` as `using T = ..` or `typedef .. T`;
|     | Base Type | Derived Type | Warning | FixItHint |
|-----+-----------+--------------+---------+-----------|
|     | int       | int          | -       | -         |
|     | <alias>   | int          | +       | +         |
|     | int       | <alias>      | +       | -         |
| (*) | <alias>   | <alias>      | Analyze | -         |

(*) Both types are aliases: Analyze BaseT vs DerivedT:

  1. Alias names are compatible: (no warning) 1.1 One of (BaseT, DerivedT) is implicit 2.1 Same fully-qualified names:BaseT.getQualifiedNameAsString == DerivedT.getQualifiedNameAsString

  2. Alias names are non-compatible: (warning) 2.1. Strict check: TD1.getQualifiedNameAsString != TD2.getQualifiedNameAsString (N1::Type vs. N2::Type)

  3. Questinable: (Warning: ?) 3.1. Chain alias (maybe control with an option). If so, should check it before 2.1

    
    using T1 = int;
    using T2 = T1;

// => T1 name is compatible with T2 name



This probably belongs to `bugprone` category, something like `inconsistent-alias-names`/`inconsistent-using-typedef` (suggestions are welcome)

Background: PoC version was prepared for internal projects, here I want to check if clang-tidy would be interested in it before publishing PR, and discuss what to warn and if we should emit more fix hints.
llvmbot commented 5 months ago

@llvm/issue-subscribers-clang-tidy

Author: None (FruitClover)

Consider the following example: https://godbolt.org/z/jE8Yc7sz6 ```c++ using Type = int; struct Base { virtual Type fooType(Type x); virtual int fooInt(int x); }; struct Derived : Base { int fooType(int x); Type fooInt(Type x); }; ``` Here we misuse/miss consistent **alias names** in derived methods (compared to base class signatures), which could lead to tricky bugs (after `using Type = ...` changes) and this is also complicates plain text search. The idea is that we could detect such **`using`/`typedef`** inconsistency/misuse in some cases with clang-tidy. To start up, we could focus on: - Class methods only. - Return type check. - Argument type check. To illustrate which cases are compatible/which are not - check the following table: <details> <summary>legend</summary> - `Base Type`: return/argument type for class class methods. - `Derived Type`: return/argument type for derived class methods. - `Warning/FixItHint`: if check should emit warning/hint (we use `int` as some non-aliased type, and <alias> as `using T = ..` or `typedef .. T`; </details> ``` | | Base Type | Derived Type | Warning | FixItHint | |-----+-----------+--------------+---------+-----------| | | int | int | - | - | | | <alias> | int | + | + | | | int | <alias> | + | - | | (*) | <alias> | <alias> | Analyze | - | ``` `(*)` **Both types are aliased**: Analyze `BaseT` vs `DerivedT`: 1. Alias names are compatible: (**Warning: No**) 1.1 one of (`BaseT`, `DerivedT`) is implicit 2.1 Same fully-qualified names:`BaseT.getQualifiedNameAsString == DerivedT.getQualifiedNameAsString` 2. Alias names are non-compatible: (**Warning: Yes**) 2.1. Strict check: `TD1.getQualifiedNameAsString != TD2.getQualifiedNameAsString` (`N1::Type vs N2::Type`) 3. Questinable: (**Warning: ?**) 3.1. chain alias (maybe control with option). If so, should check it before 2.1 ```c++ using T1 = int; using T2 = T1; // => T1 name is compatible with T2 name ``` This probably belongs to `bugprone` category, something like `inconsistant-alias-names`/`inconsistant-using-typedef` (suggestions are welcome) Background: PoC version was prepared for internal projects, here I want to check if clang-tidy would be interested in it before publishing PR, and discuss what to warn and if we should emit more fix hints.
FruitClover commented 5 months ago

Example diagnostics with poc:

llvm-project/clang-tools-extra/clangd/index/Index.cpp:75:1: warning: Inconsistent alias name for 'clang::clangd::SwapIndex::indexedFiles' return type: Base return type is alias, but derived return type is not; Consider replacing with base type alias [bugprone-inconsistent-alias-names]
   75 | llvm::unique_function<IndexContents(llvm::StringRef) const>
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      | clang::clangd::SymbolIndex::IndexedFiles
llvm-project/clang-tools-extra/clangd/index/Index.h:156:11: note: Base return type is here
  156 |   virtual IndexedFiles indexedFiles() const = 0;
      |           ^~~~~~~~~~~~
llvm-project/clang-tools-extra/clangd/index/Index.h:154:9: note: Alias for base return type: 'clang::clangd::SymbolIndex::IndexedFiles'
  154 |   using IndexedFiles =
      |   ~~~~~~^~~~~~~~~~~~~~
  155 |       llvm::unique_function<IndexContents(llvm::StringRef) const>;
      |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.cpp:73:3: warning: Inconsistent alias name for '(anonymous namespace)::DenseLevel::peekRangeAt' return type: Derived return type is alias, but base return type is not [bugprone-inconsistent-alias-names]
   73 |   ValuePair peekRangeAt(OpBuilder &b, Location l, Value p,
      |   ^~~~~~~~~
llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.h:50:11: note: Base return type is here
   50 |   virtual std::pair<Value, Value> peekRangeAt(OpBuilder &b, Location l, Value p,
      |           ^~~~~~~~~~~~~~~~~~~~~~~
llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorLevel.cpp:18:7: note: Alias for derived return type: 'ValuePair'
   18 | using ValuePair = std::pair<Value, Value>;
      | ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
FruitClover commented 4 months ago

bump

5chmidti commented 4 months ago

I think a check like this would be nice. Your table of what you suggest to be diagnosing looks good as well. Only fixing the one case is the right way to go, in the other cases it may not be correct or good to fix the declarations (from a semantic point of view in their code-base, as well as the issue with where the alias is defined and if it is even available at the base class).

I think that a chain alias

using T1 = int;
using T2 = T1;

should basically be the same as

|     | Base Type | Derived Type | Warning | FixItHint |
|-----+-----------+--------------+---------+-----------|
|     | int       | <alias>      | +       | -         |

: just another level of sugar. There shouldn't be a difference in how to diagnose using int and T1 vs using T1 and T2. Now doing

using T1 = int;
using T2 = int;

and using T1 and T2 is a different matter, and should definitely be diagnosed.

The bugprone category sounds like the right place, the name could be a little bit more specific, or do you want to leave the check open to be extended with differences in forward declarations?

// header:
int foo();
using I = int;

// impl
I foo() {}

The name sounds good to me. I thought of inconsistent-use-of-alias-in-declaration right now, wdyt?

FruitClover commented 4 months ago

Thanks for the feedback @5chmidti !

Only fixing the one case is the right way to go, in the other cases it may not be correct or good to fix the declarations (from a semantic point of view in their code-base, as well as the issue with where the alias is defined and if it is even available at the base class).

Right, I should have mentioned this explicitly

I think that a chain alias ... just another level of sugar

Agree

the name could be a little bit more specific, or do you want to leave the check open to be extended with differences in forward declarations?

Yes, I intend to extend this check for the forward declarations (both return type and arguments), but not in the initial PR, so reflecting this in checker name would be great. inconsistent-use-of-alias-in-declaration sounds good to me, will use it.

idler66 commented 3 months ago

Does it mean that the methods with the same name will be suggested to have the same parameters and return types in Base and Derived? The Derived just overrides the Base without any errors.

FruitClover commented 3 months ago

Does it mean that the methods with the same name will be suggested to have the same parameters and return types in Base and Derived? The Derived just overrides the Base without any errors.

Right, there are no compiler errors (underlying types are the same), but we could have different meaning/semantic due to different alias names, for example:

using MemoryType = int;
using EdgeType = int;

struct B { virtual MemoryType foo(); };
struct D : B {       EdgeType foo() override; };

which could lead to false assumptions for the users of the derived class.

will be suggested to have the same parameters and return types

That's the tricky part, so currently there is only 1 replacement suggestion (out of 4 cases), where the rest are "please check if this is intentional" type warnings (which we probably could control with options and sane defaults).

idler66 commented 3 months ago

From my perspective, warnings don't work well here. When the tricky situations arise, we can force users to make their intentions clear by forcing users to add some comments or extend method declarations to tell the compiler that function overriding is required, otherwise an error message will be given. It's about the entire function declaration, not just the type. It is a good idea to send a proposal to std-proposals@lists.isocpp.org.

firewave commented 3 months ago

Sorry if this is going off track.

As C/C++ do not have "strong typedef" I have been looking forward to such a check as it could be used to enforce type safety across compatible POD types.

The case here is just about overloads with wrong aliases but this could be spun further to all usages.

There is also bugprone-easily-swappable-parameters and I think Coverity has a check based on the parameter naming but those are just heuristic in nature and this would be way more targeted. This could also partially resolved using enums but only if there is a fixed range of values.

FruitClover commented 3 months ago

From my perspective, warnings don't work well here.

You might be right that warnings for all cases don't work well here, but at least some could be useful, specifically:

|     | Base Type | Derived Type | Warning | FixItHint |
|-----+-----------+--------------+---------+-----------|
|     | <alias>   | int          | +       | +         |
|     | int       | <alias>      | +       | -         |

with the intention to warn users that changes in alias names in the Base/Derive class will not be reflected in Derive/Base respectively.

And we should take care of user-defined types, e.g., for the following example from libc++ there should be no warning:

/usr/include/c++/v1/locale:2668:18: warning: Inconsistent alias name for 'std::moneypunct_byname<char>::do_neg_format' return type [bugprone-inconsistent-use-of-alias-in-declaration]
/usr/include/c++/v1/locale:2668:18: warning: derived type is alias, but base type is not
 2668 |      pattern     do_neg_format()    const override {return __neg_format_;}
      |      ~~~~~~~     ^
/usr/include/c++/v1/locale:2619:25: note: Base return type is here
 2619 |     virtual pattern     do_neg_format()    const
      |             ~~~~~~~     ^
/usr/include/c++/v1/locale:2645:34: note: Alias for derived return type: 'std::moneypunct_byname::pattern'
 2645 |     typedef money_base::pattern  pattern;
      |     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~

When the tricky situations arise, we can force users to make their intentions clear by forcing users to add some comments or extend method declarations to tell the compiler that function overriding is required, otherwise an error message will be given. It's about the entire function declaration, not just the type. It is a good idea to send a proposal to std-proposals@lists.isocpp.org.

Sorry, I don't quite get extend method declarations, you mean the case when compiler overloads another function after alias changes?

FruitClover commented 3 months ago

As C/C++ do not have "strong typedef" I have been looking forward to such a check as it could be used to enforce type safety across compatible POD types.

Yes, that's one of the initial reasons for such check, and, as @5chmidti noticed, with the intention to extend into forward declarations.