llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
26.67k stars 10.92k forks source link

[clang-tidy] bugprone-use-after-move - false-negative #90174

Open PiotrZSL opened 2 months ago

PiotrZSL commented 2 months ago

Example:

#include <memory>

struct A {};

struct ClassA
{
    ClassA(std::unique_ptr<A>);
};

class ClassB
{
    ClassB(std::unique_ptr<A> aaa)
        : aa(std::move(aaa))
    {
        a = std::make_unique<ClassA>(std::move(aaa));
    }
    std::unique_ptr<A> aa;
    std::unique_ptr<ClassA> a;
};

No issue from bugprone-use-after-move or from clang-analyzer-cplusplus.Move.

Example 2:

#include <memory>

struct Test
{
    Test(std::unique_ptr<int> arg)
      : a(std::move(arg))
    {
        consume_again(std::move(arg));
    }
    void consume_again(std::unique_ptr<int>) {}
    std::unique_ptr<int> a;
};

int main() {
    std::unique_ptr<int> a = std::make_unique<int>(1);
    Test t1(std::move(a));
    Test t2(std::move(a));
}

Detected by clang-analyzer-cplusplus.Move, but not by bugprone-use-after-move

llvmbot commented 2 months ago

@llvm/issue-subscribers-clang-tidy

Author: Piotr Zegar (PiotrZSL)

Example: ``` #include <memory> struct A {}; struct ClassA { ClassA(std::unique_ptr<A>); }; class ClassB { ClassB(std::unique_ptr<A> aaa) : aa(std::move(aaa)) { a = std::make_unique<ClassA>(std::move(aaa)); } std::unique_ptr<A> aa; std::unique_ptr<ClassA> a; }; ``` No issue from bugprone-use-after-move or from clang-analyzer-cplusplus.Move. ``` Example 2: #include <iostream> #include <memory> struct Verbose { Verbose() { std::clog << __PRETTY_FUNCTION__ << '\n'; } ~Verbose() { std::clog << __PRETTY_FUNCTION__ << '\n'; } Verbose(const Verbose&) { std::clog << __PRETTY_FUNCTION__ << '\n'; } Verbose(Verbose&&) { std::clog << __PRETTY_FUNCTION__ << '\n'; } Verbose& operator=(const Verbose&) { std::clog << __PRETTY_FUNCTION__ << '\n'; return *this; } Verbose& operator=(Verbose&&) { std::clog << __PRETTY_FUNCTION__ << '\n'; return *this; } }; struct Test { Test(std::unique_ptr<int> arg) : a(std::move(arg)) { consume_again(std::move(arg)); } void consume_again(std::unique_ptr<int>) {} std::unique_ptr<int> a; }; int main() { std::unique_ptr<int> a = std::make_unique<int>(1); Test t1(std::move(a)); Test t2(std::move(a)); } ``` Detected by clang-analyzer-cplusplus.Move, but not by bugprone-use-after-move
llvmbot commented 2 months ago

@llvm/issue-subscribers-clang-static-analyzer

Author: Piotr Zegar (PiotrZSL)

Example: ``` #include <memory> struct A {}; struct ClassA { ClassA(std::unique_ptr<A>); }; class ClassB { ClassB(std::unique_ptr<A> aaa) : aa(std::move(aaa)) { a = std::make_unique<ClassA>(std::move(aaa)); } std::unique_ptr<A> aa; std::unique_ptr<ClassA> a; }; ``` No issue from bugprone-use-after-move or from clang-analyzer-cplusplus.Move. ``` Example 2: #include <iostream> #include <memory> struct Verbose { Verbose() { std::clog << __PRETTY_FUNCTION__ << '\n'; } ~Verbose() { std::clog << __PRETTY_FUNCTION__ << '\n'; } Verbose(const Verbose&) { std::clog << __PRETTY_FUNCTION__ << '\n'; } Verbose(Verbose&&) { std::clog << __PRETTY_FUNCTION__ << '\n'; } Verbose& operator=(const Verbose&) { std::clog << __PRETTY_FUNCTION__ << '\n'; return *this; } Verbose& operator=(Verbose&&) { std::clog << __PRETTY_FUNCTION__ << '\n'; return *this; } }; struct Test { Test(std::unique_ptr<int> arg) : a(std::move(arg)) { consume_again(std::move(arg)); } void consume_again(std::unique_ptr<int>) {} std::unique_ptr<int> a; }; int main() { std::unique_ptr<int> a = std::make_unique<int>(1); Test t1(std::move(a)); Test t2(std::move(a)); } ``` Detected by clang-analyzer-cplusplus.Move, but not by bugprone-use-after-move
PiotrZSL commented 2 months ago

+clang:static analyser for an clang-analyzer-cplusplus.Move case

steakhal commented 1 month ago

I can confirm that the first case is not reported by CSA, because we don't inline std::make_unique. This invalidates the argument, aka. the rvalue-reference to a, and we conservatively assume that it's no longer moved-from.

If we were inlining std::make_unique then we would have a use-after-move report inside it. We would also have the report if we would define make_unique outside of the std namespace, (because we don't inline function bodies from std). See here:

#include <memory>

struct A {};
struct SinkA {
    SinkA(std::unique_ptr<A>);
};
template <typename T, typename... Args>
auto my_make_unique(Args&&... args){
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

void natively(std::unique_ptr<A> x) {
    std::unique_ptr<A> tmp = std::move(x);
    std::unique_ptr<SinkA> y2{new SinkA(std::move(x))}; // reported
}

void viaStdMakeUnique(std::unique_ptr<A> x) {
    std::unique_ptr<A> tmp = std::move(x);
    std::unique_ptr<SinkA> y2 = std::make_unique<SinkA>(std::move(x)); // not inlined, escapes...
}

void viaMyMakeUnique(std::unique_ptr<A> x) {
    std::unique_ptr<A> tmp = std::move(x);
    std::unique_ptr<SinkA> y2 = my_make_unique<SinkA>(std::move(x)); // reported
}

We should have a way to enforce certain functions to be always inlined, such as std::addressof, std::move, std::forward, std::make_unique, and some lightweight container ctors/member functions like std::span, std::string_view.

In this sense, this relates to #81734, #64584 and #94193.


In other cases, we actually want the opposite: ensure that certain functions are never inlined, like std::sort or std::ranges::sort. This is demonstrated by #78132.

PiotrZSL commented 1 month ago

@steakhal inlining is an compiler optimization, it does not exist on AST level. For me this is just bug, we got second std::move, but check is ignoring it.

steakhal commented 1 month ago

@steakhal inlining is an compiler optimization, it does not exist on AST level. For me this is just bug, we got second std::move, but check is ignoring it.

Yes, but I was refering to inlining in the static analyzer domain. Sorry for the confusion.

Wrt the second example and clang-tidy: thats outside of my domain. I dont want to comment on that.

PiotrZSL commented 3 weeks ago

Problem is that bugprone-use-after-move incorrectly designed. Check is trying to be smart and figure out if std::move actually resulted in move or not. And due to that it does not handle most of scenarios like try_emplace and this here. Where check should be simply straight forward and flag any second std::move that occur after first one.

martinboehme commented 2 weeks ago

Both of the examples in this bug use std::unique_ptr, which, unlike most other types, has a well-defined moved-from state: A std::unique_ptr that has been moved from is guaranteed to be null. It's therefore not an error to use a std::unique_ptr that has been moved from, and programmers may intentionally rely on this behavior.

The behavior of both code examples in this bug is well-defined and may well be what the programmer intended. I don't think, therefore, that we should be emitting a diagnosis on these examples.

Do you get use-after-move warnings if you substitute std::unique_ptr with a different type that doesn't have a well-defined moved-from state (e.g. std::vector)?

Check is trying to be smart and figure out if std::move actually resulted in move or not.

I'm not sure exactly what you're referring to -- can you elaborate?

The check doesn't try to figure out whether std::move() resulted in a move or not -- indeed, because the check is intra-procedural, it can't in general know this. Take this example:

void maybeConsume(std::vector<int> &&);
void f() {
  std::vector<int> v;
  maybeConsume(std::move(v));
}

Does the call to maybeConsume(std::move(v)) result in a move or not? Without seeing (and analyzing) the definition of maybeConsume(), there is no way for us to know. The check therefore simply assumes this (quoting from the documentation): "If the result of std::move is passed to an rvalue reference parameter, this will always be considered to cause a move, even if the function that consumes this parameter does not move from it, or if it does so only conditionally."

What I think you may be referring to is that the check considers the move to happen not in the std::move() itself but in whichever function the result of the std::move() is passed to (in the example above, this would be maybeConsume()). This correctly reflects C++ semantics: std::move() is just a cast to an rvalue reference, and the actual move happens in whichever function consumes that rvalue reference (ultimately, a move constructor or move assignment operator).

This distinction is important when sequencing rules come into play. Consider this example (from the lit test):

void passByRvalueReference(int i, A &&a);
void sequencingOfMoveAndUse() {
  // This case is fine because the move only happens inside
  // passByRvalueReference(). At this point, a.getInt() is guaranteed to have
  // been evaluated.
  {
    A a;
    passByRvalueReference(a.getInt(), std::move(a));
  }
  // [snip]
}

If we assumed that the move already happens in std::move(), then we would falsely conclude that there is a potential use-after-move here: C++ makes no guarantees about the order in which arguments are evaluated, so the std::move(a) might be evaluated before the a.getInt(). But in reality, there is no risk of a use-after-move here because the move only happens inside passByRvalueReference().

And due to that it does not handle most of scenarios like try_emplace and this here.

These are two separate scenarios:

Where check should be simply straight forward and flag any second std::move that occur after first one.

As explained above, this would cause false positives for std::unique_ptr (and the other standard smart pointers), where moving from the same pointer twice has defined behavior that programmers may intentionally exploit.

martinboehme commented 2 weeks ago

(Sorry for the late reply, I only saw this issue today.)

PiotrZSL commented 2 weeks ago

"moving from the same pointer twice has defined behavior that programmers may intentionally exploit."

But why would you move from same pointer twice ? You already know as a developer that after first move this pointer is null, so just instead of referencing pointer just put nullptr into code. Or re-initialize pointer. Using object after passing it as an argument to std::move, is just a potential bug. It may work in many cases, but it's a code-smell. I can understand why clang-analyser may want to focus only on clean bugs, but clang-tidy should just raise warnings for all those potential issues.

martinboehme commented 2 weeks ago

But why would you move from same pointer twice?

The specific case of moving from the same pointer twice is indeed relatively rare, though I've seen people do this in loops, where the intent is to pass the non-null pointer on the first iteration of the loop, and a null pointer on subsequent iterations. I think I've also seen cases where people conditionally move from a smart pointer, and then later move that same pointer (which now may or may not be null) to some other place.

But other uses of moved-from smart pointers are pretty common and definitely require special treatment to avoid false positives, specifically operator bool and == nullptr, which can be used to check whether a smart pointer has been (conditionally) moved from.

If you would like to forbid use-after-moves of smart pointers entirely on your codebase, then one option could be to introduce a different mode for the check (though CheckOptions) in which it treats smart pointers like all other types.

I can understand why clang-analyser may want to focus only on clean bugs, but clang-tidy should just raise warnings for all those potential issues.

clang-tidy can afford a higher false positive rate than other tools, but even here, false positives have a cost.

PiotrZSL commented 2 weeks ago

"As explained above, this would cause false positives for std::unique_ptr (and the other standard smart pointers), where moving from the same pointer twice has defined behavior that programmers may intentionally exploit."

But that is a bugprone code. After all check name is "bugprone-use-after-move", you do not know if developer re-used variable on purpose or by mistake. And if moved then they could use null instead of referencing variable again. Check shouldn't treat smart pointers in different way to other objects. Problem is same, and it doesn't matter whatever object is left in usable state or not, it's still use after move.

"But other uses of moved-from smart pointers are pretty common and definitely require special treatment to avoid false positives, specifically operator bool and == nullptr, which can be used to check whether a smart pointer has been (conditionally) moved from." No, that's a code-smell. I do not have a single usage like this in a 5 milion lines of C++17 code. And above would work only if pointer were actually not null before.

"If you would like to forbid use-after-moves of smart pointers entirely on your codebase, then one option could be to introduce a different mode for the check (though CheckOptions) in which it treats smart pointers like all other types." First check should not be "relaxed" by default, should be very strict. Second I really still do not see usecase for threading smart pointer sin different way. I could potentially add such option, but would prefer check to detect more issues by default.

I got over 30 000 usages of std::move in code, most of them on smart pointers. And now after applying fix from https://github.com/llvm/llvm-project/pull/94869, 3 new real bugs popup (luckily not in critical code), there were no false-positives related to std::move (had few related to std::forward but thats because i took llvm 19 version of check and applied to llvm 18). I can't live with 3 real bugs because for check nullptr smart pointers are fine.

martinboehme commented 2 weeks ago

Before responding to individual points, let me stress that I understand your concerns, I'm open to considering changing the behavior of the check for smart pointers, and I understand that doing so has enabled you to find real bugs. I just want to make sure that this change isn't a change for the worse in other codebases / for other users.

"As explained above, this would cause false positives for std::unique_ptr (and the other standard smart pointers), where moving from the same pointer twice has defined behavior that programmers may intentionally exploit."

But that is a bugprone code. After all check name is "bugprone-use-after-move", you do not know if developer re-used variable on purpose or by mistake. And if moved then they could use null instead of referencing variable again. Check shouldn't treat smart pointers in different way to other objects. Problem is same, and it doesn't matter whatever object is left in usable state or not, it's still use after move.

There is an important difference though: For smart pointers, the behavior is deterministic; for other types, it is essentially non-deterministic. The standard explicitly defines the behavior that smart pointers must have when moved from, so it's reasonable to expect that some programmers will take advantage of this behavior. It can be argued that doing so is bad style (many other things that are explicitly defined in C++ are widely considered bad style), but use-after-move on other types is not just bad style but essentially non-determinism.

"If you would like to forbid use-after-moves of smart pointers entirely on your codebase, then one option could be to introduce a different mode for the check (though CheckOptions) in which it treats smart pointers like all other types." First check should not be "relaxed" by default, should be very strict. Second I really still do not see usecase for threading smart pointer sin different way. I could potentially add such option, but would prefer check to detect more issues by default.

Understand, I just feel as if other users of the check might see this differently. It would be great to get some more input on this from other users of this check (though I'm not sure what the best way of doing that would be).

I got over 30 000 usages of std::move in code, most of them on smart pointers. And now after applying fix from #94869, 3 new real bugs popup (luckily not in critical code), there were no false-positives related to std::move (had few related to std::forward but thats because i took llvm 19 version of check and applied to llvm 18). I can't live with 3 real bugs because for check nullptr smart pointers are fine.

Can you share source code for these bugs? I'd be really interested to see what actual source code looks like that gets this wrong, because that usually gives a much more realistic "feel" than discussing synthetic examples. If this is a closed-source codebase, is there any way you can "anonymize" the code samples?

I've tried searching for examples where code intentionally exploits the moved-from behavior of smart pointers, and does so in a way that seems defensible, but I haven't been able to come up with anything (though it's pretty hard to search for this kind of thing). I do feel as if I saw legitimate examples many years ago when I first developed the check, and that I added the special treatment of smart pointers because of this, but I don't recall any details.

tl;dr: I would love to see some real-world examples, and let's discuss further.

PiotrZSL commented 2 weeks ago

"Can you share source code for these bugs?"

2 of 3 issues were like this (but without typedefs):

#include <memory>

struct A {};

template <typename T> struct SS : std::shared_ptr<T> {};

// using Ptr = SS<A>;
using Ptr = std::shared_ptr<A>;

struct S {
  S(Ptr s) : s(std::move(s)) {
    if (s) {
      // do something
    }
  }

  Ptr s;
};

If you swap to SS, then it's detected.

We do not have code that "intentionally exploits the moved-from behavior ". So after getting rid of custom smart pointer handling all new issues that popup were real. Thing is that we got check that adds std::move where it's appliable and in balance this check is used to make sure that there are no bugs.

martinboehme commented 4 days ago

(Sorry for the late reply -- this dropped off my radar.)

"Can you share source code for these bugs?"

2 of 3 issues were like this (but without typedefs):

[snip]

If you swap to SS, then it's detected.

Thanks -- interesting! I see how this pattern is particularly subtle because the parameter and the member variable are named the same, and it's easy to forget what the s inside the constructor body refers to, or convince yourself that it has to refer to the member variable, obviously, because otherwise the code doesn't make sense. Meanwhile, the // do something is never executed... 😱

We do not have code that "intentionally exploits the moved-from behavior ". So after getting rid of custom smart pointer handling all new issues that popup were real. Thing is that we got check that adds std::move where it's appliable and in balance this check is used to make sure that there are no bugs.

Got it.

It makes sense that you'd like to catch bugs like the above. I'm still a bit hesitant about flipping the behavior without feedback from other users. I'm not sure what the best way would be to solicit feedback -- maybe submit a short post to the clang-tidy forum describing the proposed change in behavior (with a link to this bug) and asking for feedback?