llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.92k stars 11.52k forks source link

clang-tidy issue:WRAPPER_ESCAPE problem. about INNERPOINTERCHECKER.cpp #105458

Open zxyzxy1111 opened 3 weeks ago

zxyzxy1111 commented 3 weeks ago

Question: clang-tidy issue: It is known that the following code snippet will cause a WRAPPER_ESCAPE problem.

const char *single_quote(const char *input) {
    std::string s = std::string("");
    return s.append("'").append(input).append("'").c_str();
}
  1. Found that the checkPostCall function can detect append and call the markPtrSymbolsReleased function, but after entering markPtrSymbolsReleased, std::cout<<"Marking symbol as released: " << Symbol << "\n"; does not output when the program is running, this line has not been executed, indicating that the problem may be in if (const PtrSet PS = State->get(MR)) { const Expr Origin = Call.getOriginExpr();

So: The problem may be that the PtrSet returned by State->get(MR) is empty, or that MR is not correctly mapped to RawPtrMap. Further check the state of RawPtrMap and the value of MR.

  1. Check the contents of RawPtrMap: Add debugging information in markPtrSymbolsReleased to print the contents of RawPtrMap and ensure that MR is correctly mapped. Check the value of MR: Print the value of MR in checkPostCall and markPtrSymbolsReleased to ensure that they are consistent.

  2. Add debugging information in markPtrSymbolsReleased: Print the contents of RawPtrMap and the value of MR. void InnerPointerChecker::markPtrSymbolsReleased(const CallEvent &Call, ProgramStateRef State, const MemRegion *MR, CheckerContext &C) const { std::cout << "markPtrSymbolsReleased called with MR: " << MR << "\n"; const RawPtrMapTy &RPM = State->get(); for (const auto &Entry : RPM) { std::cout << "RawPtrMap entry: " << Entry.first << "\n"; }

    if (const PtrSet PS = State->get(MR)) { const Expr Origin = Call.getOriginExpr(); for (const auto Symbol : *PS) { std::cout << "Marking symbol as released: " << Symbol << "\n"; State = allocation_state::markReleased(State, Symbol, Origin); } State = State->remove(MR); C.addTransition(State); return; } else { std::cout << "No PtrSet found for MR: " << MR << "\n"; } }

  3. Add debugging information in checkPostCall: Print the value of ObjRegion. void InnerPointerChecker::checkPostCall(const CallEvent &Call, CheckerContext &C) const { ProgramStateRef State = C.getState();

    const TypedValueRegion *ObjRegion = nullptr;

    if (const auto *ICall = dyn_cast(&Call)) { ObjRegion = dyn_cast_or_null( ICall->getCXXThisVal().getAsRegion());

    if (isInvalidatingMemberFunction(Call)) { std::cout << "call markPtrSymbolsReleased with ObjRegion: " << ObjRegion << "\n"; markPtrSymbolsReleased(Call, State, ObjRegion, C); return; } }

    if (isInnerPointerAccessFunction(Call)) { if (isa(Call)) { ObjRegion = dyn_cast_or_null(Call.getArgSVal(0).getAsRegion()); }

    if (!ObjRegion) return;

    SVal RawPtr = Call.getReturnValue(); if (SymbolRef Sym = RawPtr.getAsSymbol(/IncludeBaseRegions=/true)) { PtrSet::Factory &F = State->getStateManager().get_context(); const PtrSet SetPtr = State->get(ObjRegion); PtrSet Set = SetPtr ? SetPtr : F.getEmptySet(); assert(C.wasInlined || !Set.contains(Sym)); Set = F.add(Set, Sym);

    State = State->set(ObjRegion, Set); C.addTransition(State); }

    return; }

    checkFunctionArguments(Call, State, C); }

Output call markPtrSymbolsReleased with ObjRegion: 0x7631a60 call markPtrSymbolsReleased markPtrSymbolsReleased called with MR: 0x7631a60 No PtrSet found for MR: 0x7631a60 call markPtrSymbolsReleased with ObjRegion: 0x7630cb8 call markPtrSymbolsReleased markPtrSymbolsReleased called with MR: 0x7630cb8 No PtrSet found for MR: 0x7630cb8 call markPtrSymbolsReleased with ObjRegion: 0 call markPtrSymbolsReleased markPtrSymbolsReleased called with MR: 0 No PtrSet found for MR: 0 call markPtrSymbolsReleased with ObjRegion: 0 call mark markPtrSymbolsReleased

markPtrSymbolsReleased function was correctly called, but the PtrSet returned by State->get(MR) is always empty. This indicates that MR has not been correctly added to RawPtrMap.

  1. Further debugging found that in the checkPostCall function, ObjRegion was not correctly added to RawPtrMap. In SymbolRef Sym = RawPtr.getAsSymbol(/IncludeBaseRegions=/true) returns false, which means that RawPtr is not a symbol. Further check the type and source of RawPtr to determine why it is not a symbol.

  2. The problem may be in if (isInnerPointerAccessFunction(Call)) { std::cout << "isInnerPointerAccessFunction(Call) = true" << "\n"; // It has been determined that this is running if (isa(Call)) { std::cout << "isa(Call) = true" << "\n"; // It has been determined that this is not running ObjRegion = dyn_cast_or_null(Call.getArgSVal(0).getAsRegion()); } where if (isa(Call))=false

isa(Call) returns false, need to further check the type of Call to determine why it is not a SimpleFunctionCall type.

  1. Check the specific type of Call: Use llvm::isa and llvm::dyn_cast to check the specific type of Call and print relevant information.

Output release-install/bin/wecheck test2.cpp -- -g markPtrSymbolsReleased called with MR: 0x7df7710 No PtrSet found for MR: 0x7df7710 markPtrSymbolsReleased called with MR: 0x7df6968 No PtrSet found for MR: 0x7df6968 markPtrSymbolsReleased called with MR: 0 No PtrSet found for MR: 0 markPtrSymbolsReleased called with MR: 0 No PtrSet found for MR: 0 isInnerPointerAccessFunction(Call) = true isa<CXXMemberCall>(Call) = true

Conclusion Call object does not have a getCXXThisVal method. The value of this pointer needs to be obtained from the CXXMemberCall object. // Check the specific type of Call if (llvm::isa<SimpleFunctionCall>(Call)) { std::cout << "isa<SimpleFunctionCall>(Call) = true" << "\n"; // It has been determined that this is not running ObjRegion = dyn_cast_or_null<TypedValueRegion>(Call.getArgSVal(0).getAsRegion()); } else if (const auto *MemberCall = llvm::dyn_cast<CXXMemberCall>(&Call)) { std::cout << "isa<CXXMemberCall>(Call) = true" << "\n"; ObjRegion = dyn_cast_or_null<TypedValueRegion>(MemberCall->getCXXThisVal().getAsRegion()); // It has been determined that this is a CXXMemberCall, not a SimpleFunctionCall } else if (llvm::isa<CXXConstructorCall>(Call)) { std::cout << "isa<CXXConstructorCall>(Call) = true" << "\n"; } else if (llvm::isa<CXXDestructorCall>(Call)) { std::cout << "isa<CXXDestructorCall>(Call) = true" << "\n"; } else { std::cout << "Call is of unknown type" << "\n"; } if (!ObjRegion) return;

Output isInnerPointerAccessFunction(Call) = true isa<CXXMemberCall>(Call) = true MemberCall->getCXXThisVal() = &SymRegion{conj_$13{basic_string<char> &, LC2, S1185017, #1}}ThisVal.getAsRegion() = 0x7e96d48

MemberCall->getCXXThisVal() returned a SymRegion, and ThisVal.getAsRegion() returned a valid memory region address 0x7e96d48. This indicates that the value returned by getCXXThisVal() is valid. However, dyn_cast_or_null(ThisRegion) may have failed because ThisRegion is not of TypedValueRegion type. Further check the actual type of ThisRegion and ensure that it can be correctly converted.

isInnerPointerAccessFunction(Call) = true isa(Call) = true MemberCall->getCXXThisVal() = &SymRegion{conj_$13{basic_string &, LC2, S1185017, #1}}ThisVal.getAsRegion() = 0x80fe488 ThisRegion is a SymbolicRegion ObjRegion is null

This Region is a SymbolicRegion, not a TypedValueRegion. This indicates that further processing of SymbolicRegion is needed to obtain the required TypedValueRegion.

In the checkPostCall method, when ThisRegion is a SymbolicRegion, obtain its BaseRegion and try to convert it to TypedValueRegion. After experimentation: error propagation - it has been determined that this is a SymbolicRegion, not a TypedValueRegion, and it is not possible to obtain TypedValueRegion from it. @NagyDonat

llvmbot commented 3 weeks ago

@llvm/issue-subscribers-clang-static-analyzer

Author: zuoxueying (zxyzxy1111)

Question: clang-tidy issue: It is known that the following code snippet will cause a WRAPPER_ESCAPE problem. ```cpp const char *single_quote(const char *input) { std::string s = std::string(""); return s.append("'").append(input).append("'").c_str(); } ``` 1. Found that the checkPostCall function can detect append and call the markPtrSymbolsReleased function, but after entering markPtrSymbolsReleased, std::cout<<"Marking symbol as released: " << Symbol << "\n"; does not output when the program is running, this line has not been executed, indicating that the problem may be in if (const PtrSet *PS = State->get<RawPtrMap>(MR)) { const Expr *Origin = Call.getOriginExpr(); So: The problem may be that the PtrSet returned by State->get<RawPtrMap>(MR) is empty, or that MR is not correctly mapped to RawPtrMap. Further check the state of RawPtrMap and the value of MR. 2. Check the contents of RawPtrMap: Add debugging information in markPtrSymbolsReleased to print the contents of RawPtrMap and ensure that MR is correctly mapped. Check the value of MR: Print the value of MR in checkPostCall and markPtrSymbolsReleased to ensure that they are consistent. 1. Add debugging information in markPtrSymbolsReleased: Print the contents of RawPtrMap and the value of MR. void InnerPointerChecker::markPtrSymbolsReleased(const CallEvent &Call, ProgramStateRef State, const MemRegion *MR, CheckerContext &C) const { std::cout << "markPtrSymbolsReleased called with MR: " << MR << "\n"; const RawPtrMapTy &RPM = State->get<RawPtrMap>(); for (const auto &Entry : RPM) { std::cout << "RawPtrMap entry: " << Entry.first << "\n"; } if (const PtrSet *PS = State->get<RawPtrMap>(MR)) { const Expr *Origin = Call.getOriginExpr(); for (const auto Symbol : *PS) { std::cout << "Marking symbol as released: " << Symbol << "\n"; State = allocation_state::markReleased(State, Symbol, Origin); } State = State->remove<RawPtrMap>(MR); C.addTransition(State); return; } else { std::cout << "No PtrSet found for MR: " << MR << "\n"; } } 2. Add debugging information in checkPostCall: Print the value of ObjRegion. void InnerPointerChecker::checkPostCall(const CallEvent &Call, CheckerContext &C) const { ProgramStateRef State = C.getState(); const TypedValueRegion *ObjRegion = nullptr; if (const auto *ICall = dyn_cast<CXXInstanceCall>(&Call)) { ObjRegion = dyn_cast_or_null<TypedValueRegion>( ICall->getCXXThisVal().getAsRegion()); if (isInvalidatingMemberFunction(Call)) { std::cout << "call markPtrSymbolsReleased with ObjRegion: " << ObjRegion << "\n"; markPtrSymbolsReleased(Call, State, ObjRegion, C); return; } } if (isInnerPointerAccessFunction(Call)) { if (isa<SimpleFunctionCall>(Call)) { ObjRegion = dyn_cast_or_null<TypedValueRegion>(Call.getArgSVal(0).getAsRegion()); } if (!ObjRegion) return; SVal RawPtr = Call.getReturnValue(); if (SymbolRef Sym = RawPtr.getAsSymbol(/*IncludeBaseRegions=*/true)) { PtrSet::Factory &F = State->getStateManager().get_context<PtrSet>(); const PtrSet *SetPtr = State->get<RawPtrMap>(ObjRegion); PtrSet Set = SetPtr ? *SetPtr : F.getEmptySet(); assert(C.wasInlined || !Set.contains(Sym)); Set = F.add(Set, Sym); State = State->set<RawPtrMap>(ObjRegion, Set); C.addTransition(State); } return; } checkFunctionArguments(Call, State, C); } Output call markPtrSymbolsReleased with ObjRegion: 0x7631a60 call markPtrSymbolsReleased markPtrSymbolsReleased called with MR: 0x7631a60 No PtrSet found for MR: 0x7631a60 call markPtrSymbolsReleased with ObjRegion: 0x7630cb8 call markPtrSymbolsReleased markPtrSymbolsReleased called with MR: 0x7630cb8 No PtrSet found for MR: 0x7630cb8 call markPtrSymbolsReleased with ObjRegion: 0 call markPtrSymbolsReleased markPtrSymbolsReleased called with MR: 0 No PtrSet found for MR: 0 call markPtrSymbolsReleased with ObjRegion: 0 call mark markPtrSymbolsReleased markPtrSymbolsReleased function was correctly called, but the PtrSet returned by State->get<RawPtrMap>(MR) is always empty. This indicates that MR has not been correctly added to RawPtrMap. 3. Further debugging found that in the checkPostCall function, ObjRegion was not correctly added to RawPtrMap. In SymbolRef Sym = RawPtr.getAsSymbol(/*IncludeBaseRegions=*/true) returns false, which means that RawPtr is not a symbol. Further check the type and source of RawPtr to determine why it is not a symbol. 4. The problem may be in if (isInnerPointerAccessFunction(Call)) { std::cout << "isInnerPointerAccessFunction(Call) = true" << "\n"; // It has been determined that this is running if (isa<SimpleFunctionCall>(Call)) { std::cout << "isa<SimpleFunctionCall>(Call) = true" << "\n"; // It has been determined that this is not running ObjRegion = dyn_cast_or_null<TypedValueRegion>(Call.getArgSVal(0).getAsRegion()); } where if (isa<SimpleFunctionCall>(Call))=false isa<SimpleFunctionCall>(Call) returns false, need to further check the type of Call to determine why it is not a SimpleFunctionCall type. 5. Check the specific type of Call: Use llvm::isa and llvm::dyn_cast to check the specific type of Call and print relevant information. `Output release-install/bin/wecheck test2.cpp -- -g markPtrSymbolsReleased called with MR: 0x7df7710 No PtrSet found for MR: 0x7df7710 markPtrSymbolsReleased called with MR: 0x7df6968 No PtrSet found for MR: 0x7df6968 markPtrSymbolsReleased called with MR: 0 No PtrSet found for MR: 0 markPtrSymbolsReleased called with MR: 0 No PtrSet found for MR: 0 isInnerPointerAccessFunction(Call) = true isa<CXXMemberCall>(Call) = true` Conclusion Call object does not have a getCXXThisVal method. The value of this pointer needs to be obtained from the CXXMemberCall object. `// Check the specific type of Call if (llvm::isa<SimpleFunctionCall>(Call)) { std::cout << "isa<SimpleFunctionCall>(Call) = true" << "\n"; // It has been determined that this is not running ObjRegion = dyn_cast_or_null<TypedValueRegion>(Call.getArgSVal(0).getAsRegion()); } else if (const auto *MemberCall = llvm::dyn_cast<CXXMemberCall>(&Call)) { std::cout << "isa<CXXMemberCall>(Call) = true" << "\n"; ObjRegion = dyn_cast_or_null<TypedValueRegion>(MemberCall->getCXXThisVal().getAsRegion()); // It has been determined that this is a CXXMemberCall, not a SimpleFunctionCall } else if (llvm::isa<CXXConstructorCall>(Call)) { std::cout << "isa<CXXConstructorCall>(Call) = true" << "\n"; } else if (llvm::isa<CXXDestructorCall>(Call)) { std::cout << "isa<CXXDestructorCall>(Call) = true" << "\n"; } else { std::cout << "Call is of unknown type" << "\n"; } if (!ObjRegion) return;` Output `isInnerPointerAccessFunction(Call) = true isa<CXXMemberCall>(Call) = true MemberCall->getCXXThisVal() = &SymRegion{conj_$13{basic_string<char> &, LC2, S1185017, #1}}ThisVal.getAsRegion() = 0x7e96d48` MemberCall->getCXXThisVal() returned a SymRegion, and ThisVal.getAsRegion() returned a valid memory region address 0x7e96d48. This indicates that the value returned by getCXXThisVal() is valid. However, dyn_cast_or_null<TypedValueRegion>(ThisRegion) may have failed because ThisRegion is not of TypedValueRegion type. Further check the actual type of ThisRegion and ensure that it can be correctly converted. isInnerPointerAccessFunction(Call) = true isa<CXXMemberCall>(Call) = true MemberCall->getCXXThisVal() = &SymRegion{conj_$13{basic_string<char> &, LC2, S1185017, #1}}ThisVal.getAsRegion() = 0x80fe488 ThisRegion is a SymbolicRegion ObjRegion is null This Region is a SymbolicRegion, not a TypedValueRegion. This indicates that further processing of SymbolicRegion is needed to obtain the required TypedValueRegion. In the checkPostCall method, when ThisRegion is a SymbolicRegion, obtain its BaseRegion and try to convert it to TypedValueRegion. After experimentation: error propagation - it has been determined that this is a SymbolicRegion, not a TypedValueRegion, and it is not possible to obtain TypedValueRegion from it. @NagyDonat
NagyDonat commented 3 weeks ago

Thanks for reporting this issue!

I replaced the clang-tidy label with "clang:static analyzer" because the clang-tidy check clang-analyzer-cplusplus-InnerPointer is a wrapper around a feature of the clang static analyzer.

I'm pretty sure that in the example code

const char *single_quote(const char *input) {
    std::string s = std::string("");
    return s.append("'").append(input).append("'").c_str();
}

the analyzer fails to produce a warning because it does not "know" that std::string::append always returns *this.

In general, when the clang static analyzer encounters a function call (where C++ methods are also included in "function"s), three things may happen:

  1. For some well-known and important functions there are checkers which recognize them (with evalCall callbacks) and update the State to reflect the effects of the function call.
  2. If the function is not recognized, but its source code is available within the same translation unit (and it isn't too large), then we "inline it" and jump to analyzing its source code. (There are a few exceptions, e.g. there are parts of the STL that are never inlined, because they are heavily optimized, so they would confuse the analyzer and cause false positives.)
  3. If the function is not recognized and not inlined, then we do a "conservative evaluation":
    • we discard knowledge which might have been invalidated by the call (e.g. when we call an unknown opaque function, we invalidate our knowledge about the globals because the function might have overwritten any of them);
    • we create a so called conjured symbol which represents "the unknown value returned by this particular function call".

In your example code the method std::string::append() is not recognized and not inlined, so it will be handled by a conservative evaluation and its result will be a completely unknown std::string lvalue (which is represented by the SymbolicRegion that you observed).

The analyzer does not report that the c_str() of this unknown std::string object is returned from single_quote(), because it cannot prove that this is an error: if the object is unknown, then its lifetime might be long enough to survive after single_quote().

To fix this deficiency, we'd need to introduce an evalCall callback which recognizes std::basic_string::append and updates the analyzer state to model that this method always returns *this.

This is a fairly straightforward task, but we'd probably need to think about technical questions like "which checker should handle this callback?" (In theory we have StringChecker.cpp which claims that it handles the modeling of C++ std::string methods, but I'm not familiar with it and it only contains code for modeling some constructor calls, so it's definitely unfinished.)

I (or somebody else) will probably start working on this soon.

NagyDonat commented 3 weeks ago

I checked StringChecker.cpp and despite its very general name and description, it's a trivial stub that only checks one very particular precondition (that the char* passed to the constructor of std::string may not be a nullpointer).

@haoNoQ @steakhal @Xazax-hun (or anybody else with an opinion): What do you think, what would be a good place for evalCall callbacks for std::string::append() (and similar string methods)? Should I just drop them in InnerPointerChecker?

zxyzxy1111 commented 3 weeks ago

Thank you for your detailed response. Recently, through my analysis, I found that there are several missed reports in innerpointerchecker.cpp, such as:

Example 1: image

MachineStorage is a local variable, defined inside the code block of if (const MDNode *N = dyn_cast(MD)). When the code block ends, the lifetime of MachineStorage also ends, and the local variable is destroyed. Under the condition of if (!WriterCtx.Machine), MachineStorage is assigned a new SlotTracker object, and its pointer is assigned to WriterCtx.Machine. WriterCtx.Machine points to the SlotTracker object managed by MachineStorage. When the WriteAsOperandInternal function returns, the lifetime of MachineStorage ends, and the SlotTracker object managed by MachineStorage is destroyed. Since WriterCtx.Machine points to the object managed by MachineStorage, after MachineStorage is destroyed, WriterCtx.Machine points to an invalid memory address.

Example 2: image

Line number: 204 Loop jump back: Jumping back to the beginning of the loop.

Line number: 205 Loop start: Jumped back to the beginning of the loop.

Line number: 185 Condition check: Condition "it != e", taking true branch.

Line number: 185 Condition check: Condition "currentAdaptor", taking false branch.

Line number: 187 Else branch: Reached else branch.

Line number: 198 Main error: Using invalidated internal representation of "it". (In the else branch, the code attempts to use the invalidated it, which leads to a use after free error.)

passes is a container of type std::vector<std::unique_ptr>. This means that each element in passes is a std::unique_ptr. it is an iterator pointing to elements in the passes container. Therefore, the type of it is std::vector<std::unique_ptr>::iterator. it->get() returns the raw pointer managed by std::unique_ptr, i.e., Pass*. When it->reset() is called, the element pointed to by it is set to empty, and it becomes a dangling pointer.

I think this issue leading to a dangling pointer should be a WRAPPER_ESCAPE problem, but the analyzer didn't report it. Do you think there is a problem with my analysis? If there is no problem, where should we make modifications?

zxyzxy1111 commented 3 weeks ago

@NagyDonat

zxyzxy1111 commented 3 weeks ago

Thank you once again for your thorough and patient response. Your method has proven to be very effective. Below is the eval function I added. Could you please take a look and see if it is correct?

image

After adding an eval function, the checker can correctly report errors. However, After my analysis,maybe there are still many other dangling pointers that cause the checker to miss some errors, leading to false negatives. Maybe we need to thoroughly resolve this 。 I look forward to your feedback and suggestions. @NagyDonat

zxyzxy1111 commented 3 weeks ago
class TestClass {
public:
    TestClass(unsigned int width, unsigned int height)
        : _width(width), _height(height) {
        _rowPtrs.resize(1, std::vector<const char*>(height, nullptr));
    }

    void process() {
        int tmpHalfBufferElements = _width * _height;
        std::vector<unsigned short> tmpHalfBuffer(tmpHalfBufferElements);

        char* tmpHalfBufferPtr = (char*)&tmpHalfBuffer[0];

        for (int y = 0; y < _height; ++y) {
            _rowPtrs[0][y] = (const char*)tmpHalfBufferPtr;
            tmpHalfBufferPtr += _width * sizeof(unsigned short);
        }
    }

    void printRowPtrs() const {
        for (const auto& ptr : _rowPtrs[0]) {
            std::cout << (void*)ptr << " ";
        }
        std::cout << std::endl;
    }

private:
    unsigned int _width;
    unsigned int _height;
    std::vector<std::vector<const char*>> _rowPtrs;
};

int main() {
    unsigned int width = 2;
    unsigned int height = 2;

    TestClass test(width, height);
    test.process();
    test.printRowPtrs();

    std::cout << "After process function exits:" << std::endl;
    test.printRowPtrs();

    return 0;
}

ANALYZED BY GDB

Breakpoint 1, TestClass::process (this=0x7fffffffcc00) at test1-o.cpp:12
12              int tmpHalfBufferElements = _width * _height;
(gdb) next
13              std::vector<unsigned short> tmpHalfBuffer(tmpHalfBufferElements);
(gdb) next
15              char* tmpHalfBufferPtr = (char*)&tmpHalfBuffer[0];
(gdb) n
17              for (int y = 0; y < _height; ++y) {
(gdb) n
18                  _rowPtrs[0][y] = (const char*)tmpHalfBufferPtr;
(gdb) n
19                  tmpHalfBufferPtr += _width * sizeof(unsigned short);
(gdb) n
17              for (int y = 0; y < _height; ++y) {
(gdb) n
18                  _rowPtrs[0][y] = (const char*)tmpHalfBufferPtr;
(gdb) n
19                  tmpHalfBufferPtr += _width * sizeof(unsigned short);
(gdb) n
17              for (int y = 0; y < _height; ++y) {
(gdb) n
13              std::vector<unsigned short> tmpHalfBuffer(tmpHalfBufferElements);
(gdb) n
21          }
(gdb) n
main () at test1-o.cpp:42
42          test.printRowPtrs();
(gdb) n
0x41beb0 0x41beb4 
45          std::cout << "After process function exits:" << std::endl;
(gdb) n
After process function exits:
46          test.printRowPtrs();
(gdb) n
0x41beb0 0x41beb4 
48          return 0;
(gdb) n
40          TestClass test(width, height);
(gdb) n
49      }
(gdb) n
0x00007ffff74bdd85 in __libc_start_main () from /lib64/libc.so.6
(gdb) n
Single stepping until exit from function __libc_start_main,
which has no line number information.  

From the debugging session, it can be seen that _rowPtrs is correctly set to the address of tmpHalfBuffer during the execution of the process function. However, after the process function exits, tmpHalfBuffer is destroyed, causing _rowPtrs to still reference the memory area that has been destroyed.

In the main function, the first time printRowPtrs is called, the content of _rowPtrs is a valid address (for example, 0x41beb0 and 0x41beb4). But after the process function exits, tmpHalfBuffer is destroyed, causing _rowPtrs to still reference the memory area that has been destroyed. However, the WRAPPER_ESCAPE issue is not reported.

NagyDonat commented 1 week ago

This InnerPointerChecker::evalCall looks good at first glance, if you create a pull request containing it, we would be happy to review and then merge it. Patches that improve the analyzer are always [1] welcome :smile:

[1] if they follow the coding standards and don't introduce too many false positives


After my analysis,maybe there are still many other dangling pointers that cause the checker to miss some errors, leading to false negatives. Maybe we need to thoroughly resolve this.

The static analyzer does not try to report all errors, because it will never have a complete understanding of the analyzed source files (and there are simply too many language elements, library functions, design patterns, bug types etc.). Instead, we have two more modest goals:

(Disclaimer: this is just my personal impression, not an official goal statement.)

This means that from our point of view, false positives are bugs in the analyzer, but false negatives are only equivalent to feature requests -- it is nice to turn a false negative into a true positive, but we don't need to do it. There are lots of false negatives, so we are only working on those that are easy to resolve and/or especially common and important for the users.

Although the false negatives presented in this ticket are all connected to InnerPointerChecker, it seems that they all have different root causes, so each of them would require individual solutions and there is no reason to handle them as a single project. (The checker which creates the bug report is just the final step after lots of modeling steps where the analyzer engine and the various checkers try to gather and update information about the execution state. Many false positives or negatives are caused by limitations of those earlier steps.)

steakhal commented 1 week ago

@haoNoQ @steakhal @Xazax-hun (or anybody else with an opinion): What do you think, what would be a good place for evalCall callbacks for std::string::append() (and similar string methods)? Should I just drop them in InnerPointerChecker?

To me, it would make sense to have the evalcall for append in the StringChecker - at least that was my idea at the time I introduced it.