llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.6k stars 11.82k forks source link

[analyzer] Serious slowdown on FFmpeg sheervideo.c when using ArrayBoundV2 #89045

Closed steakhal closed 6 months ago

steakhal commented 6 months ago

When release testing on top of clang-18, I noticed some significant (many X) slowdown on certain projects, e.g. on FFmpeg. When investigated, the most impacted TU was sheervideo.c, where we had 33 seconds in the past, but now it takes around 1.23 hours to analyze. At that point, I'd count it as a hang.

After bisecting, the blamed change was #72107 Switch to PostStmt callbacks in ArrayBoundV2.

I used perf to get a glimps on where we spend our time, and here is how it looks: image

The OOBv2 checker checks array subscripts, and then it spends most of its time on traversing symbols to see if any of it is tainted.

I've also checked that prior to this change, the maximum Sym->computeComplexity() within getTaintedSymbolsImpl was significantly lower than after the change. After the change this maximal complexity was more around the threshold (35), 30-33.

I dumped the state when the getTaintedSymbolsImpl appeared 63 times in the call stack and the state dump itself, was huge. Several if not all lines individually taking up 26 Megabytes. Many of these lines encoded the history of some "hashing-like" computation.

I'm not there yet to make suggestions for a fix, but I anyways wanted to let you know. @NagyDonat

Here is the preprocessed reproducer: sheervideo.zip

Here is the command to run it:

./build/bin/clang -cc1 -analyze -analyzer-checker=core,alpha.security -analyze-function=decode_c82p -analyzer-display-progress preprocessed_sheervideo.c
llvmbot commented 6 months ago

@llvm/issue-subscribers-clang-static-analyzer

Author: Balazs Benics (steakhal)

When release testing on top of clang-18, I noticed some significant (many X) slowdown on certain projects, e.g. on FFmpeg. When investigated, the most impacted TU was `sheervideo.c`, where we had 33 seconds in the past, but now it takes around 1.23 **hours** to analyze. At that point, I'd count it as a hang. After bisecting, the blamed change was #72107 `Switch to PostStmt callbacks in ArrayBoundV2`. I used `perf` to get a glimps on where we spend our time, and here is how it looks: ![image](https://github.com/llvm/llvm-project/assets/6280485/293b7227-b0b0-47bc-9948-f42bea93d54d) The `OOBv2` checker checks array subscripts, and then it spends most of its time on traversing symbols to see if any of it is tainted. I've also checked that prior to this change, the maximum `Sym->computeComplexity()` within `getTaintedSymbolsImpl` was significantly lower than after the change. After the change this maximal complexity was more around the threshold (35), 30-33. I dumped the state when the `getTaintedSymbolsImpl` appeared 63 times in the call stack and the state dump itself, was huge. Several if not all lines individually taking up 26 Megabytes. Many of these lines encoded the history of some "hashing-like" computation. I'm not there yet to make suggestions for a fix, but I anyways wanted to let you know. @NagyDonat Here is the preprocessed reproducer: [sheervideo.zip](https://github.com/llvm/llvm-project/files/15009479/sheervideo.zip) Here is the command to run it: ``` ./build/bin/clang -cc1 -analyze -analyzer-checker=core,alpha.security -analyze-function=decode_c82p -analyzer-display-progress preprocessed_sheervideo.c ```
steakhal commented 6 months ago

I used this patch to find the maximum symbol complexity:

diff --git a/clang/lib/StaticAnalyzer/Checkers/Taint.cpp b/clang/lib/StaticAnalyzer/Checkers/Taint.cpp
@@ -258,6 +258,12 @@ std::vector<SymbolRef> taint::getTaintedSymbolsImpl(ProgramStateRef State,
   if (!Sym)
     return TaintedSymbols;

+  static unsigned maxComplexity = 0;
+  if (unsigned comp = Sym->computeComplexity(); comp > maxComplexity) {
+    maxComplexity = comp;
+    llvm::errs() << "New max complexity: " << comp << "\n";
+  }
+
   // Traverse all the symbols this symbol depends on to see if any are tainted.
   for (SymbolRef SubSym : Sym->symbols()) {
     if (!isa<SymbolData>(SubSym))
steakhal commented 6 months ago

If I skip taint checking in OOBv2, the running times improve. For the decode_c82p entry point, previously it took 1456 ms, but with skipping the taint checking it only takes 1706 ms. It would be still 17% slower than before, but considering that this is more of an edge case, that would be somewhat acceptable.

I applied this for disabling taint checking:

diff --git a/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp b/clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -401,7 +401,7 @@ void ArrayBoundCheckerV2::performCheck(const Expr *E, CheckerContext &C) const {
         reportOOB(C, ExceedsUpperBound, OOB_Exceeds, ByteOffset, Msgs, E);
         return;
       }
-      if (isTainted(State, ByteOffset)) {
+      if (false && isTainted(State, ByteOffset)) {
         // Both cases are possible, but the offset is tainted, so report.
         std::string RegName = getRegionName(Reg);
NagyDonat commented 6 months ago

Thanks for alerting me, I'll investigate this issue. I'm not too familiar with the tainted symbol algorithm, but I roughly recall that it has bad worst-case complexity, so I'm not very surprised that it blew up.

NagyDonat commented 6 months ago

I found some exponentially inefficient recursion in getTaintedSymbolsImpl(), which is clearly capable of producing the symptoms that you observed, especially on the code of sheervideo.c which is full of complex nested ElementRegion expressions. I cannot explain why this inefficiency would be activated/triggered by the commit which changes the callbacks of ArrayBoundV2, but I think that it's too "good fit" to be a coincidence (and it should be fixed even if your observations are caused by a different bug).

Could you check whether my PR https://github.com/llvm/llvm-project/pull/89606 resolves the performance issue that you observed?

steakhal commented 6 months ago

Thanks for looking into this issue. I really appreciate that!

If we accept how we check ElementRegions for taint, and check both Base-region and the Offset, then we could memoize the results to cut recursion.

If we say, everything works as intended, but ElementRegions should be considered tainted only if the Offset is tainted, then your PR seems valid to me.

If we ask ourselves, why was the maximal symbol complexity of the symbols passed to isTainted at most 4-5 prior the change, and suddenly 31-35 (hitting the max complexity), then I'm not sure if this is expected and intended. We might end up with other consequences because of these largely complex symbols in other places, and isTaint was just an example. Depending on this, we might cure a symptom and not the cause with the PR you propose or memoization.

I'd be interested in what is the value of having so complex symbols at times, and what is actually causing those huge symbols.

NagyDonat commented 6 months ago

If we say, everything works as intended, but ElementRegions should be considered tainted only if the Offset is tainted, then your PR seems valid to me.

I think you're misunderstanding my commit -- an ElementRegion with a tainted super-region will be considered tainted even after merging my commit (because the ElementRegion is a SubRegion and the branch for subregions still performs the check once).

I'd be interested in what is the value of having so complex symbols at times, and what is actually causing those huge symbols.

I agree that this would require separate investigation. My wild guess is that there is some normalization step that happens between the PostStmt and the Location callbacks...

steakhal commented 6 months ago

I had a look at the patch and that seems solid. That should land even if we decide to continue this investigation here. Landing that would close this issue, but we can always reopen it or split a different ticket if the findings justify so.

NagyDonat commented 6 months ago

I merged my commit and created a new issue for the symbol complexity part of this issue. I'll start working on that question, probably by examining the logic that happens between the PostStmt and Location callbacks.