llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.43k stars 12.16k forks source link

taint is not propagating across translation units with clangsa #57697

Closed tomrittervg closed 1 year ago

tomrittervg commented 2 years ago

I am running clang static analyzer via CodeChecker with CTU analysis on Firefox (so it's a big project.) My analyzer was not producing the expected result and after narrowing down the problem I eventually pinpointed it as taint (which is present in Translation Unit 1) is not propagating into Translation Unit 2.

I am using clang version 16.0.0 (https://github.com/llvm/llvm-project.git 10a7ee0bac211810376f29a879a9f73ed2ab15fc)

My narrowing efforts will output whether the variable in an if statement is tainted. A custom config I have will apply taint to the second parameter of ReadPrivilegedParam.

In PDNSRequestParent.cpp

            if ((!(IPC::ReadPrivilegedParam((&(reader__)), (&(port)))))) {
                FatalError("Error deserializing 'int32_t'");
                return MsgValueError;
            }
            if (port == 50) { // Tainted
                printf("Hi");
            }
            int newport = std::move(port);
            if (newport == 50) { // Tainted
                printf("Hi");
            }
            // ...
            mozilla::ipc::IPCResult __ok = (static_cast<DNSRequestParent*>(this))->RecvCancelDNSRequest(std::move(hostName), std::move(trrServer), std::move(newport), std::move(type), std::move(originAttributes), std::move(flags), std::move(reason));

In DNSRequestParent.cpp

mozilla::ipc::IPCResult DNSRequestParent::RecvCancelDNSRequest(
    const nsCString& hostName, const nsCString& trrServer, const int32_t& port,
    const uint16_t& type, const OriginAttributes& originAttributes,
    const uint32_t& flags, const nsresult& reason) {
  if (port == 60) { // Not tainted
    printf("hi");
  }
  mDNSRequest->OnRecvCancelDNSRequest(hostName, trrServer, port, type,
                                      originAttributes, flags, reason);
  return IPC_OK();
}

Here is the command to generate the ast for PDNSRequestParent:

/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-16 -c -x c++ --sysroot /home/tom/.mozbuild/sysroot-x86_64-linux-gnu -std=gnu++17 -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/stl_wrappers -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/system_wrappers -include /home/tom/Documents/moz/static-analysis/mozilla-unified/config/gcc_hidden.h -DDEBUG=1 -DOS_POSIX=1 -DOS_LINUX=1 -DMOZ_HAS_MOZGLUE -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL -DSTATIC_EXPORTABLE_JS_API -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/dns -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/netwerk/dns -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/ipc/ipdl/_ipdlheaders -I/home/tom/Documents/moz/static-analysis/mozilla-unified/ipc/chromium/src -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/base -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/ipc -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/protocol/http -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/include -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/include/nspr -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/include/nss -DMOZILLA_CLIENT -include /home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/mozilla-config.h -Qunused-arguments -Qunused-arguments -Wall -Wbitfield-enum-conversion -Wdeprecated-this-capture -Wempty-body -Wformat-type-confusion -Wignored-qualifiers -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wno-error=tautological-type-limit-compare -Wunreachable-code -Wunreachable-code-return -Wunused-but-set-parameter -Wno-invalid-offsetof -Wclass-varargs -Wempty-init-stmt -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wno-range-loop-analysis -Wc++2a-compat -Wenum-compare-conditional -Wno-ambiguous-reversed-operator -Wno-error=deprecated -Wno-error=deprecated-anon-enum-enum-conversion -Wno-error=deprecated-enum-enum-conversion -Wno-error=deprecated-enum-float-conversion -Wno-error=deprecated-pragma -Wno-error=deprecated-this-capture -Wno-error=deprecated-volatile -Wcomma -Wimplicit-fallthrough -Wstring-conversion -Wno-inline-new-delete -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=backend-plugin -Wno-error=free-nonheap-object -Wno-error=return-std-move -Wno-error=atomic-alignment -Wno-error=deprecated-copy -Wno-error=deprecated-builtins -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -Wno-psabi -Wthread-safety -Wno-unknown-warning-option -fno-sized-deallocation -fno-aligned-new -fno-exceptions -fPIC -fno-rtti -fno-exceptions -fno-math-errno -pthread -pipe -gdwarf-4 -Xclang -load -Xclang /home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/build/clang-plugin/libclang-plugin.so -Os -fno-omit-frame-pointer -funwind-tables -fno-strict-aliasing -ffp-contract=off -ferror-limit=0 /home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/ipc/ipdl/PDNSRequestParent.cpp -emit-ast -D__clang_analyzer__ -w -o /media/tom/b032d6f1-a2e4-4c59-8c52-4cb0c18f69ce/ctu-analysis-22/reports/ctu-dir/x86_64/ast/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/ipc/ipdl/PDNSRequestParent.cpp.ast

And DNSRequestParent:

/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-16 -c -x c++ --sysroot /home/tom/.mozbuild/sysroot-x86_64-linux-gnu -std=gnu++17 -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/stl_wrappers -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/system_wrappers -include /home/tom/Documents/moz/static-analysis/mozilla-unified/config/gcc_hidden.h -DDEBUG=1 -DOS_POSIX=1 -DOS_LINUX=1 -DMOZ_HAS_MOZGLUE -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL -DSTATIC_EXPORTABLE_JS_API -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/dns -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/netwerk/dns -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/ipc/ipdl/_ipdlheaders -I/home/tom/Documents/moz/static-analysis/mozilla-unified/ipc/chromium/src -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/base -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/ipc -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/protocol/http -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/include -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/include/nspr -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/include/nss -DMOZILLA_CLIENT -include /home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/mozilla-config.h -Qunused-arguments -Qunused-arguments -Wall -Wbitfield-enum-conversion -Wdeprecated-this-capture -Wempty-body -Wformat-type-confusion -Wignored-qualifiers -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wno-error=tautological-type-limit-compare -Wunreachable-code -Wunreachable-code-return -Wunused-but-set-parameter -Wno-invalid-offsetof -Wclass-varargs -Wempty-init-stmt -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wno-range-loop-analysis -Wc++2a-compat -Wenum-compare-conditional -Wno-ambiguous-reversed-operator -Wno-error=deprecated -Wno-error=deprecated-anon-enum-enum-conversion -Wno-error=deprecated-enum-enum-conversion -Wno-error=deprecated-enum-float-conversion -Wno-error=deprecated-pragma -Wno-error=deprecated-this-capture -Wno-error=deprecated-volatile -Wcomma -Wimplicit-fallthrough -Wstring-conversion -Wno-inline-new-delete -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=backend-plugin -Wno-error=free-nonheap-object -Wno-error=return-std-move -Wno-error=atomic-alignment -Wno-error=deprecated-copy -Wno-error=deprecated-builtins -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -Wno-psabi -Wthread-safety -Wno-unknown-warning-option -fno-sized-deallocation -fno-aligned-new -fno-exceptions -fPIC -fno-rtti -fno-exceptions -fno-math-errno -pthread -pipe -gdwarf-4 -Xclang -load -Xclang /home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/build/clang-plugin/libclang-plugin.so -Os -fno-omit-frame-pointer -funwind-tables -fno-strict-aliasing -ffp-contract=off -ferror-limit=0 /home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/dns/DNSRequestParent.cpp -emit-ast -D__clang_analyzer__ -w -o /media/tom/b032d6f1-a2e4-4c59-8c52-4cb0c18f69ce/ctu-analysis-22/reports/ctu-dir/x86_64/ast/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/dns/DNSRequestParent.cpp.ast

They both use -emit-ast. The resulting file I cannot figure out how to inspect. (I've uploaded them here though.)

If I do -emit-llvm and the llvm-dis the resulting .bc file, I can produce a file with human-readable LLVM IR though, and I've also uploaded those. However I don't really know how to read them to figure out if anything in those produced files is wrong.

I'd like to keep debugging this myself but I am at a lost for how to move forward in analyzing the IR or AST. An alternate approach would be to start adding printfs to Taint.cpp or GenericTaintChecker.cpp and see if I can watch the taint not propagate and see if that sheds any light on things.

The command I use to analyze DNSRequestParent.cpp (and that should show port as tainted, but does not) is

/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-16 --analyze -Qunused-arguments -Xclang -load -Xclang /home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/build/clang-plugin/libclang-plugin.so -Xclang -analyzer-opt-analyze-headers -Xclang -analyzer-output=plist-multi-file -o /media/tom/b032d6f1-a2e4-4c59-8c52-4cb0c18f69ce/ctu-analysis-22/reports/DNSRequestParent.cpp_clangsa_a31416b9aa8ea1be5595e4506920c5e8.plist -Xclang -analyzer-config -Xclang expand-macros=true -Xclang -analyzer-config -Xclang alpha.security.taint.TaintPropagation:Config=/media/tom/b032d6f1-a2e4-4c59-8c52-4cb0c18f69ce/ctu-analysis-21/myconfig.yaml -Xclang -analyzer-checker=alpha.security.cert.env.InvalidPtr,alpha.security.cert.pos.34c,alpha.security.taint.TaintPropagation,core.CallAndMessage,core.DivideZero,core.NonNullParamChecker,core.NullDereference,core.StackAddressEscape,core.UndefinedBinaryOperatorResult,core.VLASize,core.uninitialized.ArraySubscript,core.uninitialized.Assign,core.uninitialized.Branch,core.uninitialized.CapturedBlockVariable,core.uninitialized.UndefReturn,cplusplus.InnerPointer,cplusplus.Move,cplusplus.NewDelete,cplusplus.NewDeleteLeaks,cplusplus.PlacementNew,cplusplus.PureVirtualCall,cplusplus.StringChecker,deadcode.DeadStores,example.EnumIfElse,fuchsia.HandleChecker,nullability.NullPassedToNonnull,nullability.NullReturnedFromNonnull,nullability.NullableDereferenced,nullability.NullablePassedToNonnull,nullability.NullableReturnedFromNonnull,optin.cplusplus.UninitializedObject,optin.cplusplus.VirtualCall,optin.mpi.MPI-Checker,optin.osx.OSObjectCStyleCast,optin.osx.cocoa.localizability.EmptyLocalizationContextChecker,optin.osx.cocoa.localizability.NonLocalizedStringChecker,optin.performance.GCDAntipattern,optin.performance.Padding,optin.portability.UnixAPI,osx.API,osx.MIG,osx.NumberObjectConversion,osx.OSObjectRetainCount,osx.ObjCProperty,osx.SecKeychainAPI,osx.cocoa.AtSync,osx.cocoa.AutoreleaseWrite,osx.cocoa.ClassRelease,osx.cocoa.Dealloc,osx.cocoa.IncompatibleMethodTypes,osx.cocoa.Loops,osx.cocoa.MissingSuperCall,osx.cocoa.NSAutoreleasePool,osx.cocoa.NSError,osx.cocoa.NilArg,osx.cocoa.NonNilReturnValue,osx.cocoa.ObjCGenerics,osx.cocoa.RetainCount,osx.cocoa.RunLoopAutoreleaseLeak,osx.cocoa.SelfInit,osx.cocoa.SuperDealloc,osx.cocoa.UnusedIvars,osx.cocoa.VariadicMethodTypes,osx.coreFoundation.CFError,osx.coreFoundation.CFNumber,osx.coreFoundation.CFRetainRelease,osx.coreFoundation.containers.OutOfBounds,osx.coreFoundation.containers.PointerSizedValues,security.FloatLoopCounter,security.insecureAPI.DeprecatedOrUnsafeBufferHandling,security.insecureAPI.UncheckedReturn,security.insecureAPI.bcmp,security.insecureAPI.bcopy,security.insecureAPI.bzero,security.insecureAPI.decodeValueOfObjCType,security.insecureAPI.getpw,security.insecureAPI.gets,security.insecureAPI.mkstemp,security.insecureAPI.mktemp,security.insecureAPI.rand,security.insecureAPI.strcpy,security.insecureAPI.vfork,unix.API,unix.Malloc,unix.MallocSizeof,unix.MismatchedDeallocator,unix.Vfork,unix.cstring.BadSizeArg,unix.cstring.NullArg,valist.CopyToSelf,valist.Uninitialized,valist.Unterminated -Xclang -analyzer-config -Xclang aggressive-binary-operation-simplification=true -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true -Xclang -analyzer-config -Xclang ctu-dir=/media/tom/b032d6f1-a2e4-4c59-8c52-4cb0c18f69ce/ctu-analysis-22/reports/ctu-dir/x86_64 -Xclang -analyzer-config -Xclang display-ctu-progress=true -x c++ --sysroot /home/tom/.mozbuild/sysroot-x86_64-linux-gnu -std=gnu++17 -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/stl_wrappers -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/system_wrappers -include /home/tom/Documents/moz/static-analysis/mozilla-unified/config/gcc_hidden.h -DDEBUG=1 -DOS_POSIX=1 -DOS_LINUX=1 -DMOZ_HAS_MOZGLUE -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL -DSTATIC_EXPORTABLE_JS_API -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/dns -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/netwerk/dns -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/ipc/ipdl/_ipdlheaders -I/home/tom/Documents/moz/static-analysis/mozilla-unified/ipc/chromium/src -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/base -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/ipc -I/home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/protocol/http -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/include -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/include/nspr -I/home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/dist/include/nss -DMOZILLA_CLIENT -include /home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/mozilla-config.h -Qunused-arguments -Qunused-arguments -Wall -Wbitfield-enum-conversion -Wdeprecated-this-capture -Wempty-body -Wformat-type-confusion -Wignored-qualifiers -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wno-error=tautological-type-limit-compare -Wunreachable-code -Wunreachable-code-return -Wunused-but-set-parameter -Wno-invalid-offsetof -Wclass-varargs -Wempty-init-stmt -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wno-range-loop-analysis -Wc++2a-compat -Wenum-compare-conditional -Wno-ambiguous-reversed-operator -Wno-error=deprecated -Wno-error=deprecated-anon-enum-enum-conversion -Wno-error=deprecated-enum-enum-conversion -Wno-error=deprecated-enum-float-conversion -Wno-error=deprecated-pragma -Wno-error=deprecated-this-capture -Wno-error=deprecated-volatile -Wcomma -Wimplicit-fallthrough -Wstring-conversion -Wno-inline-new-delete -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=backend-plugin -Wno-error=free-nonheap-object -Wno-error=return-std-move -Wno-error=atomic-alignment -Wno-error=deprecated-copy -Wno-error=deprecated-builtins -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -Wno-psabi -Wthread-safety -Wno-unknown-warning-option -fno-sized-deallocation -fno-aligned-new -fno-exceptions -fPIC -fno-rtti -fno-exceptions -fno-math-errno -pthread -pipe -gdwarf-4 -Xclang -load -Xclang /home/tom/Documents/moz/static-analysis/mozilla-unified/objdir/build/clang-plugin/libclang-plugin.so -Os -fno-omit-frame-pointer -funwind-tables -fno-strict-aliasing -ffp-contract=off -ferror-limit=0 /home/tom/Documents/moz/static-analysis/mozilla-unified/netwerk/dns/DNSRequestParent.cpp
llvmbot commented 2 years ago

@llvm/issue-subscribers-clang-static-analyzer

steakhal commented 2 years ago

Okay, thanks for the report. You did a really good job of investigating the issue yourself. Unfortunately, I don't have the time to do it myself, but I'm sort of familiar with the ASTImporter, which does the heavy lifting of parsing and merging the remote TU AST entities to the primary TU's AST. That being said, the fact that the AST of the remote function is transparent to the Clang Static Analyzer, hence to the taint flow as well. So, I'm not sure if this is gonna be the issue here.

Luckily, I have something to do with the taint analysis with CSA as well, but by looking at the example code it's not apparent to me how we lose taint.

Here is how I usually track down such issues: 1) Preprocess both TUs, then try that the preprocessed file is still affected and the issue is present. 2) Get familiar with creduce, we will use this to automatically reduce the preprocessed file to a minimal reproducer. 3) Create an //interestingness test// script. This usually looks something like this for CTU reductions:

#!/bin/bash
set -e

# generate ast for the remote TU, I usually append the -pedantic flag just to make sure the rescued code still does not have compilation errors...
/path/to/clang ----flags  -o remote.ast

# generate extdefmap for the definitions within the remote TU, such as function definitions which we can 'load' from the primary TU.
/path/to/extdefmap -----flags > externalDefMap.txt

# replace file path/extensions or whatever within that file ((you probably want relative path for the remote AST dump)
set ..... -i externalDefMap.txt

# invoke the analyzer with the prepared 'mapping' and also remember to use the current directory as the CTU dir, so the relative file mapping would find it.
/path/to/clang -----flags ----ctu-dir=. ----extdefmapping=mapping.txt 2>&1 | grep -Pz "my fancy output"

# let's make sure that the previous clang version had a different behavior!
! /path/to/other/clang ----same-flags---- 2>&1 | grep -Pz "my fancy output"

4) Run the reduction: creduce --n $(nproc) ./my-interestingness-test.sh remote.cpp (( for the primary TU reduction, you don't need to each time dump the remote TU ast, because that's not gonna change in that case, hence you could adjust the script accordingly. )) This will cut out all stuff that was not necessary for the observed behavior.


For inspiration about the CTU related flags, I recommend checking out some of the test files here: clang/test/Analysis/ctu-*.c* You can find invocations for the extdefmapping too by gripping for the %clang_extdef_map in the repo.

steakhal commented 2 years ago

One more thing. You can use the debug.ExprInspection checkers for investigating. Have a look at clang/test/Analysis/debug-exprinspection-istainted.c. (This is fine grained, if you are only interested in a single/handful of case(s)). Read more about them at https://clang.llvm.org/docs/analyzer/developer-docs/DebugChecks.html. Also check out the debug.TaintTest checker, which reports all lines where some tainted value was observed (this is quite verbose).

balazs-benics-sonarsource commented 2 years ago

For me, it seems to work. At least with this example:

/// Primary TU
void clang_analyzer_eval(int);
void clang_analyzer_isTainted(int);
int scanf(const char *format, ...);
int f(int); // defined in the other TU
void top() {
  int n;
  scanf("%d", &n);
  clang_analyzer_isTainted(n); // yes
  int v = f(n);
  clang_analyzer_isTainted(v); // yes
}
///----------------------------------

/// Other TU
int scanf(const char *format, ...);
void clang_analyzer_isTainted(int);
int f(int i) {
  clang_analyzer_isTainted(i); // yes
  int m;
  scanf("%d", &m);
  clang_analyzer_isTainted(m); // yes
  return m + i;
}

Given that besides enabling CTU, I also enable the alpha.security.taint and debug.ExprInspection checkers.

tomrittervg commented 2 years ago

Okay, I've minimized things pretty well.

DNSRequestParent.cpp

#include "DNSRequestParent.h"

void clang_analyzer_isTainted(int);

bool DNSRequestParent::RecvCancelDNSRequest(const int& port) {

  clang_analyzer_isTainted(port);

  if (port == 50) {
    fprintf(stderr, "Hi");
  }

  DNSResolverType  a = (DNSResolverType )port;
  if(a == DNSResolverType::Native) {
    printf("Native");
  } else if(a == DNSResolverType::TRR) {
    printf("TRR");
  }
  return true;

}

DNSRequestParent.h

#include "PDNSRequestParent.h"

enum class DNSResolverType : int { Native = 0, TRR, ODoH };

class DNSRequestParent : public PDNSRequestParent {
 public:
  friend class PDNSRequestParent;

 private:

  bool RecvCancelDNSRequest(
      const int& port);
};

PDNSRequestParent.cpp

#include "PDNSRequestParent.h"
#include "DNSRequestParent.h"

void clang_analyzer_isTainted(int);

void PDNSRequestParent::OnMessageReceived(int reader__)
{
            int port{};

            if ((!(IPC::ReadPrivilegedParam((&(reader__)), (&(port)))))) {
                return;
            }
            clang_analyzer_isTainted(port);
            if (port == 50) {
                printf("Hi");
            }
            int newport = std::move(port);
            clang_analyzer_isTainted(newport);
            if (newport == 50) {
                printf("Hi");
            }
            bool __ok = (static_cast<DNSRequestParent*>(this))->RecvCancelDNSRequest(std::move(newport));

}

PDNSRequestParent.h

#include "stdio.h"
#include <utility>

#ifndef PDNSRequestParent_h
#define PDNSRequestParent_h

namespace IPC {
    bool ReadPrivilegedParam(int* reader, int* value);
}

class PDNSRequestParent 
{
public:
    void OnMessageReceived(int reader__);

};

#endif // ifndef PDNSRequestParent_h

myconfig.yaml

Filters:

Propagations:
  - Name: ReadPrivilegedParam
    DstArgs: [1]

  - Name: move
    SrcArgs: [0]
    DstArgs: [-1]

Sinks:
  - Name: fprintf
    Args: [2]

Generate ast

/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-16  \
-c -x c++ --sysroot /home/tom/.mozbuild/sysroot-x86_64-linux-gnu -std=gnu++17 \
DNSRequestParent.cpp -emit-ast -D__clang_analyzer__ -w -o DNSRequestParent.cpp.ast

/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-16  \
-c -x c++ --sysroot /home/tom/.mozbuild/sysroot-x86_64-linux-gnu -std=gnu++17 \
PDNSRequestParent.cpp -emit-ast -D__clang_analyzer__ -w -o PDNSRequestParent.cpp.ast

Generate externalDefMap

/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-extdef-mapping -p . DNSRequestParent.cpp > 1
/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-extdef-mapping -p . PDNSRequestParent.cpp > 2
cat 1 2 > externalDefMap.txt
cat externalDefMap.txt

49:c:@S@DNSRequestParent@F@RecvCancelDNSRequest#&1I# /media/tom/b032d6f1-a2e4-4c59-8c52-4cb0c18f69ce/ctu-analysis-22/reports/cleanroom/DNSRequestParent.cpp
45:c:@S@PDNSRequestParent@F@OnMessageReceived#I# /media/tom/b032d6f1-a2e4-4c59-8c52-4cb0c18f69ce/ctu-analysis-22/reports/cleanroom/PDNSRequestParent.cpp

Analyze

/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-16 \
--analyze  \
-Xclang -analyzer-config \
-Xclang alpha.security.taint.TaintPropagation:Config=myconfig.yaml \
-Xclang -analyzer-checker=alpha.security.taint.TaintPropagation,debug.ExprInspection,debug.TaintTest \
-Xclang -analyzer-config \
-Xclang experimental-enable-naive-ctu-analysis=true \
-Xclang -analyzer-config \
-Xclang ctu-dir=. \
-fno-exceptions \
PDNSRequestParent.cpp

PDNSRequestParent.cpp:20:13: warning: YES [debug.ExprInspection]
            clang_analyzer_isTainted(port);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PDNSRequestParent.cpp:20:38: warning: tainted [debug.TaintTest]
            clang_analyzer_isTainted(port);
                                     ^~~~
PDNSRequestParent.cpp:21:17: warning: tainted [debug.TaintTest]
            if (port == 50) {
                ^~~~
PDNSRequestParent.cpp:21:22: warning: tainted [debug.TaintTest]
            if (port == 50) {
                ~~~~~^~~~~
PDNSRequestParent.cpp:24:27: warning: tainted [debug.TaintTest]
            int newport = std::move(port);
                          ^~~~~~~~~~~~~~~
PDNSRequestParent.cpp:25:13: warning: NO [debug.ExprInspection]
            clang_analyzer_isTainted(newport);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PDNSRequestParent.cpp:25:13: warning: YES [debug.ExprInspection]
            clang_analyzer_isTainted(newport);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PDNSRequestParent.cpp:25:38: warning: tainted [debug.TaintTest]
            clang_analyzer_isTainted(newport);
                                     ^~~~~~~
PDNSRequestParent.cpp:26:17: warning: tainted [debug.TaintTest]
            if (newport == 50) {
                ^~~~~~~
PDNSRequestParent.cpp:26:25: warning: tainted [debug.TaintTest]
            if (newport == 50) {
                ~~~~~~~~^~~~~
PDNSRequestParent.cpp:29:18: warning: Value stored to '__ok' during its initialization is never read [deadcode.DeadStores]
            bool __ok = (static_cast<DNSRequestParent*>(this))->RecvCancelDNSRequest(std::move(newport));
                 ^~~~   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
11 warnings generated.

Note in the output above that there is both a NO and a YES for line 25...

/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-16 \
--analyze \
-Xclang -analyzer-output=plist-multi-file \
-Xclang -analyzer-config \
-Xclang alpha.security.taint.TaintPropagation:Config=myconfig.yaml \
-Xclang -analyzer-checker=alpha.security.taint.TaintPropagation,debug.ExprInspection,debug.TaintTest  \
-Xclang -analyzer-config \
-Xclang experimental-enable-naive-ctu-analysis=true \
-Xclang -analyzer-config \
-Xclang ctu-dir=. \
-fno-exceptions \
DNSRequestParent.cpp

DNSRequestParent.cpp:13:3: warning: NO [debug.ExprInspection]
  clang_analyzer_isTainted(port);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.

I'm going to keep fiddling, but want to get this down on paper as a checkpoint.

tomrittervg commented 2 years ago

@balazs-benics-sonarsource

I'm doing something wrong because I actually cannot get your example to work:

export CC=/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-16
export EXT=/home/tom/Documents/moz/static-analysis/llvm-project/build/stage2/clang/bin/clang-extdef-mapping

#CC=clang-12
#EXT=clang-extdef-mapping-12

compiledb make

$CC  \
-c -x c++ --sysroot /home/tom/.mozbuild/sysroot-x86_64-linux-gnu -std=gnu++17 \
1.cpp -emit-ast -D__clang_analyzer__ -w -o 1.cpp.ast

$CC  \
-c -x c++ --sysroot /home/tom/.mozbuild/sysroot-x86_64-linux-gnu -std=gnu++17 \
2.cpp -emit-ast -D__clang_analyzer__ -w -o 2.cpp.ast

$EXT -p . 1.cpp > 1
$EXT -p . 2.cpp > 2
cat 1 2 > externalDefMap.txt
rm 1 2

sed -i'' 's/\.cpp$/\.cpp\.ast/g' externalDefMap.txt
sed -i -e "s|$(pwd)/||g" externalDefMap.txt

$CC \
--analyze \
-Xclang -analyzer-checker=alpha.security.taint,alpha.security.taint.TaintPropagation,debug.ExprInspection,debug.TaintTest  \
-Xclang -analyzer-config \
-Xclang experimental-enable-naive-ctu-analysis=true \
-Xclang -analyzer-config \
-Xclang ctu-dir=. \
1.cpp

$CC \
--analyze \
-Xclang -analyzer-checker=alpha.security.taint,alpha.security.taint.TaintPropagation,debug.ExprInspection,debug.TaintTest  \
-Xclang -analyzer-config \
-Xclang experimental-enable-naive-ctu-analysis=true \
-Xclang -analyzer-config \
-Xclang ctu-dir=. \
2.cpp

Where 1.cpp has top and 2.cpp has f. I get the following:

1.cpp:9:3: warning: YES [debug.ExprInspection]
  clang_analyzer_isTainted(n); // yes
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~
1.cpp:9:28: warning: tainted [debug.TaintTest]
  clang_analyzer_isTainted(n); // yes
                           ^
1.cpp:10:13: warning: tainted [debug.TaintTest]
  int v = f(n);
            ^
1.cpp:11:3: warning: NO [debug.ExprInspection]
  clang_analyzer_isTainted(v); // yes
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~
4 warnings generated.
2.cpp:4:3: warning: NO [debug.ExprInspection]
  clang_analyzer_isTainted(i); // yes
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~
2.cpp:7:3: warning: YES [debug.ExprInspection]
  clang_analyzer_isTainted(m); // yes
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~
2.cpp:7:28: warning: tainted [debug.TaintTest]
  clang_analyzer_isTainted(m); // yes
                           ^
2.cpp:8:10: warning: tainted [debug.TaintTest]
  return m + i;
         ^
2.cpp:8:12: warning: tainted [debug.TaintTest]
  return m + i;
         ~~^~~

The important one is clang_analyzer_isTainted(i); // yes which is not tainted.

steakhal commented 2 years ago

I actually modified a CTU test case in the test suite. Anyway, that's not that important. I suspect you are using CodeChecker or familiar with using it anyway. Here is an example that will log your example, and execute the analysis; fire up a CodeChecker server to which also stores the run. Then it will pop a firefox tab (I see you are from Mozilla ;) ) to see the reports. Look for the DNSRequestParent file and the report for line 8. There will be one YES. This proves that taint analysis works for CTU. taint-test.tar.gz

About the commands achieving this, well, look at the console output bc. I set the CodeChecker verbosity level to debug_analyzer which should reveal them.

Sorry that I'm writing only now, I'm really busy lately.

tomrittervg commented 2 years ago

So after some painful argument-by-argument testing and comparison I eventually figured out the crucial difference between your reproducer and mine.

When generating the ast files, mine used -std=gnu++17 and yours uses -std=gnu++14. If I change yours to 17, it stops working, and if I change mine to 14, it starts working. I don't know why this is, I'll have to dig into it next week. I don't know the subtleties of the difference between C++14 and C++17 and if there's something there that significantly changes things.

steakhal commented 2 years ago

Thats weird. It shouldnt affect this specific case. It would worth investigating the exploded graph. (The internal datastructure descibing how the abstract machine was simulated across different execution paths.) maybe @gamesh411 @dkrupp @whisperity are more interested in this as they are involved with CTU and taint analysis as well. Unfortunately, I dont think I can invest into this issue in the next months.

dkrupp commented 2 years ago

I will try to find out why using -std=gnu++17 makes a difference. @tomrittervg for debugging this.

whisperity commented 2 years ago

When generating the ast files, mine used -std=gnu++17 and yours uses -std=gnu++14. If I change yours to 17, it stops working, and if I change mine to 14, it starts working. I don't know why this is, I'll have to dig into it next week. I don't know the subtleties of the difference between C++14 and C++17 and if there's something there that significantly changes things.

There are a few things such as temporary materialisation rules that changed between C++14 and C++17, but I have compared the two set of flags on Godbolt and the relevant part (the call with the static-cast) seems completely identical.

In fact, if I render the diff, there are very few changes:

--- <gnu++14>
+++ <gnu++17>
@@ -36,8 +36,9 @@
 | |   | `-ReturnStmt <col:7, col:36>
 | |   |   `-CXXStaticCastExpr <col:14, col:36> '_Tp' xvalue static_cast<_Tp &&> <Dependent>
 | |   |     `-DeclRefExpr <col:33> 'typename std::remove_reference<_Tp>::type' lvalue ParmVar 0xPOINTER '__t' 'typename std::remove_reference<_Tp>::type &'
+| |   |-WarnUnusedResultAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/x86_64-linux-gnu/bits/c++config.h:141:31> nodiscard ""
 | |   |-BuiltinAttr Implicit 1185
-| |   |-NoThrowAttr <line:77:5> Implicit
+| |   |-NoThrowAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/bits/move.h:77:5> Implicit
 | |   `-ConstAttr <col:5> Implicit
 | |-FunctionTemplateDecl <line:86:3, line:94:5> line:89:5 forward
 | | |-TemplateTypeParmDecl <line:86:12, col:21> col:21 referenced typename depth 0 index 0 _Tp
@@ -52,8 +53,9 @@
 | |   | `-ReturnStmt <line:93:7, col:36>
 | |   |   `-CXXStaticCastExpr <col:14, col:36> '_Tp' xvalue static_cast<_Tp &&> <Dependent>
 | |   |     `-DeclRefExpr <col:33> 'typename std::remove_reference<_Tp>::type' lvalue ParmVar 0xPOINTER '__t' 'typename std::remove_reference<_Tp>::type &&'
+| |   |-WarnUnusedResultAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/x86_64-linux-gnu/bits/c++config.h:141:31> nodiscard ""
 | |   |-BuiltinAttr Implicit 1185
-| |   |-NoThrowAttr <line:89:5> Implicit
+| |   |-NoThrowAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/bits/move.h:89:5> Implicit
 | |   `-ConstAttr <col:5> Implicit
 | |-FunctionTemplateDecl <line:101:3, line:105:77> line:104:5 move
 | | |-TemplateTypeParmDecl <line:101:12, col:21> col:21 referenced typename depth 0 index 0 _Tp
@@ -63,21 +65,23 @@
 | | | | `-ReturnStmt <col:7, col:74>
 | | | |   `-CXXStaticCastExpr <col:14, col:74> 'typename std::remove_reference<_Tp>::type' xvalue static_cast<typename std::remove_reference<_Tp>::type &&> <Dependent>
 | | | |     `-DeclRefExpr <col:71> '_Tp' lvalue ParmVar 0xPOINTER '__t' '_Tp &&'
+| | | |-WarnUnusedResultAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/x86_64-linux-gnu/bits/c++config.h:141:31> nodiscard ""
 | | | |-BuiltinAttr Implicit 1186
-| | | |-NoThrowAttr <line:104:5> Implicit
+| | | |-NoThrowAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/bits/move.h:104:5> Implicit
 | | | `-ConstAttr <col:5> Implicit
 | | `-FunctionDecl <line:103:5, line:105:77> line:104:5 used constexpr move 'typename std::remove_reference<int &>::type &&(int &) noexcept' implicit-inline
 | |   |-TemplateArgument type 'int &'
 | |   | `-LValueReferenceType 'int &'
 | |   |   `-BuiltinType 'int'
 | |   |-ParmVarDecl <col:10, col:16> col:16 __t 'int &'
+| |   |-WarnUnusedResultAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/x86_64-linux-gnu/bits/c++config.h:141:31> nodiscard ""
 | |   |-BuiltinAttr Implicit 1186
-| |   |-NoThrowAttr <col:5> Implicit
+| |   |-NoThrowAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/bits/move.h:104:5> Implicit
 | |   `-ConstAttr <col:5> Implicit
 | |-ClassTemplateDecl <line:108:3, line:111:57> line:109:12 __move_if_noexcept_cond
 | | |-TemplateTypeParmDecl <line:108:12, col:21> col:21 referenced typename depth 0 index 0 _Tp
 | | `-CXXRecordDecl <line:109:5, line:111:57> line:109:12 struct __move_if_noexcept_cond definition
-| |   |-DefinitionData empty standard_layout trivially_copyable trivial literal has_constexpr_non_copy_move_ctor can_const_default_init
+| |   |-DefinitionData empty aggregate standard_layout trivially_copyable trivial literal has_constexpr_non_copy_move_ctor can_const_default_init
 | |   | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr
 | |   | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
 | |   | |-MoveConstructor exists simple trivial needs_implicit
@@ -95,20 +99,22 @@
 | |   |   `-CallExpr <col:14, col:27> '<dependent type>'
 | |   |     |-UnresolvedLookupExpr <col:14, col:19> '<overloaded function type>' lvalue (no ADL) = 'move' 0xPOINTER
 | |   |     `-DeclRefExpr <col:24> '_Tp' lvalue ParmVar 0xPOINTER '__x' '_Tp &'
+| |   |-WarnUnusedResultAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/x86_64-linux-gnu/bits/c++config.h:141:31> nodiscard ""
 | |   |-BuiltinAttr Implicit 1187
-| |   |-NoThrowAttr <line:125:5> Implicit
+| |   |-NoThrowAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/bits/move.h:125:5> Implicit
 | |   `-ConstAttr <col:5> Implicit
 | |-FunctionTemplateDecl <line:142:3, line:146:37> line:145:5 addressof
 | | |-TemplateTypeParmDecl <line:142:12, col:21> col:21 referenced typename depth 0 index 0 _Tp
-| | `-FunctionDecl <line:144:5, line:146:37> line:145:5 addressof '_Tp *(_Tp &) noexcept' inline
+| | `-FunctionDecl <line:144:5, line:146:37> line:145:5 constexpr addressof '_Tp *(_Tp &) noexcept' inline
 | |   |-ParmVarDecl <col:15, col:20> col:20 referenced __r '_Tp &'
 | |   |-CompoundStmt <line:146:5, col:37>
 | |   | `-ReturnStmt <col:7, col:34>
 | |   |   `-CallExpr <col:14, col:34> '<dependent type>'
 | |   |     |-UnresolvedLookupExpr <col:14, col:19> '<overloaded function type>' lvalue (no ADL) = '__addressof' 0xPOINTER
 | |   |     `-DeclRefExpr <col:31> '_Tp' lvalue ParmVar 0xPOINTER '__r' '_Tp &'
+| |   |-WarnUnusedResultAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/x86_64-linux-gnu/bits/c++config.h:141:31> nodiscard ""
 | |   |-BuiltinAttr Implicit 1182
-| |   |-NoThrowAttr <line:145:5> Implicit
+| |   |-NoThrowAttr </opt/compiler-explorer/gcc-12.2.0/lib/gcc/x86_64-linux-gnu/12.2.0/../../../../include/c++/12.2.0/bits/move.h:145:5> Implicit
 | |   `-ConstAttr <col:5> Implicit
 | |-FunctionTemplateDecl <line:150:3, line:151:37> col:16 addressof
 | | |-TemplateTypeParmDecl <line:150:12, col:21> col:21 referenced typename depth 0 index 0 _Tp
@@ -220,7 +226,7 @@
 | |-EnumConstantDecl <col:48> col:48 referenced TRR 'DNSResolverType'
 | `-EnumConstantDecl <col:53> col:53 ODoH 'DNSResolverType'
 |-CXXRecordDecl <line:21:1, line:26:1> line:21:7 referenced class DNSRequestParent definition
-| |-DefinitionData pass_in_registers empty standard_layout trivially_copyable trivial literal has_constexpr_non_copy_move_ctor can_const_default_init
+| |-DefinitionData pass_in_registers empty aggregate standard_layout trivially_copyable trivial literal has_constexpr_non_copy_move_ctor can_const_default_init
 | | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr
 | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
 | | |-MoveConstructor exists simple trivial needs_implicit
tomrittervg commented 1 year ago

@dkrupp Were you able to dig into this at all? I am not quite sure on how to investigate this myself, but if you don't have the bandwidth but could give me some suggestions for how to go about it I may be able to stumble through it.

tomrittervg commented 1 year ago

I re-ran this with clang 17 and it's possible the behavior switched (so -14 exhibits the broken behavior and -17 exhibits the correct behavior). That seems unlikely and probably user error, but I started my steps over and replicated it, so I'm going to try building clang from source on tip and seeing what the behavior is.

tomrittervg commented 1 year ago

Using a clang revision of 4 days ago, I was able to reproduce that -17 now works and -14 does not. https://github.com/llvm/llvm-project/commit/cad708b9a1ecbf5645706056bb7c4fc0ea4721b6

tomrittervg commented 1 year ago

I don't think this issue is useful anymore, whatever the bug I was encountering was fixed at some point. I'm still working on getting taint tracking working in Firefox, the latest bug I encountered I minified and filed in #62663