cmu-sei / pharos

Automated static analysis tools for binary programs

Concurrency problems in fn2hash and other tools #267

Open Trass3r opened 7 months ago

Trass3r commented 7 months ago
ooanalyzer --rose-version
ROSE 0.11.145.18
0.00940s OPTI[INFO ]: OOAnalyzer version 1.0.
424.93947s OPTI[INFO ]: ROSE stock partitioning took 424.871 seconds.
425.11544s OPTI[INFO ]: Partitioned 454959 bytes, 128311 instructions, 28323 basic blocks, 159 data blocks and 1198 functions.
497.39959s OPTI[INFO ]: Function partitioning took 497.338 seconds.
629.64410s OPTI[INFO ]: Writing serialized data took 132.245 seconds.
630.82476s OPTI[INFO ]: Partitioned 529165 bytes, 146847 instructions, 33180 basic blocks, 1443 data blocks and 1635 functions.
633.08253s OOAN[WARN ]: Successor 0x00465EFC of 465EFA: add       cl, ch not found.
634.58115s APID[WARN ]: API database has no data for DLL: DDRAW
634.58130s OOAN[WARN ]: No stack delta information for: DDRAW.dll:DirectDrawCreate
634.58179s APID[WARN ]: API database has no data for DLL: DSOUND
634.58191s OOAN[WARN ]: No stack delta information for: DSOUND.dll:DirectSoundCreate
MATCHER parse error: line 1: syntax error at 
MATCHER parse error: line 1: syntax error at =SgAsmMemoryReferenceExpression
MATCHER parse error: line 1: syntax error at SgAsmMemoryReferenceExpression
ooanalyzer: /usr/include/boost/thread/pthread/mutex.hpp:56: boost::mutex::~mutex(): Assertion `!posix::pthread_mutex_destroy(&m)' failed.
ooanalyzer: /root/pharos/scripts/rose/src/util/Sawyer/Message.C:1277: size_t Sawyer::Message::Stream::decrementRefCount(): Assertion `nrefs_ > 0' failed.

After re-running it with the serialized data I get ERROR 1: Lexical error! : <=>

Trass3r commented 7 months ago

Similar results on another executable:

2473.32372s OPTI[INFO ]: Writing serialized data took 382.011 seconds.
2478.50494s OPTI[INFO ]: Partitioned 2527920 bytes, 739384 instructions, 139875 basic blocks, 9361 data blocks and 12233 functions.
2484.35401s OOAN[WARN ]: No fallthru edge for call at 0x00498CFE
2484.45160s OOAN[WARN ]: Successor 0x004E925A of 4E9255: add       fs:[ecx+0], ah not found.
2484.84855s OOAN[WARN ]: Successor 0x0065306C of 653064: imul      si, [eax+ebp*2+0x73], 0x6740 not found.
2484.84864s OOAN[WARN ]: Successor 0x00653074 of 65306D: imul      esp, [esi+0x66], 0x72657473 not found.
2484.84871s OOAN[WARN ]: Successor 0x0065307D of 65307C: outsd      not found.
2484.84877s OOAN[WARN ]: Successor 0x006530A2 of 65309F: movq      mm4, [esi] not found.
2484.84884s OOAN[WARN ]: Successor 0x006530EE of 6530EB: add       [eax], 0 not found.
2486.85903s OOAN[ERROR]: Unable to find fallthru edge for call at 0x00498CFE
2492.35198s APID[WARN ]: API database has no data for DLL: DDRAW
2492.35205s OOAN[WARN ]: No stack delta information for: DDRAW.dll:DirectDrawCreate
2492.35209s OOAN[WARN ]: No stack delta information for: DDRAW.dll:DirectDrawEnumerateA
2492.35224s APID[WARN ]: API database has no data for DLL: DINPUT
2492.35229s OOAN[WARN ]: No stack delta information for: DINPUT.dll:DirectInputCreateA
2492.35241s APID[WARN ]: API database has no data for DLL: DSOUND
2492.35246s OOAN[WARN ]: No stack delta information for: DSOUND.dll:1
MATCHER parse error: line 1: syntax error at SgAsmMemoryReferenceExpression
MATCHER parse error: line 1: syntax error at $EXP
MATCHER parse error: line 1: syntax error at 
ERROR 1: Lexical error! : <>
MATCHER parse error: line 1: syntax error at SgAsmMemoryReferenceExpression
ERROR 1: Lexical error! : <>
2508.74145s [INFO ]: Pharos main error: (boost::wrapexcept<boost::lock_error>) boost: mutex lock failed in pthread_mutex_lock: Invalid argument

sei-eschwartz commented 7 months ago

Well, this is a new one.

Are you using the docker container or your own build?

What command line are you using?

And does it happen on every executable, or just some?

sei-mwd commented 7 months ago

In libpharos/funcs.cpp:1216, we're using a ROSE AstMatching expression to match an SgNode expression. This is what is failing. My initial perusal of ROSE's AstMatching code finds that it matches Sg identifiers against the output of the compiler's std::type_info::name(), and it expects the very specific format that g++ has used over the years. (Per the standard, the result of std::type_info::name() is implementation-defined.)
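
To make the fragility concrete, here is a minimal sketch contrasting the two ways of naming a node. The SgNode and SgAsmMemoryReferenceExpression types below are tiny hypothetical stand-ins, not ROSE's real classes, and nameFromTypeid / nameFromVirtual are illustrative helper names. The digit-stripping mirrors what AstTerm::nodeTypeName does today. It only illustrates why depending on the mangled name is brittle; it is not a diagnosis of the failure.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <typeinfo>

// Hypothetical stand-ins for ROSE's node hierarchy; the real SgNode is far larger.
struct SgNode {
    virtual ~SgNode() = default;
    // Same idea as SgNode::class_name(): a name chosen by the library, not the compiler.
    virtual std::string class_name() const { return "SgNode"; }
};

struct SgAsmMemoryReferenceExpression : SgNode {
    std::string class_name() const override { return "SgAsmMemoryReferenceExpression"; }
};

// Fragile: assumes typeid(...).name() looks like "<length><identifier>".
// That matches g++'s mangling of simple class names, but the format is
// implementation-defined and other compilers return something else.
std::string nameFromTypeid(const SgNode& node) {
    std::string tid = typeid(node).name();
    assert(!tid.empty());  // sanity check only; the content is still compiler-specific
    std::size_t j = 0;
    while (j < tid.size() && std::isdigit(static_cast<unsigned char>(tid[j]))) ++j;
    return tid.substr(j);
}

// Portable: ask the node for its own name through a virtual call.
std::string nameFromVirtual(const SgNode& node) {
    return node.class_name();
}
```

Calling nameFromVirtual on a SgAsmMemoryReferenceExpression always yields the class name as written in the source, whereas the result of nameFromTypeid depends on which compiler built the code.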

@Trass3r , may I ask what compiler (type and version) you used to compile ROSE?

We should probably not rely on ROSE's AstMatching code in its current incarnation, since it is not robust with respect to compiler changes.

sei-mwd commented 7 months ago

This is an untested modification to ROSE that might solve the problem. I'd want to recreate the problem to test it before submitting it upstream.

2 files changed, 4 insertions(+), 14 deletions(-)
src/midend/astMatching/AstTerm.C        |  6 +-----
src/midend/astMatching/MatchOperation.C | 12 +++---------

modified   src/midend/astMatching/AstTerm.C
@@ -19,11 +19,7 @@ std::string AstTerm::nodeTypeName(SgNode* node) {
   if(node==0) {
     return "null";
   } else {
-    std::string tid=typeid(*node).name();
-    int j=0;
-    while(tid[j]>='0' && tid[j]<='9') j++;
-    tid=tid.substr(j,tid.size()-j);
-    return tid;
+    return node->class_name();
   }
 }

modified   src/midend/astMatching/MatchOperation.C
@@ -132,13 +132,7 @@ MatchOpVariableAssignment::performOperation(MatchStatus&  status, RoseAst::itera
   return true;
 }

-MatchOpCheckNode::MatchOpCheckNode(std::string nodename) {
-  // convert name to same format as typeid provides;
-  std::stringstream ss;
-  ss << nodename.size();
-  ss << nodename;
-  _nodename=ss.str();
-}
+MatchOpCheckNode::MatchOpCheckNode(std::string nodename) : _nodename{nodename} {}

 std::string
 MatchOpCheckNode::toString() {
@@ -154,7 +148,7 @@ MatchOpCheckNode::performOperation(MatchStatus&  status, RoseAst::iterator& i, S
   SgNode* node=*i;
   if(node!=0) {
     // determine type name of node
-    std::string nodeTypeName=typeid(*node).name();
+    std::string nodeTypeName= node->class_name();
     if(status.debug)
       std::cout << "(patternnode " << _nodename << ":" << nodeTypeName <<")";
     return nodeTypeName==_nodename;
@@ -182,7 +176,7 @@ MatchOpCheckNodeSet::performOperation(MatchStatus&  status, RoseAst::iterator& i
   SgNode* node=*i;
   if(node!=0) {
     // determine type name of node
-    std::string nodeTypeName=typeid(*node).name();
+    std::string nodeTypeName= node->class_name();
     if(status.debug)
       std::cout << "(" << _nodenameset << "," << nodeTypeName <<")";
     // TODO: check of all names of the nodenameset

Trass3r commented 7 months ago

I used the docker container. Quickly tested on 2 executables.

sei-eschwartz commented 7 months ago

I was able to replicate this on the docker container using ooanalyzer --threads=10 -F /tmp/facts --serialize /tmp/serialize ~/pharos/tests/ooex_vs2010/Lite/oo.exe. I was not able to replicate after removing --threads=10. This suggests to me there is a synchronization problem.

sei-eschwartz commented 7 months ago

Can also trigger via fn2hash --threads=20 ~/pharos/tests/ooex_vs2010/Lite/oo.exe

sei-mwd commented 7 months ago

Good catch. So we should throw a static mutex around the AstMatcher creation.
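
A rough sketch of what I mean, assuming the matcher's parser keeps global state that is not thread-safe. Everything in the fake namespace is a made-up stand-in for that parser (it is not ROSE's AstMatching), and parsePatternLocked, runWorkers and the pattern strings are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a parser with global, non-thread-safe state.
namespace fake {
    std::string g_input;  // shared input buffer
    int g_parses = 0;     // shared counter

    bool parsePattern(const std::string& pattern) {
        g_input = pattern;  // unsynchronized write: a data race without a lock
        ++g_parses;
        return g_input == pattern;
    }
}

// Serialize every call into the non-thread-safe code behind one static mutex.
bool parsePatternLocked(const std::string& pattern) {
    static std::mutex parser_mutex;  // function-local static, initialized once (thread-safe since C++11)
    std::lock_guard<std::mutex> lock(parser_mutex);
    return fake::parsePattern(pattern);
}

// Start `threads` workers that each parse `iterations` patterns.
// Returns the total number of parses that observed corrupted state.
int runWorkers(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    std::vector<std::thread> pool;
    std::vector<int> failures(threads, 0);  // one slot per worker, so no sharing between workers
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([t, iterations, &failures] {
            const std::string pattern = "$EXP=Pattern" + std::to_string(t);
            for (int i = 0; i < iterations; ++i)
                if (!parsePatternLocked(pattern)) ++failures[t];
        });
    }
    for (auto& th : pool) th.join();
    int total = 0;
    for (int f : failures) total += f;
    return total;
}
```

The lock only helps if every path into the shared state goes through it, so in Pharos the guard would have to cover matcher construction and the matching call together. It also serializes that part of the analysis, which is acceptable only if matching is a small fraction of the per-function work.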

sei-eschwartz commented 7 months ago

I'm not sure of the best way to debug this. Maybe rr (https://rr-project.org/)?

Trass3r commented 7 months ago

Indeed, without --threads it worked (after increasing the per-function memory limit).

sei-eschwartz commented 7 months ago

Thanks, we'll try to figure out the concurrency bug, but it's less urgent since it seems everyone can work around it.


edmcman commented 7 months ago

The concurrency problem appears before the AstMatcher commit :-(

sei-eschwartz commented 2 months ago

Robb Matzke suggested the problem might be related to -pthread, or the lack thereof, when compiling Pharos or its libraries. I am trying to explore this idea in the verbose branch.

sei-eschwartz commented 2 months ago

So that failed:

docker run --rm -it ghcr.io/cmu-sei/pharos:verbose bash
root@031ceda913d4:/# fn2hash --threads=20 ~/pharos/tests/ooex_vs2010/Lite/oo.exe~
OPTI[INFO ]: Analyzing executable: /root/pharos/tests/ooex_vs2010/Lite/oo.exe~
HASH[FATAL]: Pharos main error: (std::runtime_error) Could not open /root/pharos/tests/ooex_vs2010/Lite/oo.exe~ for reading
HASH[FATAL]: Backtrace:
HASH[FATAL]: | /usr/local/bin/../lib/libpharos.so(+0xf562e4) [0x7ae2b9d182e4]
HASH[FATAL]: | /lib/x86_64-linux-gnu/libstdc++.so.6(+0xbae9c) [0x7ae2afd7ee9c]
HASH[FATAL]: | /lib/x86_64-linux-gnu/libstdc++.so.6(std::unexpected()+0) [0x7ae2afd69a49]
HASH[FATAL]: | /lib/x86_64-linux-gnu/libstdc++.so.6(+0xbb128) [0x7ae2afd7f128]
HASH[FATAL]: | /usr/local/bin/../lib/libpharos.so(+0xf4e217) [0x7ae2b9d10217]
HASH[FATAL]: | /usr/local/bin/../lib/libpharos.so(pharos::SpecimenName::md5() const+0x5e) [0x7ae2ba21f26e]
HASH[FATAL]: | /usr/local/bin/../lib/libpharos.so(pharos::Specimens::unique_identifier() const+0x30) [0x7ae2ba21f6f0]
HASH[FATAL]: | fn2hash(+0x1af19) [0x6019f3167f19]
HASH[FATAL]: | fn2hash(+0x192dd) [0x6019f31662dd]
HASH[FATAL]: | /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca) [0x7ae2afaaf1ca]
HASH[FATAL]: | /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b) [0x7ae2afaaf28b]
HASH[FATAL]: | fn2hash(+0x19735) [0x6019f3166735]
root@031ceda913d4:/# fn2hash --threads=20 ~/pharos/tests/ooex_vs2010/Lite/oo.exe
OPTI[INFO ]: Analyzing executable: /root/pharos/tests/ooex_vs2010/Lite/oo.exe
OPTI[INFO ]: Calculating function hashes for file: /root/pharos/tests/ooex_vs2010/Lite/oo.exe ; MD5: D3ABCCEDC43A1CE0768AA2C269A89A37
OPTI[INFO ]: ROSE stock partitioning took 17.0786 seconds.
OPTI[INFO ]: Partitioned 62182 bytes, 22150 instructions, 6561 basic blocks, 14 data blocks and 570 functions.
OPTI[INFO ]: Pharos function partitioning took 18.4577 seconds.
OPTI[INFO ]: Partitioned 67584 bytes, 23514 instructions, 7042 basic blocks, 106 data blocks and 729 functions.
OPTI[MARCH]: Function PDG analysis:  12% [##-------------] 78MATCHER parse error: line 1: syntax error at )
MATCHER parse error: line 1: syntax error at )
Segmentation fault (core dumped)

But the Docker image is using the Ubuntu Noble Boost libraries. I'm not sure whether those use pthread, or whether it should matter.

sei-eschwartz commented 2 months ago

The same problem occurs even when using our own Boost build :-(