mandiant / flare-floss

FLARE Obfuscated String Solver - Automatically extract obfuscated strings from malware.
Apache License 2.0

Rust performance issues #873

Open mr-tz opened 10 months ago

mr-tz commented 10 months ago

On some random Rust samples from VirusTotal:

09d923348d083c130e8e26f990ba4c90022a2b26e6a9f0c55ac78420f84089f8
INFO: floss.language.identify: Rust binary found with version: 1.54.0
INFO: floss: extracting language-specific Rust strings
INFO: floss: finished execution after 663.86 seconds
floss.exe   0.00s user 0.08s system 0% cpu 11:11.48 total

2a86a3521809e73f487843818f6f03326affc3bb8ad0ec8cfd704da9a7d7a89c
INFO: floss.language.identify: Rust binary found with version: 1.69.0
INFO: floss: extracting language-specific Rust strings
INFO: floss: finished execution after 447.25 seconds
floss.exe   0.02s user 0.06s system 0% cpu 7:33.31 total

2d6890216b0ecfb02e675e86a5da02686860a8496f9dce90c701427047b89551
INFO: floss.language.identify: Rust binary found with version: 1.65.0
INFO: floss: extracting language-specific Rust strings
INFO: floss: finished execution after 888.46 seconds
floss.exe   0.01s user 0.12s system 0% cpu 15:00.61 total

36dd111ae29579e992b983be5928d7fbf796b36f8d59f814d876f0a8313a7e9f
INFO: floss.language.identify: Rust binary found with version: 1.54.0
INFO: floss: extracting language-specific Rust strings
INFO: floss: finished execution after 8443.80 seconds
floss.exe   0.02s user 0.52s system 0% cpu 2:21:19.81 total

3a666472ac16244839210fe888fb975e2433745cf0c3ad8a698152e4dfebb49d
INFO: floss.language.identify: Rust binary found with version: 1.69.0
INFO: floss: extracting language-specific Rust strings
INFO: floss: finished execution after 494.78 seconds
floss.exe   0.00s user 0.08s system 0% cpu 8:22.33 total

Most other samples complete in 2-10 seconds, so these are extreme outliers.

mr-tz commented 8 months ago

Current times are better already:

09d923348d083c130e8e26f990ba4c90022a2b26e6a9f0c55ac78420f84089f8 (8.3 MB)
INFO: floss.language.identify: Rust binary found with version: 1.54.0
INFO: floss: extracting language-specific Rust strings
INFO: floss: finished execution after 255.98 seconds
floss.exe   0.00s user 0.03s system 0% cpu 4:18.38 total

2a86a3521809e73f487843818f6f03326affc3bb8ad0ec8cfd704da9a7d7a89c (8.5 MB)
INFO: floss.language.identify: Rust binary found with version: 1.69.0
INFO: floss: extracting language-specific Rust strings
INFO: floss: finished execution after 97.34 seconds
floss.exe   0.00s user 0.03s system 0% cpu 1:39.56 total

2d6890216b0ecfb02e675e86a5da02686860a8496f9dce90c701427047b89551 (16 MB)
INFO: floss.language.identify: Rust binary found with version: 1.65.0
INFO: floss: extracting language-specific Rust strings
INFO: floss: finished execution after 116.99 seconds
floss.exe   0.01s user 0.04s system 0% cpu 2:00.62 total

36dd111ae29579e992b983be5928d7fbf796b36f8d59f814d876f0a8313a7e9f (56 MB)
INFO: floss.language.identify: Rust binary found with version: 1.54.0
INFO: floss: extracting language-specific Rust strings
INFO: floss: finished execution after 950.56 seconds
floss.exe   0.03s user 0.17s system 0% cpu 16:02.20 total

3a666472ac16244839210fe888fb975e2433745cf0c3ad8a698152e4dfebb49d (11 MB)
INFO: floss.language.identify: Rust binary found with version: 1.69.0
INFO: floss: extracting language-specific Rust strings
INFO: floss: finished execution after 99.19 seconds
floss.exe   0.00s user 0.04s system 0% cpu 1:41.79 total
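
To put the improvement in numbers, here is a minimal sketch that computes the speedup per sample from the "finished execution after" times in the two comments above. The shortened hash keys and the `speedup` helper are only illustrative names, not part of FLOSS.

```python
# Wall-clock seconds reported by FLOSS ("finished execution after ...") in the
# first and second round of measurements, keyed by the first 8 hex characters
# of each sample's SHA-256 (shortened here for readability only).
timings = {
    "09d92334": (663.86, 255.98),
    "2a86a352": (447.25, 97.34),
    "2d689021": (888.46, 116.99),
    "36dd111a": (8443.80, 950.56),
    "3a666472": (494.78, 99.19),
}


def speedup(before: float, after: float) -> float:
    """Return how many times faster the second run was than the first."""
    return before / after


for sample, (before, after) in timings.items():
    ratio = speedup(before, after)
    print(f"{sample}: {before:8.2f}s -> {after:7.2f}s  ({ratio:.1f}x faster)")
```

The largest sample (56 MB) improves the most in relative terms, but it is also still the slowest in absolute terms.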

mr-tz commented 8 months ago

A detailed profile for 2a86a3521809e73f487843818f6f03326affc3bb8ad0ec8cfd704da9a7d7a89c, where the core extraction routine (split_strings) is slow:

rust-slow-2a86.prof% sort tottime
rust-slow-2a86.prof% stats 15
Fri Nov 10 14:44:47 2023    rust-slow-2a86.prof

         458060720 function calls (457467510 primitive calls) in 157.274 seconds

   Ordered by: internal time
   List reduced from 3984 to 15 due to restriction <15>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    34757   78.460    0.002  136.269    0.004 flare-floss\floss\language\rust\extract.py:84(split_strings)
126764443   31.832    0.000   31.832    0.000 <string>:2(__eq__)
321484351/321482973   19.054    0.000   19.054    0.000 {built-in method builtins.len}
        1   14.209   14.209   14.250   14.250 flare-floss\floss\language\utils.py:645(get_missed_strings)
     4079    6.919    0.002   22.937    0.006 {method 'remove' of 'list' objects}
    93106    1.061    0.000    1.261    0.000 flare-floss\floss\strings.py:31(extract_ascii_strings)
 462200/1    0.805    0.000    3.204    3.204 Python\Python310\lib\dataclasses.py:1241(_asdict_inner)
462190/346647    0.769    0.000    1.419    0.000 Python\Python310\lib\copy.py:128(deepcopy)
       80    0.280    0.004    0.280    0.004 flare-floss\floss\strings.py:58(extract_unicode_strings)
   115987    0.213    0.000    0.642    0.000 Python\Python310\lib\copy.py:259(_reconstruct)
   468546    0.188    0.000    0.334    0.000 {built-in method builtins.hasattr}
   248819    0.172    0.000    0.241    0.000 Python\Python310\lib\enum.py:359(__call__)
   115556    0.169    0.000    0.268    0.000 Python\Python310\lib\dataclasses.py:1187(fields)
    24044    0.157    0.000    0.186    0.000 flare-floss\floss\language\utils.py:70(find_amd64_lea_xrefs)
        1    0.157    0.157    0.183    0.183 flare-floss\floss\language\rust\extract.py:63(filter_and_transform_utf8_strings)
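
The listing above comes from Python's pstats browser (`sort tottime`, then `stats 15`). For anyone who wants to produce a comparable view, here is a minimal sketch using only the standard library. `slow_workload` and `profile_to_text` are hypothetical names, and the workload is a stand-in, not FLOSS code. It only mimics the repeated `list.remove` pattern that appears in the profile.

```python
import cProfile
import io
import pstats


def slow_workload(n: int = 2000) -> int:
    # Hypothetical stand-in for the code under investigation (not FLOSS code):
    # repeated list.remove() calls, each of which scans the list linearly.
    items = list(range(n))
    for value in range(0, n, 2):
        items.remove(value)
    return len(items)


def profile_to_text(func, limit: int = 15) -> str:
    """Profile func() and return the top `limit` rows sorted by internal time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" sorts by internal time, matching "Ordered by: internal time".
    stats.sort_stats("tottime").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_to_text(slow_workload))
```

To keep the profile for later inspection, call `profiler.dump_stats("out.prof")` and open the file with `python -m pstats out.prof`.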