packing-box / docker-packing-box

Docker image gathering packers and tools for making datasets of packed executables and training machine learning models for packing detection
GNU General Public License v3.0

CFG extraction timeout not working #106

Open AlexVanMechelen opened 4 months ago

AlexVanMechelen commented 4 months ago

Issue

Sometimes the CFG extraction continues even after the timeout is hit. The line "Timeout reached when extracting CFG" gets printed to the screen, but angr keeps extracting the CFG, which significantly delays the CFG-based feature computation for that executable.

Reproduce

It's hard to reproduce as there is some randomness to it. It sometimes happens for an executable, but when trying again later with the same executable, the extraction stops successfully. With the tool in the latest PR #105, I started extracting the CFG-based features for a dataset of 400 samples using 32 CPU cores. The features for the first 300 executables got extracted at a rate of approximately 3 seconds per executable. For the last executables, however, the extraction time skyrockets due to this issue where CFG extraction continues even after the timeout. At this time, after 1 h 40 min, the features of the last 30 executables are still being extracted.

Resolve

If this issue cannot be resolved directly, it may be worth adding a way to save progress in the dataset convert command so that the user can halt it early when some executables take a very long time to extract. The user could then continue with the majority of executables for which the features got extracted, or relaunch the conversion in the hope that the CFG extraction halts correctly at the timeout this time.
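The progress-saving idea above could look roughly like the following. This is only a minimal sketch, not how dataset convert is implemented: the function name `convert_with_checkpoint`, the `extract` callable and the JSON Lines checkpoint file are all assumptions made for illustration.

```python
import json
from pathlib import Path


def convert_with_checkpoint(samples, extract, checkpoint_path):
    """Extract features sample by sample, saving each result as soon as it is ready.

    `extract` is a hypothetical callable returning a JSON-serializable dict of
    features for one sample. Results are appended to a JSON Lines file, so an
    interrupted run can be relaunched and will skip what is already done.
    """
    path = Path(checkpoint_path)
    done = {}
    # Reload results saved by a previous (possibly interrupted) run.
    if path.exists():
        with path.open() as f:
            for line in f:
                line = line.strip()
                if line:
                    record = json.loads(line)
                    done[record["sample"]] = record["features"]
    with path.open("a") as f:
        for sample in samples:
            if sample in done:
                continue  # already extracted in an earlier run
            features = extract(sample)
            f.write(json.dumps({"sample": sample, "features": features}) + "\n")
            f.flush()  # make the record visible on disk before the next sample
            done[sample] = features
    return done
```

With this layout, halting with CTRL+C loses at most the sample being processed, and relaunching only computes the samples missing from the checkpoint file.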

AlexVanMechelen commented 4 months ago

Testing

To test if it was slowly making progress or actually stuck, I let the extraction run for about ten hours, but no progress was made after the first 25 minutes. To check where it got stuck, I interrupted with CTRL+C and got the following traceback:

Exception ignored in: <function PagedMemoryMixin.__del__ at 0x7760402d16c0>
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.11/site-packages/angr/storage/memory_mixins/paged_memory/paged_memory_mixin.py", line 58, in __del__
    page.release_shared()
  File "/home/user/.local/lib/python3.11/site-packages/angr/storage/memory_mixins/paged_memory/pages/refcount_mixin.py", line 50, in release_shared
    with self.lock:
  File "/home/user/.local/lib/python3.11/site-packages/angr/misc/picklable_lock.py", line 16, in __enter__
    return self._lock.__enter__()
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/tinyscript/features/handlers.py", line 114, in __terminate_handler
    _hooks.quit(0)
  File "/home/user/.local/lib/python3.11/site-packages/tinyscript/features/handlers.py", line 64, in quit
    self.exit(code)
  File "/home/user/.local/lib/python3.11/site-packages/tinyscript/features/handlers.py", line 52, in exit
    self._orig_exit(code)
SystemExit: 0

Seems like angr gets in a deadlock when running out of memory pages (?)

Resolve

So maybe the aforementioned idea of incrementally saving progress of the extracted features to disk in a temporary folder can free up RAM and allow the complete extraction process to finish.

AlexVanMechelen commented 4 months ago

Testing

With the tool in PR #108, I split the dataset into two equal-sized datasets of 200 executables. I then ran the dataset convert command on both those datasets, providing some CFG-based features. After 2 h 30 min, one of them is stuck at 174/200 samples and the other at 197/200. As proposed before, if the CFG extraction timeout issue cannot be fixed, it might be useful to save progress for the executables for which the CFG-based features could be extracted, stop the dataset convert, then put -1 for all the CFG-based features of the other executables and compute only the non-CFG-based features for them.
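The -1 fallback proposed above can be combined with a timeout that is enforced from outside the extraction code. The sketch below is an illustration only, not the project's approach: it assumes a hypothetical worker command that prints the CFG-based features of one executable as a JSON object, and the feature names in `CFG_FEATURES` are made up. It relies on `subprocess.run`, which kills the child process when `timeout` expires.

```python
import json
import subprocess

# Hypothetical CFG-based feature names, used only for this illustration.
CFG_FEATURES = ("cfg_nodes", "cfg_edges")


def cfg_features_or_default(cmd, timeout, names=CFG_FEATURES):
    """Run `cmd` in a child process and parse the JSON object it prints.

    If the child exceeds `timeout` seconds (subprocess.run kills it), exits
    with an error, or prints invalid JSON, every feature is set to -1.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout, check=True)
        values = json.loads(proc.stdout)
        return {name: values.get(name, -1) for name in names}
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError,
            json.JSONDecodeError):
        return {name: -1 for name in names}
```

Because the child is a separate process, a hang inside the extraction cannot keep the parent waiting beyond the timeout, at the cost of starting one interpreter per executable.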

dhondta commented 4 months ago

@AlexVanMechelen Please try. Not sure this will fix the issue but worth giving it a try.

AlexVanMechelen commented 4 months ago

@dhondta This indeed fixes the issue of angr getting into a deadlock. I tested a couple of datasets and the feature extraction finishes without blocking.

The broader issue remains: the CFG extraction timeout sometimes doesn't work, so a small percentage of samples have a significantly longer feature extraction time than the others.

PS:

Although still an issue, this no longer blocks performing experiments, especially when using #105. With multiprocessing, the samples for which the timeout doesn't work don't block the others from starting. Therefore, all such samples get started as soon as possible and continue in parallel. An example with 12 such samples taking longer:

In an experiment with 64 CPU cores and a dataset of 402 samples, the first 389 samples finished extraction in 65" (6 samples per second, sps), while the last 12 samples took 11'10" (0.018 sps).

For reference, an experiment without #105 (1 core) took 3h33'33" for the same 402 samples (0.03 sps).
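The rates quoted above follow directly from the sample counts and durations. A small check, with a hypothetical helper name:

```python
def samples_per_second(samples, hours=0, minutes=0, seconds=0):
    """Throughput in samples per second for a given duration."""
    return samples / (hours * 3600 + minutes * 60 + seconds)


fast = samples_per_second(389, seconds=65)                         # first 389 samples
slow = samples_per_second(12, minutes=11, seconds=10)              # last 12 samples
single = samples_per_second(402, hours=3, minutes=33, seconds=33)  # 1 core, without #105

print(f"{fast:.2f} {slow:.3f} {single:.3f}")  # prints "5.98 0.018 0.031"
```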

dhondta commented 4 months ago

@AlexVanMechelen OK, here is the explanation: TimeoutError could not be handled in the code section covered by the lock because that lock was based on a simple Lock primitive. Changing it to an RLock (that is, a reentrant lock) ensures that the code section is no longer left without releasing the lock when TimeoutError is raised.
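The difference between the two primitives can be shown with Python's standard `threading` module. This is a generic illustration, not angr's code: the helper `can_reacquire` is hypothetical, and whether the deadlock in the traceback arose from exactly this kind of same-thread re-acquisition is an assumption, not something confirmed in this thread.

```python
import threading


def can_reacquire(lock):
    """Return True if the thread already holding `lock` can acquire it again."""
    with lock:
        # Non-blocking attempt, so a plain Lock reports failure instead of hanging.
        got = lock.acquire(blocking=False)
        if got:
            lock.release()
        return got


print(can_reacquire(threading.Lock()))   # False: a blocking acquire would deadlock
print(can_reacquire(threading.RLock()))  # True: the owning thread may re-enter
```

With a plain Lock, a blocking second acquire from the owning thread waits forever, which matches the symptom of an extraction that makes no progress for hours.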