packing-box / docker-packing-box

Docker image gathering packers and tools for making datasets of packed executables and training machine learning models for packing detection
GNU General Public License v3.0

Handle `ZeroDivisionError` in feature evaluation #119

Closed: AlexVanMechelen closed this issue 4 months ago

AlexVanMechelen commented 4 months ago

Possible enhancement

Make -1 the default feature value when a `ZeroDivisionError` occurs while evaluating a ratio feature. A different value could be chosen if we want to distinguish this error from other errors that also yield a -1 feature value.
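
A minimal sketch of the idea, assuming a hypothetical helper `eval_feature` that uses plain `eval` as a stand-in for the project's restricted evaluator. The sentinel names `ERROR_VALUE` and `ZERO_DIV_VALUE` are illustrative and not part of pbox:

```python
ERROR_VALUE = -1      # hypothetical: value used for any other evaluation error
ZERO_DIV_VALUE = -2   # hypothetical: distinct value for division by zero


def eval_feature(expr, data, zero_div_value=ERROR_VALUE):
    """Evaluate a feature expression, mapping errors to sentinel values."""
    try:
        # Plain eval exposing only len; a stand-in for the real evaluator.
        return eval(expr, {"__builtins__": {}, "len": len}, data)
    except ZeroDivisionError:
        return zero_div_value
    except Exception:
        return ERROR_VALUE


binary = {"cfg": {"mal_used_apis": [], "used_apis": []}}
expr = "len(binary['cfg']['mal_used_apis']) / len(binary['cfg']['used_apis'])"
print(eval_feature(expr, {"binary": binary}))                  # -1
print(eval_feature(expr, {"binary": binary}, ZERO_DIV_VALUE))  # -2
```

With the default arguments, a division by zero is indistinguishable from any other failure; passing a separate sentinel keeps the two cases apart.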

Example

[WARNING] Bad expression: len(binary['cfg']['mal_used_apis']) / len(binary['cfg']['used_apis'])
00:00:06.524 [ERROR] division by zero
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.11/site-packages/pbox/helpers/items.py", line 89, in _exec
    r = eval2(expr, d, {}, whitelist_nodes=WL_NODES + _WL_EXTRA_NODES)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/tinyscript/helpers/expressions.py", line 122, in eval2
    return __eval(expression, globals, locals, blacklist_builtins, whitelist_nodes)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.11/site-packages/tinyscript/helpers/expressions.py", line 58, in __eval
    return eval(expr, globals, locals)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 1, in <module>
ZeroDivisionError: division by zero

I tried to solve the `ZeroDivisionError` in the feature definition itself by appending `if binary['cfg']['used_apis'] else -1` to the expression, but that didn't work. I then figured it might be worth handling this error by default rather than feature by feature.
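
For reference, the per-feature guard described above behaves as follows in plain Python, where the conditional expression short-circuits and the division is never evaluated when the list is empty. This is only a sketch with made-up data; it does not reproduce pbox's restricted evaluator, and this thread does not establish why the guard failed there.

```python
binary = {"cfg": {"mal_used_apis": [], "used_apis": []}}

# Guarded ratio: the division is only evaluated when used_apis is non-empty.
expr = (
    "len(binary['cfg']['mal_used_apis']) / len(binary['cfg']['used_apis']) "
    "if binary['cfg']['used_apis'] else -1"
)

result = eval(expr, {"len": len}, {"binary": binary})
print(result)  # -1
```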

dhondta commented 4 months ago

@AlexVanMechelen See PR #123; it allows setting `None` for undefined feature values without breaking the feature computation process.
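
As a rough illustration of that behaviour (not the code from PR #123, which is not shown in this thread), a hypothetical `compute_features` helper could record `None` for a feature whose expression cannot be evaluated and carry on with the rest. Plain `eval` stands in for the project's evaluator, and the feature names are made up:

```python
def compute_features(expressions, data):
    """Evaluate each feature expression; undefined values become None."""
    values = {}
    for name, expr in expressions.items():
        try:
            values[name] = eval(expr, {"len": len}, data)
        except ZeroDivisionError:
            # Undefined ratio: keep going with the remaining features.
            values[name] = None
    return values


binary = {"cfg": {"mal_used_apis": [], "used_apis": []}}
expressions = {
    "mal_api_ratio": "len(binary['cfg']['mal_used_apis']) / len(binary['cfg']['used_apis'])",
    "used_api_count": "len(binary['cfg']['used_apis'])",
}
features = compute_features(expressions, {"binary": binary})
print(features)  # {'mal_api_ratio': None, 'used_api_count': 0}
```

Anything consuming these values then has to treat `None` as a missing value explicitly, for example by imputing or dropping it before training.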