packing-box / docker-packing-box

Docker image gathering packers and tools for making datasets of packed executables and training machine learning models for packing detection
GNU General Public License v3.0

Unexpected computed features for UPX #143

Closed. jramhani closed this issue 4 months ago.

jramhani commented 4 months ago

Only for UPX, some computed features come back as None:

 • entropy_code_section:                 None
 • entropy_data_section:                 None
<<snipped>>
 • number_addresses_in_iat:              None

Because of that, I get errors such as this one (command: dataset plot infogain-compare not-packed --datasets upx upx_baseline -n 20):

00:00:01.027 [INFO] Preparing plot data...
/home/user/.local/lib/python3.12/site-packages/sklearn/impute/_base.py:598: UserWarning: Skipping features without any observed values: ['entropy_code_section' 'entropy_data_section' 'number_addresses_in_iat']. At least one non-missing value is needed for imputation with strategy='mean'.
  warnings.warn(
Traceback (most recent call last):
  File "/home/user/.opt/tools/dataset", line 215, in <module>
    getattr(ds, args.command)(**vars(args))
  File "/home/user/.local/lib/python3.12/site-packages/pbox/core/dataset/__init__.py", line 696, in plot
    _PLOTS[subcommand](self, **kw)
  File "/home/user/.local/lib/python3.12/site-packages/pbox/helpers/figure.py", line 67, in _wrapper
    imgs = f(*a, **configure_style(**kw))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.local/lib/python3.12/site-packages/pbox/core/dataset/visualization.py", line 291, in _information_gain_comparison_heatmap
    df[feature] = SimpleImputer(missing_values=np.nan, strategy="mean").fit_transform(df[feature])
    ~~^^^^^^^^^
  File "/home/user/.local/lib/python3.12/site-packages/pandas/core/frame.py", line 4299, in __setitem__
    self._setitem_array(key, value)
  File "/home/user/.local/lib/python3.12/site-packages/pandas/core/frame.py", line 4350, in _setitem_array
    self._iset_not_inplace(key, value)
  File "/home/user/.local/lib/python3.12/site-packages/pandas/core/frame.py", line 4377, in _iset_not_inplace
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key
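
The warning and the ValueError above appear to be linked: with strategy="mean", SimpleImputer skips columns that have no observed value, so the array it returns has fewer columns than the list of features it is assigned back to. The following is a minimal sketch of that mechanism and of one possible guard, not the project's actual fix (the actual cause was on the ingestion side). The toy values and the impute_mean helper are hypothetical; only the feature names come from the report.

```python
import warnings

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy data (hypothetical values): one feature has no observed value at all.
df = pd.DataFrame({
    "entropy_code_section": [np.nan, np.nan, np.nan],
    "entropy_data_section": [7.1, np.nan, 6.8],
    "number_addresses_in_iat": [12.0, 3.0, np.nan],
})
features = list(df.columns)

# Reproduce the failure: the all-NaN column is skipped by the imputer,
# so the result has 2 columns while the assignment key has 3.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
    transformed = imputer.fit_transform(df[features])

tmp = df.copy()
try:
    tmp[features] = transformed
    error = None
except ValueError as exc:
    error = str(exc)


def impute_mean(frame, columns):
    """Hypothetical helper: mean-impute only the columns that have at
    least one observed value and report the ones left untouched."""
    out = frame.copy()
    usable = [c for c in columns if out[c].notna().any()]
    if usable:
        imp = SimpleImputer(missing_values=np.nan, strategy="mean")
        out[usable] = imp.fit_transform(out[usable])
    skipped = [c for c in columns if c not in usable]
    return out, skipped


fixed, empty = impute_mean(df, features)
print("assignment error:", error)
print("features without any observed value:", empty)
```

A non-empty list of skipped features is a hint that something went wrong upstream (here, during dataset ingestion), so it is probably better to surface it than to impute silently. Recent scikit-learn versions also provide a keep_empty_features parameter on SimpleImputer that keeps such columns in the output; whether it is available depends on the installed version.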

jramhani commented 4 months ago

Fixed! It was a typo during the ingestion of the datasets.