Closed smarbal closed 1 year ago
@smarbal
If you run `model browse upx-PE_pe32-pe64_99_mbkmeans_f111`, do you see the right labels in column `cluster`?
@smarbal
I got it; if you run `model -v test upx-PE_pe32-pe64_99_mbkmeans_f111 upx-PE`, you will see that it is the true labels that are all 1's, not the predicted labels (which may even all be correct). This likely comes from a bug in the label mapping of the `y_true` vector. I will try to fix this ASAP.
@dhondta
The issue seems to come from line 212 in `../learning/model.py`.
After line 209, all labels of NOT_PACKED instances are replaced by None.
But then, at line 212, the `fillna()` function replaces those labels with NOT_LABELLED, precisely because they are None.
The mapping at line 214 therefore cannot work correctly: NOT_PACKED instances end up with a '?' label, which is wrong.
Maybe changing the value of NOT_PACKED in LABELS_BACK_CONV from None to 0 could be a solution?
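To illustrate the collision described above, here is a minimal sketch in plain Python. It is not the code from model.py: the constant values, the `back_convert` helper and the sample labels are made up for illustration, and the last list comprehension only stands in for what `fillna()` does on the real data.

```python
# Hypothetical stand-ins; the real constants live in the project and may differ.
NOT_PACKED = "-"
NOT_LABELLED = "?"


def back_convert(labels, not_packed_value):
    """Convert raw labels, then fill missing ones with NOT_LABELLED.

    not_packed_value is the value the conversion table maps NOT_PACKED to
    (None reproduces the reported behaviour, 0 is the suggested change).
    """
    table = {NOT_PACKED: not_packed_value}
    # Step 1 (cf. line 209): apply the table; labels not in it are kept as-is.
    converted = [table.get(label, label) for label in labels]
    # Step 2 (cf. line 212): stand-in for fillna(NOT_LABELLED), which cannot
    # tell "mapped to None" apart from "was missing in the first place".
    return [NOT_LABELLED if label is None else label for label in converted]


# Two packed samples, two not-packed samples and one truly unlabelled sample.
raw = ["upx", "-", None, "upx", "-"]

buggy = back_convert(raw, None)  # not-packed samples become '?'
fixed = back_convert(raw, 0)     # not-packed samples keep a distinct value

print(buggy)
print(fixed)
```

With None as the target value, the not-packed samples become indistinguishable from the unlabelled one after the fill step; with 0 they stay distinct, so a later mapping can still tell the two apart.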
Solved with 3c8a40fcb710feef073b865022d50632da014ebb
Description
When visualizing a model, all executables appear as packed, even though it is not the case.
Steps to reproduce
dataset make upx-PE -p upx -f PE
model train upx-PE -a mbkmeans
model visualize -e upx-PE_pe32-pe64_99_mbkmeans_f111
Additional information
Printing `params['target']` in visualization.py shows that all labels are indeed set to 1, so it is not a visualization problem. Used datasets: `fs-upx` is the fileless version of the dataset, which also yields the same bug.