I still encountered the issue after changing the conditionals to reflect the intention of the author (appending). The reason has to do with the ensuing for loop:
# Apply filters to selected datasets
filters = [i.strip() for i in cfg.cuckooml.features_filter.split(",")]
data = []
for f, d in itertools.izip(filters, selected_features):
    if f == "log_bin":
        data.append(d.applymap(ml.__log_bin))
    elif f == "filter_dataset":  #D: Only runs once!
        print "RUNs\n"
        data.append(ml.filter_dataset(d))
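If you specify only "filter_dataset" in the configuration, this for loop runs only once: itertools.izip stops as soon as its shortest input is exhausted, so with one filter only the first feature set is processed. To get around that, add another element to the filters list in the configuration file, one per feature set: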
features_filter = filter_dataset, filter_dataset
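To make the truncation concrete, here is a minimal sketch. It is written for Python 3, where the built-in zip behaves like Python 2's itertools.izip in this respect. The strings in selected_features are placeholders standing in for the real feature DataFrames, and pair_up is a hypothetical helper, not part of the project.

```python
# Illustration only: plain strings stand in for the real feature DataFrames.
selected_features = ["simple_features", "nominal_features"]


def pair_up(filters_setting):
    """Hypothetical helper that mimics how the loop pairs filters with features."""
    # Same parsing as in the loop above: split on commas, strip whitespace.
    filters = [i.strip() for i in filters_setting.split(",")]
    # zip() (itertools.izip in Python 2) stops at the shorter input.
    return list(zip(filters, selected_features))


one_filter = pair_up("filter_dataset")
two_filters = pair_up("filter_dataset, filter_dataset")

print(len(one_filter))   # 1: the second feature set is silently dropped
print(len(two_filters))  # 2: every feature set gets a filter
```

With a single filter the second feature set never reaches the loop body, which matches the behaviour described above.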
Hi, well spotted. This would not work if you put more than one set of features in the config file. I have fixed this with 859bec5.
You are right, the configuration file takes pairs of feature set and filter, e.g.
features = simple, nominal
features_filter = filter_dataset, filter_dataset
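Since the two settings are meant to be paired, one way to catch a mismatch early is to compare their lengths before zipping. The sketch below is only an illustration under that assumption: pair_features_with_filters is a hypothetical name, and this is not necessarily what commit 859bec5 does, since that change is not shown in this thread.

```python
def pair_features_with_filters(features_setting, filters_setting):
    """Hypothetical helper: pair each feature set with its filter.

    Both arguments are comma-separated strings as they would appear in the
    configuration file. Raises ValueError if the counts differ, instead of
    silently dropping the unmatched entries.
    """
    features = [s.strip() for s in features_setting.split(",") if s.strip()]
    filters = [s.strip() for s in filters_setting.split(",") if s.strip()]
    if len(features) != len(filters):
        raise ValueError(
            "expected one filter per feature set, got %d feature set(s) "
            "and %d filter(s)" % (len(features), len(filters))
        )
    return list(zip(features, filters))


pairs = pair_features_with_filters(
    "simple, nominal", "filter_dataset, filter_dataset"
)
print(pairs)
```

Calling the helper with "simple, nominal" and a single "filter_dataset" raises ValueError, so the configuration mistake surfaces immediately rather than as missing clustering input.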
Hi,
thanks for sharing this project! I am in the process of adding features to the nominal feature set. In that process I noticed that my changes were not taken into account in the clustering results, even though I specified nominal in the configuration. I believe the reason is that the code handling the configuration settings uses an if ... elif construct, so only one set of features is ever chosen. The relevant code snippet is: