automl / ParameterImportance

Parameter Importance Analysis Tool
http://www.ml4aad.org/
BSD 3-Clause "New" or "Revised" License

--marginalize_over_instances option and fanova results #88

Open ndangtt opened 6 years ago

ndangtt commented 6 years ago

Hello,

When I apply fanova to my data with and without specifying "--marginalize_over_instances", the resulting lists of the most important parameters are very different, and I'm wondering why that is. To my understanding, fanova always marginalizes over instances, whether or not the flag is given. If that is not correct and the two runs are meant to differ, which results should I use?
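For reference, "marginalizing over instances" is commonly understood as averaging the cost of each configuration over the instances it was run on, so that the model sees one cost per configuration. The toy sketch below illustrates only that general idea. It is not pimp's or fanova's actual implementation, and the function name, the tuple layout, and the data are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def marginalize_over_instances(runs):
    """Average the cost of each configuration across the instances it ran on.

    `runs` is an iterable of (config, instance, cost) tuples, where `config`
    is hashable (here: a tuple of parameter values). This layout is a
    hypothetical one chosen for the example, not the runhistory.json format.
    """
    costs = defaultdict(list)
    for config, _instance, cost in runs:
        costs[config].append(cost)
    return {config: mean(values) for config, values in costs.items()}


# Invented toy data: two configurations, each evaluated on two instances.
toy_runs = [
    ((0.1, 2), "inst_a", 10.0),
    ((0.1, 2), "inst_b", 14.0),
    ((0.9, 2), "inst_a", 3.0),
    ((0.9, 2), "inst_b", 5.0),
]

if __name__ == "__main__":
    print(marginalize_over_instances(toy_runs))
```

Without this step, the same configuration appears several times with different costs, one per instance.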

Please find attached the example data I'm using. There are 4 parameters and 4985 data points. The data may look a bit artificial because it is tuning data produced by irace and converted to the SMAC output format. The commands I used and the resulting lists of important parameters are given below:

pimp -S scenario.txt -H runhistory.json -M fanova

LL_static_crossOverBias  84.193
LL_static_lda             2.270
LL_static_lambda1         0.750
LL_static_lambda2         0.555

pimp -S scenario.txt -H runhistory.json -M fanova --marginalize_over_instances

LL_static_lambda1        69.746
LL_static_lambda2         5.647
LL_static_crossOverBias   3.680
LL_static_lda             0.699
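To make the disagreement between the two runs easier to see, the short sketch below puts the reported importance values side by side and prints each parameter's rank in both runs. The numbers are copied from the outputs above. The helper names `ranking` and `rank_shift` are made up for this example and are not part of pimp.

```python
# Importance values copied from the two pimp runs reported above.
without_flag = {
    "LL_static_crossOverBias": 84.193,
    "LL_static_lda": 2.270,
    "LL_static_lambda1": 0.750,
    "LL_static_lambda2": 0.555,
}
with_flag = {
    "LL_static_lambda1": 69.746,
    "LL_static_lambda2": 5.647,
    "LL_static_crossOverBias": 3.680,
    "LL_static_lda": 0.699,
}


def ranking(importances):
    """Return parameter names sorted from most to least important."""
    return sorted(importances, key=importances.get, reverse=True)


def rank_shift(a, b):
    """Map each parameter to (rank in a, rank in b), with ranks starting at 1."""
    rank_a = {name: i for i, name in enumerate(ranking(a), start=1)}
    rank_b = {name: i for i, name in enumerate(ranking(b), start=1)}
    return {name: (rank_a[name], rank_b[name]) for name in rank_a}


if __name__ == "__main__":
    for name, (r_without, r_with) in rank_shift(without_flag, with_flag).items():
        print(f"{name:26s} rank without flag: {r_without}  with flag: {r_with}")
```

Every parameter changes rank between the two runs, and the top-ranked parameter without the flag drops to third place with it.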

Many thanks, Nguyen

AndreBiedenkapp commented 6 years ago

Hi, thank you for pointing that out. In that case you should go with the run without specifying the marginalization flag.

I'm not sure what's going wrong, but for fANOVA you don't have to specify the marginalization flag.

ndangtt commented 6 years ago

Thanks very much for the prompt reply! I will use the one without the flag then.

One thing that might be worth noting: when I directly apply the previous version of fanova (the Java-based one, I think) to the same example data (in smac2 format), the results are quite similar to those given by pimp with the marginalization flag. The data I used, including the fanova output, can be found here. This was done with an old local copy of fanova (version 1.0), as I get error messages when running the latest fanova on the data and can't figure out why.