automl / autoweka

Auto-WEKA
http://www.cs.ubc.ca/labs/beta/Projects/autoweka/

Difference between Performance Outcome for 3 and 6 hours #70

Closed anamhaq1 closed 5 years ago

anamhaq1 commented 5 years ago

I ran Auto-WEKA on the same datasets for 3 hours and for 6 hours. The model suggested after 3 hours performs better than the one suggested after 6 hours. How is that possible? If Auto-WEKA can find the best model within 3 hours, shouldn't it report the same model after 6 hours?

larskotthoff commented 5 years ago

The process that Auto-WEKA uses to determine the best model is randomized to some extent -- you can certainly get different results in different runs. This is necessary because the entire parameter space is too large to explore exhaustively, and different randomizations make it possible to explore a larger part of it within a reasonable amount of time across different runs.
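To make the point concrete, here is a minimal sketch of a seeded random search over a toy objective. It is not Auto-WEKA's or SMAC's actual search procedure; `objective`, `random_search`, and the budgets (counted in evaluations rather than hours) are made up for illustration.

```python
import random


def objective(x):
    # Toy "error" to minimise; stands in for a model's validation error.
    return (x - 0.3) ** 2


def random_search(budget, seed):
    """Return the lowest error seen after `budget` random evaluations."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(budget):
        best = min(best, objective(rng.random()))
    return best


# Hypothetical budgets: 30 evaluations for the "short" run, 60 for the "long" one.
short_run = random_search(budget=30, seed=1)
long_run_same_seed = random_search(budget=60, seed=1)
long_runs_other_seeds = [random_search(budget=60, seed=s) for s in range(2, 12)]

print("short run, seed 1:     ", short_run)
print("long run, seed 1:      ", long_run_same_seed)
print("long runs, other seeds:", long_runs_other_seeds)
```

With the same seed, the longer run first replays the shorter run's samples, so its result can never be worse. With a different seed there is no such guarantee, which is how a 3-hour run can end up ahead of a separate 6-hour run.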

anamhaq1 commented 5 years ago

So, from your answer, there is a huge possibility that the models suggested in the 3-hour run never appeared when Auto-WEKA was executed for 6 hours?

larskotthoff commented 5 years ago

Not necessarily huge, but the possibility exists, yes.

anamhaq1 commented 5 years ago

Is it possible to locate this randomization in the Auto-WEKA code? I would like to see how it works. Could you direct me to the specific file where it is done? I would be grateful.

larskotthoff commented 5 years ago

Unfortunately, it's not that easy. In addition to the randomization in Auto-WEKA (which should only affect train/test splits, I think), there's the randomization in the underlying optimizer, SMAC. You can probably track all of this down and fix the random seeds, but there's unfortunately no single place for this.
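As a rough illustration of what "fixing the random seeds" means for a train/test split, the sketch below shuffles with an explicitly seeded generator. It is a generic example, not Auto-WEKA code; `train_test_split`, the 25% test fraction, and the seed value are assumptions chosen for the demo.

```python
import random


def train_test_split(rows, test_fraction, seed):
    """Shuffle a copy of `rows` with a seeded RNG and split it into (train, test)."""
    rng = random.Random(seed)  # explicit seed makes the shuffle repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))  # stand-in for dataset instances

# Same seed: identical split on every call.
train_a, test_a = train_test_split(rows, 0.25, seed=42)
train_b, test_b = train_test_split(rows, 0.25, seed=42)

print("test set, seed 42:      ", test_a)
print("test set, seed 42 again:", test_b)
```

Every source of randomness in the pipeline would need the same treatment for two runs to be fully reproducible, which is why there is no single place to change.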

anamhaq1 commented 5 years ago

Is it possible to see all the combinations that Auto-WEKA has tried during the time provided by the user? I am curious to know how this works. Please let me know if you know of any way to do this.

larskotthoff commented 5 years ago

Yes, you can have a look at the log files that Auto-WEKA produces. There's no way to get this information programmatically at the moment.