Closed by anamhaq1 5 years ago
The process that Auto-WEKA uses to determine the best model is randomized to some extent, so you can certainly get different results in different runs. This is necessary because the entire parameter space is too large to explore exhaustively, and using different randomizations in different runs allows it to explore a larger part of that space within a reasonable amount of time.
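To illustrate the point about randomized runs, here is a minimal sketch in Python. It is not Auto-WEKA's code: `toy_score` and `random_search` are hypothetical names, the search is plain random sampling rather than SMAC's model-based optimization, and the budget is a number of trials rather than wall-clock time. It only shows why a longer run with a different randomization is not guaranteed to match or beat a shorter one.

```python
import random


def toy_score(config):
    # Hypothetical objective standing in for model quality (higher is better).
    x, y = config
    return -((x - 0.3) ** 2 + (y - 0.7) ** 2)


def random_search(seed, budget):
    """Sample `budget` random configurations and return the best (score, config)."""
    rng = random.Random(seed)
    best = None
    for _ in range(budget):
        config = (rng.random(), rng.random())
        candidate = (toy_score(config), config)
        if best is None or candidate > best:
            best = candidate
    return best


# A short run, a longer run that continues the same random sequence,
# and a longer run that uses a different random sequence.
short_run = random_search(seed=1, budget=30)
long_run_same_seed = random_search(seed=1, budget=60)
long_run_other_seed = random_search(seed=2, budget=60)

print("short run:           ", short_run)
print("long run, same seed: ", long_run_same_seed)
print("long run, other seed:", long_run_other_seed)
```

In this toy setting the longer run with the same seed can never be worse than the short one, because it revisits the same first 30 configurations. The longer run with a different seed samples a different part of the space, so it carries no such guarantee. A real time-limited run may not be repeatable even with fixed seeds, since the number of configurations evaluated depends on machine load.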
So from your answer, does this mean there is a huge possibility that the results or suggested models from a 3-hour run will not appear when Auto-WEKA is executed for 6 hours?
Not necessarily huge, but the possibility exists, yes.
Is it possible to locate this randomization in the Auto-WEKA code? I want to see how it works. Could you direct me to the specific file where it is done? I would be grateful.
Unfortunately it's not that easy. In addition to the randomization in Auto-WEKA (which should only affect train/test splits I think), there's the randomization in the underlying optimizer SMAC. You can probably track all of this down and fix the random seeds, but there's unfortunately no single place for this.
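As a rough illustration of why there is no single place to fix, here is a toy Python sketch with two independent sources of randomness, one for the train/test split and one for the search. Everything in it is hypothetical (`run_once`, `split_seed`, `optimizer_seed`, the threshold "model"); it does not reflect how Auto-WEKA or SMAC are actually structured or configured.

```python
import random


def run_once(split_seed, optimizer_seed, n_trials=20):
    # Two separate generators: one for the data split, one for the search.
    split_rng = random.Random(split_seed)
    optimizer_rng = random.Random(optimizer_seed)

    # Toy dataset: the integers 0..99, with the true label "x >= 50".
    points = list(range(100))
    split_rng.shuffle(points)
    test = points[70:]  # the last 30 shuffled points act as the test set

    def accuracy(threshold):
        # Fraction of test points that a threshold classifier labels correctly.
        hits = sum((x >= threshold) == (x >= 50) for x in test)
        return hits / len(test)

    # Toy "optimizer": try random thresholds and keep the most accurate one.
    best = max(
        (optimizer_rng.uniform(0, 100) for _ in range(n_trials)),
        key=accuracy,
    )
    return round(best, 3), accuracy(best), tuple(test)


print(run_once(0, 0)[:2])  # baseline
print(run_once(0, 0)[:2])  # same seeds: identical output
print(run_once(1, 0)[:2])  # different split: scores are measured on other data
print(run_once(0, 1)[:2])  # different optimizer seed: other candidates are tried
```

Only when both seeds are fixed does the toy run repeat exactly; fixing one of them still leaves the other free to change the outcome.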
Is it possible to see all the combinations that Auto-WEKA has tried during the time provided by the user? I am curious to know how this works. Please let me know if you know of any way to do this.
Yes, you can have a look at the log files that Auto-WEKA produces. There's no way to get this information programmatically at the moment.
I have run Auto-WEKA on the same datasets for 3 hours and for 6 hours. The model suggested after 3 hours performs better than the one suggested after 6 hours. How is this possible? If Auto-WEKA is able to find the best model within 3 hours, shouldn't it report the same model after 6 hours?