ppdebreuck / modnet

MODNet: a framework for machine learning materials properties
MIT License

`FitGenetic.run` return value if `refit=0` #235

Open kaueltzen opened 2 days ago

kaueltzen commented 2 days ago

Hi,

I have a question about the value returned by `FitGenetic.run` when `refit=0`. Right now, it returns an `EnsembleMODNetModel` consisting of up to 10 * `nested` `MODNetModel`s: https://github.com/ppdebreuck/modnet/blob/e2475dcef9f6a983943863977ea6bb0dce6c4d3a/modnet/hyper_opt/fit_genetic.py#L660-L667

I'd expect `self.best_model = models[ranking[0]]`. Could you please comment on that, @ppdebreuck @ml-evs? Thanks a lot!

ppdebreuck commented 19 hours ago

The idea is to build an ensemble with some randomness among the committee members, for more robust predictions and uncertainty assessment. People use bootstrapping, initial seed variations, different architectures, ... Here, we keep the 10 best models, each of them fitted on an inner fold (the number of folds is given by `nested`, typically 5, which gives an ensemble of 50 by default). This is mainly for simplicity and speed, as the models have already been fitted by the GA.
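To make the selection concrete, here is a minimal standalone sketch of the idea. It is not the actual MODNet code: `select_committee`, `val_losses` and `fold_models` are hypothetical names, the "models" are just strings, and it assumes candidates are ranked by a validation loss where lower is better.

```python
import numpy as np


def select_committee(val_losses, fold_models, n_best=10):
    """Rank candidates by loss and pool the fold models of the best ones.

    val_losses:  one validation loss per candidate (lower is better; assumption).
    fold_models: fold_models[i] holds the models of candidate i, one per inner fold.
    """
    ranking = np.argsort(val_losses)  # indices sorted from lowest to highest loss
    top = ranking[:n_best]            # fewer than n_best if there are fewer candidates
    committee = [model for i in top for model in fold_models[i]]
    return ranking, committee


# Toy setup: 20 candidates, each already fitted on 5 inner folds.
rng = np.random.default_rng(0)
n_candidates, nested = 20, 5
val_losses = rng.random(n_candidates)
fold_models = [
    [f"cand{i}_fold{k}" for k in range(nested)] for i in range(n_candidates)
]

ranking, committee = select_committee(val_losses, fold_models)
best_only = fold_models[ranking[0]]  # models of the single best candidate

print(len(committee))  # 50 = 10 best candidates * 5 inner folds
print(len(best_only))  # 5
```

Nothing is refitted in this sketch: the committee only reuses models that already exist, which is where the speed gain comes from.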

If you only want the best architecture, `refit>0` will do this, but it will take longer because the model needs to be refitted `refit` times.