ppdebreuck / modnet

MODNet: a framework for machine learning materials properties
MIT License

`predict` method of `EnsembleMODNetModel` does not correspond to majority vote for classification tasks #225

Open kaueltzen opened 3 weeks ago

kaueltzen commented 3 weeks ago

Hi,

thanks for the nice package!

I have a remark about the `EnsembleMODNetModel` class: the `predict` method always returns the mean of the predictions of the ensemble members:

https://github.com/ppdebreuck/modnet/blob/e14188d3b8a036bba0a1d9c0a3f538dc58c3cd29/modnet/models/ensemble.py#L178

However, for classification tasks, this can result in non-integer labels, so that computing an F1 score from such a prediction fails, for example. If you are interested, I would be happy to implement hard and/or soft voting for classification ensembles here.
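To illustrate the difference, here is a minimal NumPy sketch of the two voting schemes. It does not use the modnet API: `hard_vote`, `soft_vote`, and the array layouts are hypothetical names and assumptions made for this example only.

```python
import numpy as np

def hard_vote(member_labels):
    """Majority vote over integer labels of shape (n_models, n_samples)."""
    member_labels = np.asarray(member_labels)
    n_classes = member_labels.max() + 1
    # counts[c, i] = number of ensemble members that voted class c for sample i
    counts = np.stack([(member_labels == c).sum(axis=0) for c in range(n_classes)])
    # argmax resolves ties in favour of the lowest class index
    return counts.argmax(axis=0)

def soft_vote(member_probs):
    """Average class probabilities of shape (n_models, n_samples, n_classes),
    then pick the most probable class per sample."""
    return np.asarray(member_probs).mean(axis=0).argmax(axis=1)

# Three members, three samples: averaging the labels gives non-integer values.
labels = np.array([[0, 1, 1],
                   [1, 1, 0],
                   [1, 0, 1]])
print(labels.mean(axis=0))   # every entry is 2/3, not a valid class label
print(hard_vote(labels))     # [1 1 1]

# Three members, two samples, two classes.
probs = np.array([[[0.9, 0.1], [0.4, 0.6]],
                  [[0.4, 0.6], [0.4, 0.6]],
                  [[0.4, 0.6], [0.1, 0.9]]])
print(soft_vote(probs))      # [0 1]
```

For the first sample in `probs`, two of the three members prefer class 1, so a hard vote would return 1, while the soft vote returns 0 because one member is very confident in class 0. Which behaviour is preferable depends on how well calibrated the members are.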

ppdebreuck commented 1 week ago

Hi @kaueltzen! This is indeed true, although one could still convert the mean to discrete classes by applying a threshold. But I agree that thresholding makes more sense in combination with `return_prob=True`. So what you suggest (hard and/or soft voting) makes sense! Please go ahead with whatever approach seems best to you, and I am happy to review :)
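For reference, the thresholding workaround mentioned here could look roughly like the sketch below for a binary task. It is plain NumPy and does not call modnet; `threshold_mean` and the 0.5 default are assumptions made for illustration.

```python
import numpy as np

def threshold_mean(member_outputs, threshold=0.5):
    """Average member outputs of shape (n_models, n_samples) and
    map the mean to a 0/1 label using a cut-off."""
    mean = np.asarray(member_outputs, dtype=float).mean(axis=0)
    return (mean >= threshold).astype(int)

# Three members, three samples, binary labels.
outputs = np.array([[0, 1, 1],
                    [1, 1, 0],
                    [0, 0, 1]])
print(threshold_mean(outputs))  # [0 1 1]
```

With 0/1 labels, an odd number of members, and a threshold of 0.5, this gives the same result as a majority vote. It does not extend directly to more than two classes, which is where explicit hard or soft voting is needed.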