FuriouslyCurious opened 8 years ago
Hi, thanks for your interest in our project. While I think that factorization machines would be a helpful model, they should really live in the scikit-learn package. We try not to implement machine learning algorithms ourselves, and are actually working on removing all custom implementations from the auto-sklearn codebase. Having said this, we're open to contributions of models from scikit-learn.
If you instead want to use auto-sklearn with factorization machines, this page of the documentation will guide you through the process of doing so.
IMO it's not a bad idea to implement packages outside of scikit-learn as long as they fit the scikit-learn interface (init, fit, predict/transform, etc.). Arguably, to push the boundaries of AutoML, we will need to implement advanced pipeline operators that aren't supported in scikit-learn.
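To make the interface point concrete, here is a minimal sketch of an estimator that follows the scikit-learn convention. The class and its behavior are made up for illustration (it is not part of scikit-learn or auto-sklearn); the point is only the shape of the API: hyperparameters in `__init__`, learning in `fit`, inference in `predict`.

```python
from collections import Counter

class MajorityClassifier:
    """Toy estimator following the scikit-learn convention.

    Hypothetical example: any third-party model exposing this
    interface (init, fit, predict) can slot into a pipeline.
    """

    def __init__(self, tie_break=None):
        # __init__ only stores hyperparameters; no work happens here.
        self.tie_break = tie_break

    def fit(self, X, y):
        # Learned state gets a trailing underscore by convention.
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self  # fit returns self, per the scikit-learn contract

    def predict(self, X):
        # Predict the majority label for every row in X.
        return [self.majority_ for _ in X]
```

A full scikit-learn-compatible estimator would also implement `get_params`/`set_params` (usually by inheriting from `sklearn.base.BaseEstimator`), but the three methods above are the core contract.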
@rhiever, that is already the case. XGBoost is part of the stack and it's not embedded in scikit-learn, although it does have scikit-learn wrappers.
Yes, XGBoost is in there.
The reason I'm conservative about this is that I'm not sure how to easily maintain dependencies on additional packages. It might just be an issue with the unit tests in auto-sklearn, but this is what holds me back.
Hi AutoML team, I can chip in some code and add more algorithm "building blocks" to autosklearn. For example, I can add Factorization Machines for classification problems.
What is a good place to start with this? Are there any coding conventions you would like me to follow?
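For reference, a second-order factorization machine scores an example as the global bias plus linear terms plus factorized pairwise interactions: y(x) = w0 + Σᵢ wᵢxᵢ + Σᵢ<ⱼ ⟨vᵢ, vⱼ⟩ xᵢxⱼ. A minimal sketch of that prediction function, using the standard linear-time reformulation of the pairwise sum (all names here are illustrative, not from any existing library):

```python
def fm_score(x, w0, w, V):
    """Second-order factorization machine prediction (illustrative sketch).

    x:  feature vector, length n
    w0: global bias
    w:  linear weights, length n
    V:  factor matrix, n rows of k latent factors each

    The pairwise term sum_{i<j} <V[i], V[j]> * x[i] * x[j] is computed
    in O(n * k) via 0.5 * sum_f [(sum_i V[i][f] x[i])^2 - sum_i (V[i][f] x[i])^2].
    """
    n = len(x)
    k = len(V[0]) if n else 0
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))          # (sum_i v_if x_i)
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))  # sum_i (v_if x_i)^2
        pairwise += s * s - s_sq
    return linear + 0.5 * pairwise
```

For classification, this raw score would typically be passed through a sigmoid; the learning step (SGD or ALS over w0, w, V) is omitted here.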