online-ml / chantilly

🍦 Deployment tool for online machine learning models
BSD 3-Clause "New" or "Revised" License

Allow flavors #6

Closed MaxHalford closed 4 years ago

MaxHalford commented 4 years ago

I've been thinking a bit, and I think that the way forward is to allow "flavors". For instance, there could be the following flavors: classification, regression, recommendation, etc. This would allow us to handle fine-grained behavior, validate models, and remove a lot of ifs in the code.
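To make the idea concrete, here is a minimal sketch of what a flavor abstraction could look like (the names and fields are illustrative, not a committed design):

```python
from dataclasses import dataclass
from typing import Any, Callable

# Illustrative sketch only: a flavor bundles the behavior that currently
# requires scattered if-statements (validation + which method to call).
@dataclass
class Flavor:
    name: str
    check_model: Callable[[Any], bool]  # validates an uploaded model
    pred_func: str                      # method to call for predictions

REGRESSION = Flavor(
    name='regression',
    check_model=lambda model: hasattr(model, 'predict_one'),
    pred_func='predict_one',
)

BINARY = Flavor(
    name='binary',
    check_model=lambda model: hasattr(model, 'predict_proba_one'),
    pred_func='predict_proba_one',
)
```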

kevinsteger commented 4 years ago

Is this compatible with multiple models? Could this be a setting per model?

MaxHalford commented 4 years ago

Good question. We've brainstormed with @raphaelsty and @AdilZouitine and we don't think that would be a good idea (at least for the moment). We think it's better to run one instance of chantilly per task: supporting multiple tasks would make things much more complicated on our end, and we like the simplicity of one instance per task.

Do you agree? As I said in another issue, nothing is set in stone yet.
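To illustrate the one-instance-per-task idea, a client would simply talk to a different chantilly instance for each task. The ports and the /api/predict route below are assumptions made for the sake of the example:

```python
import requests

# Assumption: one chantilly instance per task, each on its own port.
REGRESSION_URL = 'http://localhost:5000'
BINARY_URL = 'http://localhost:5001'

# Hypothetical route and payload, just to show the idea.
r = requests.post(f'{BINARY_URL}/api/predict', json={'features': {'age': 42}})
print(r.json())
```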

raphaelsty commented 4 years ago

To me, the philosophy of Chantilly is above all to make it easy to put a creme model into production with a minimum of configuration. I think we have to make sure that we keep the Chantilly API very simple and intuitive for everyone. The concept of flavor is interesting because it lets us separate the different uses of Chantilly, i.e. recommendation has nothing to do with regression.

I still don't see the point of having a single instance of Chantilly serve multiple models with different flavors. I see more benefits in a modular architecture split along flavors: it's easier to maintain, it means less code, and we can pinpoint the problem more quickly when a flavor has a bug. It also allows deploying each model on servers sized for the volume of client requests. Nothing is set in stone, and I think we should keep discussing the advantages and side effects of generalizing a single instance to n flavors and k models.

Raphaël :-)

kevinsteger commented 4 years ago

Just so that I understand correctly, here is my example use case.

My model:

```python
from creme import linear_model, preprocessing

model = preprocessing.StandardScaler()
model |= linear_model.LogisticRegression()
```

What I get out of creme using predict_proba_one is this, which is exactly what I want:

```python
{False: 0.9993760805960461, True: 0.0006239194039538452}
```
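For reference, here is a self-contained snippet that produces that kind of output (the feature values are made up):

```python
from creme import linear_model, preprocessing

model = preprocessing.StandardScaler()
model |= linear_model.LogisticRegression()

# Made-up sample: any dict mapping feature names to values works.
x = {'age': 42, 'income': 50_000}

model = model.fit_one(x, False)     # a single online update
print(model.predict_proba_one(x))   # e.g. {False: 0.99..., True: 0.00...}
```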

Is that going to be possible? As it stands today, when I configure the flavor to regression for this example, /predict returns True/False.

MaxHalford commented 4 years ago

In this case you're doing binary classification, so you need to set the flavor to binary in order to obtain the probabilities you're expecting.

What's happening under the hood is that the flavor determines which prediction function to use. When you set it to regression, then the predict_one method is used. When you set it to binary or multiclass, then the predict_proba_one method is used. Therefore if you set the flavor to regression and you upload a classifier, the predict_one method will be used, even though the classifier has a predict_proba_one method. This probably isn't the clearest approach, but at least everything works as expected as long as you set the flavor correctly.
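In other words, the flavor boils down to a dispatch table. A minimal sketch of the idea (not chantilly's actual code):

```python
# Sketch only: the flavor decides which method is called on the model.
PREDICT_METHODS = {
    'regression': 'predict_one',
    'binary': 'predict_proba_one',
    'multiclass': 'predict_proba_one',
}

def make_prediction(model, x, flavor):
    return getattr(model, PREDICT_METHODS[flavor])(x)

# With flavor='regression', a classifier's predict_one is called even
# though it also has predict_proba_one, hence the True/False you saw.
```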