Closed MaxHalford closed 4 years ago
Is this compatible with the multiple models? Could this be a setting per model?
Good question. We've brainstormed with @raphaelsty and @AdilZouitine and we don't think that would be a good idea (at least for the moment). Supporting multiple tasks would make things much more complicated on our end, and we rather like the idea of running one instance of chantilly per task.
Do you agree? As I said in another issue, nothing is set in stone yet.
To me, the philosophy of Chantilly is above all to be able to put a creme model into production easily and with a minimum of configuration. I think we have to make sure that the Chantilly API stays very simple and intuitive for everyone to use. The concept of flavor is interesting because it allows clustering the different uses of Chantilly, e.g. recommendation has nothing to do with regression.
I still don't see the point of having a single instance of Chantilly serve multiple models with different flavors. I see more benefits in a modular architecture split along flavors: it's easier to maintain, there is less code, and we can identify the problem more quickly when a flavor has a bug. It also allows deploying each model on servers sized for the volume of client requests. Nothing is set in stone, and I think we should keep discussing the advantages and side effects of generalizing a single instance to n flavors and k models.
Raphaël :-)
Just so that I understand correctly, here is my example use case.
My model:
```python
from creme import linear_model, preprocessing

model = preprocessing.StandardScaler()
model |= linear_model.LogisticRegression()
```
What I get out of creme using predict_proba_one is this, which is exactly what I want:

```python
{False: 0.9993760805960461, True: 0.0006239194039538452}
```
Is that going to be possible? As it stands today, when I configure the flavor to regression for this example, /predict returns True/False.
In this case you're doing binary classification, therefore you need to set the flavor to `binary` in order to obtain the predictions you expect.
What's happening under the hood is that the flavor determines which prediction function to use. When you set it to `regression`, the `predict_one` method is used. When you set it to `binary` or `multiclass`, the `predict_proba_one` method is used. Therefore, if you set the flavor to `regression` and you upload a classifier, the `predict_one` method will be used, even though the classifier has a `predict_proba_one` method. This probably isn't the clearest approach, but at least everything works as expected as long as you set the flavor correctly.
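For illustration, here is a minimal sketch of that dispatch, with a stub classifier standing in for a creme model. The `PRED_FUNCS` mapping and `make_prediction` helper are hypothetical names, not chantilly's actual code:

```python
# Hypothetical sketch of how a flavor could select the prediction method.
# This is NOT chantilly's actual implementation.

PRED_FUNCS = {
    'regression': 'predict_one',        # returns a single value
    'binary': 'predict_proba_one',      # returns {label: probability}
    'multiclass': 'predict_proba_one',  # returns {label: probability}
}

def make_prediction(model, flavor, x):
    """Call whichever prediction method the configured flavor dictates."""
    method = getattr(model, PRED_FUNCS[flavor])
    return method(x)

# A stub classifier exposing both methods, as a creme classifier does.
class StubClassifier:
    def predict_proba_one(self, x):
        return {False: 0.9, True: 0.1}

    def predict_one(self, x):
        proba = self.predict_proba_one(x)
        return max(proba, key=proba.get)

model = StubClassifier()
print(make_prediction(model, 'regression', {}))  # → False (a hard label)
print(make_prediction(model, 'binary', {}))      # → {False: 0.9, True: 0.1}
```

This mirrors the behavior described above: with the flavor set to `regression`, a classifier's `predict_one` is called and you get a hard label instead of probabilities.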
I've been thinking a bit, and I think that the way forward is to allow "flavors". For instance there could be the following flavors: classification, regression, recommendation, etc. This would allow us to handle fine-grained behavior, validate models, and remove a lot of ifs in the code.
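To give an idea of how flavors could enable model validation, here is a hedged sketch, with hypothetical names (the real flavor mechanism may look quite different): each flavor declares which method an uploaded model must implement, so a mismatch can be rejected at upload time instead of surfacing at prediction time.

```python
# Hypothetical flavor registry; names are illustrative, not chantilly's API.

FLAVORS = {
    'regression': {'required_method': 'predict_one'},
    'binary': {'required_method': 'predict_proba_one'},
    'multiclass': {'required_method': 'predict_proba_one'},
}

def check_model(model, flavor):
    """Return an error message if the model doesn't fit the flavor, else None."""
    required = FLAVORS[flavor]['required_method']
    if not hasattr(model, required):
        return f"model must implement {required} for flavor '{flavor}'"
    return None

# A stub regressor: it has predict_one but no predict_proba_one.
class Regressor:
    def predict_one(self, x):
        return 42.0

print(check_model(Regressor(), 'regression'))  # → None: the model fits
print(check_model(Regressor(), 'binary'))      # → an error message
```

Centralizing this per-flavor knowledge is what would remove the scattered ifs mentioned above.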