combust / mleap

MLeap: Deploy ML Pipelines to Production
https://combust.github.io/mleap-docs/
Apache License 2.0

MLeap's value for sklearn #430

Open mingmasplace opened 5 years ago

mingmasplace commented 5 years ago

MLeap solves the single-request, low-latency prediction problem for Spark pipelines. A quick test shows that sklearn's native pipeline.predict already has pretty good latency, under 3 ms (though of course it depends on the number of transforms). So why would people want to migrate existing sklearn online prediction to MLeap? Thanks.
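For anyone who wants to reproduce this kind of quick test, here is a minimal sketch of a single-request latency harness. It is not the exact test referred to above: `measure_latency_ms` and `fake_predict` are hypothetical names, and `fake_predict` is only a stand-in so the snippet runs without a trained model.

```python
import statistics
import time


def measure_latency_ms(predict_fn, row, n_requests=1000, warmup=50):
    """Time single-row calls to predict_fn and return latency stats in ms."""
    # Warm-up calls so one-time costs (caches, lazy imports) are not measured.
    for _ in range(warmup):
        predict_fn([row])

    samples = []
    for _ in range(n_requests):
        start = time.perf_counter()
        predict_fn([row])  # one request == a batch containing one row
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    return {
        "p50": statistics.median(samples),
        "p99": samples[int(0.99 * (len(samples) - 1))],
        "max": samples[-1],
    }


# Hypothetical stand-in for a fitted pipeline's predict method.
def fake_predict(rows):
    return [sum(r) > 1.0 for r in rows]


stats = measure_latency_ms(fake_predict, [0.2, 0.5, 0.9])
print(stats)
```

With a real fitted sklearn pipeline, `pipeline.predict` can be passed in place of `fake_predict`. Looking at p99 and max rather than only the average is usually more telling for online serving.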

ancasarb commented 5 years ago

In our use case, we had to support model building/training not just in scikit-learn, but also in Spark and TensorFlow. MLeap helped here because, at scoring time, you only need to worry about monitoring and scaling a single model scoring service. At the same time, with MLeap we expose a unified scoring interface, so clients that integrate with the scoring service don't need to know or worry about whether they're using a Spark model, a scikit-learn model, etc. This makes switching between models, for example in an A/B test, very easy.
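To make the "unified scoring interface" idea concrete, here is a rough sketch in plain Python. It does not use MLeap's API and is not our actual service: `ScoringService`, `register`, `set_ab_split`, and the two lambda "models" are all hypothetical names for illustration. The point is only that clients call one `score` method, and the choice of model (and the A/B split) lives behind it.

```python
import hashlib
from typing import Callable, Dict, List, Sequence, Tuple

# A scorer is any callable mapping a feature vector to a score, regardless
# of which framework trained the underlying model.
Scorer = Callable[[Sequence[float]], float]


class ScoringService:
    """Hypothetical framework-agnostic scoring facade with A/B routing."""

    def __init__(self) -> None:
        self._models: Dict[str, Scorer] = {}
        self._split: List[Tuple[str, float]] = []

    def register(self, name: str, scorer: Scorer) -> None:
        self._models[name] = scorer

    def set_ab_split(self, weights: Dict[str, float]) -> None:
        unknown = set(weights) - set(self._models)
        if unknown:
            raise KeyError(f"unregistered models: {sorted(unknown)}")
        total = sum(weights.values())
        if total <= 0:
            raise ValueError("weights must sum to a positive number")
        # Store normalized weights so they sum to 1.
        self._split = [(name, w / total) for name, w in weights.items()]

    def _pick(self, request_id: str) -> str:
        if not self._split:
            raise RuntimeError("call set_ab_split() before score()")
        # Hash the request id into [0, 1) so routing is deterministic.
        digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
        bucket = (int(digest, 16) % 10_000) / 10_000.0
        cumulative = 0.0
        for name, weight in self._split:
            cumulative += weight
            if bucket < cumulative:
                return name
        return self._split[-1][0]  # guard against float rounding

    def score(self, request_id: str, features: Sequence[float]) -> dict:
        name = self._pick(request_id)
        return {"model": name, "score": self._models[name](features)}


service = ScoringService()
# Stand-ins for models trained in different frameworks (hypothetical).
service.register("sklearn_model", lambda f: sum(f) / len(f))
service.register("spark_model", lambda f: max(f))
service.set_ab_split({"sklearn_model": 0.5, "spark_model": 0.5})
print(service.score("user-42", [0.2, 0.5, 0.9]))
```

Switching all traffic to one model is then a single `set_ab_split({"spark_model": 1.0})` call on the service side, with no change to client code.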

Hope this helps!

mingmasplace commented 5 years ago

Thanks, Anca. So MLeap not only solves the latency issue with Spark, but also provides a unified online scoring service that is agnostic to the ML framework used to build the model. Still, it isn't clear why that is a problem.