jadianes / spark-py-notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
http://jadianes.github.io/spark-py-notebooks

Question: deploying a PySpark MLlib model on Docker, but performance is below expectations #14

Open robotsp opened 5 years ago

robotsp commented 5 years ago

Env: Spark standalone on Docker

Case: a trained PySpark model (random forest) deployed on Docker

Question: When I use gunicorn to start the service (model loading and prediction) and expose it as an API with the Python Flask framework, calls to the API are quite slow.
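To narrow down where the time goes, a minimal sketch like the following could separate model loading from prediction. The names `load_model`, `get_model`, and `timed_predict` are hypothetical, and the loader is a dummy stand-in that only simulates a slow load; it is not the actual PySpark or Flask code from this setup.

```python
import time
from functools import lru_cache


def load_model():
    """Hypothetical stand-in for the expensive step (for example, starting
    Spark and loading the saved random forest). Replace with the real loader."""
    time.sleep(0.2)  # simulate a slow load
    return lambda features: sum(features) > 1.0  # dummy predictor


@lru_cache(maxsize=1)
def get_model():
    """Load the model on first use and reuse it for later calls in this process."""
    return load_model()


def timed_predict(features):
    """Return the prediction and the time spent on loading vs. predicting."""
    t0 = time.perf_counter()
    model = get_model()
    t1 = time.perf_counter()
    result = model(features)
    t2 = time.perf_counter()
    return result, {"load_s": t1 - t0, "predict_s": t2 - t1}


if __name__ == "__main__":
    # The first call pays the load cost; the second reuses the cached model.
    for call in range(2):
        result, timings = timed_predict([0.7, 0.6])
        print(call, result, timings)
```

If the load time dominates on every request, the model is probably being reloaded per call rather than once per worker process; if the predict time dominates, the cost lies in the prediction path itself.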

Could I get your help or any suggestions on Spark model deployment? Thanks!