sayakpaul / ml-deployment-k8s-fastapi

This project shows how to serve an ONNX-optimized image classification model as a web service with FastAPI, Docker, and Kubernetes.
https://medium.com/google-developer-experts/load-testing-tensorflow-serving-and-fastapi-on-gke-411bc14d96b2
Apache License 2.0

Refactor repo with the best FastAPI configs #38

Closed · sayakpaul closed this 2 years ago

sayakpaul commented 2 years ago

8 nodes, each with 2 vCPUs and 4 GB of RAM.
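
For reference, provisioning a GKE cluster with that shape could look something like the sketch below. The cluster name and zone are placeholders, and the machine type is an assumption (e2-medium on GKE provides 2 vCPUs and 4 GB of RAM per node):

```bash
# Hypothetical provisioning command; cluster name, zone, and machine type
# (e2-medium = 2 vCPUs, 4 GB RAM per node) are assumptions, not taken
# from this PR.
gcloud container clusters create fastapi-cluster \
    --zone=us-central1-a \
    --machine-type=e2-medium \
    --num-nodes=8
```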

After this, I will collate your structure from the TFServing repo.

sayakpaul commented 2 years ago

@deep-diver for the purposes of this PR, I think we could ignore the changes related to experimental files under .kube.

sayakpaul commented 2 years ago

@deep-diver worked on the comments. Could you look at the Dockerfile configuration and let me know if I am using the optimal settings for FastAPI (with uvicorn)? I consulted our Google Doc to determine those settings.
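
For context, a minimal sketch of such a Dockerfile follows. The base image, entry point (`main:app`), and port are assumptions rather than the exact contents of this PR; the commonly recommended setup for FastAPI on Kubernetes is a single uvicorn process per container, scaled out via replicas:

```Dockerfile
# Minimal sketch, not the exact file from this PR. Base image,
# entry point (main:app), and port are assumptions.
FROM python:3.8-slim

WORKDIR /app

# Install dependencies first so Docker layer caching can skip this
# step when only application code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8080

# Run one uvicorn process per container; on Kubernetes, scale
# horizontally with Deployment replicas instead of in-container workers.
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]
```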

deep-diver commented 2 years ago

@sayakpaul

Yep, correct. I think you got that right! I guess we are good to merge. We should just leave a note that the k8s cluster should be provisioned with 8 nodes of 2 vCPUs and 4 GB of RAM each.