Closed. appunni-m closed this issue 4 years ago.
@appunni-dishq Thanks for using SageMaker and providing us feedback. We kept the serving and training support together because of limitations in our current design. Our out-of-the-box containers have training and serving functionality bundled, and it's awkward/impossible to use different containers for training and serving with our front-end tools, including sagemaker-python-sdk and the AWS console. That said, it doesn't block us from separating the serving and training dependencies here. It's a good idea, and we will consider this suggestion and keep you updated.
As I went through the entire project, I realized the only thing it lacks is proper documentation. I have already integrated it into our system; currently I am running only the inference code on SageMaker. I believe it has the potential to become a much bigger platform for serving machine learning models in production. I was pleasantly surprised to see the gunicorn/gevent implementation, as I had previously been running a single-threaded development Flask app in production. Dependency loading was the single most difficult thing I faced in building machine learning APIs, and you have solved it beautifully. I see a lot of potential in the inference code alone, since being able to serve models in production matters more than training and then deploying. In my project I have abstracted away most of the code for creating the ECR repository, versioning, endpoint deployment, and endpoint lifecycle management, which reduces cost, since we use these systems for computation and storage rather than serving a frontend-facing client.
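To make the serving point above concrete, here is a minimal sketch of the kind of app the comment contrasts with Flask's single-threaded development server: a plain WSGI callable with a `/ping` health check and an `/invocations` route (the routes SageMaker hosting polls and posts to), which can be run under gunicorn with gevent workers for concurrency. The handler logic and names here are illustrative assumptions, not the container's actual code.

```python
import json

def app(environ, start_response):
    """A tiny WSGI app sketching SageMaker-style serving routes (illustrative only)."""
    path = environ.get("PATH_INFO", "/")
    if path == "/ping":
        # Health-check route: the hosting platform polls this before routing traffic
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]
    if path == "/invocations" and environ.get("REQUEST_METHOD") == "POST":
        # Read the request body; a real handler would run model inference here
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length).decode("utf-8")
        start_response("200 OK", [("Content-Type", "application/json")])
        return [json.dumps({"input": body}).encode("utf-8")]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]

# In production, instead of the single-threaded Flask dev server, serve with
# concurrent gevent workers, e.g.:
#   gunicorn -k gevent -w 4 -b 0.0.0.0:8080 app:app
```

The same idea applies unchanged to a Flask app, since Flask apps are WSGI callables; swapping the dev server for `gunicorn -k gevent` is a deployment change, not a code change.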