aws / deep-learning-containers

AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet.
https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/deep-learning-containers-images.html

[bug] Different libraries versions in pytorch training and inference containers #1122

Open luigift opened 3 years ago

luigift commented 3 years ago

Concise Description: Training and inference images should have the same library versions

DLC image/dockerfile: https://github.com/aws/deep-learning-containers/blob/3b5a6247b033e5ac5abc08282145609853119aac/pytorch/inference/docker/1.8/py3/Dockerfile.cpu#L91

https://github.com/aws/deep-learning-containers/blob/3b5a6247b033e5ac5abc08282145609853119aac/pytorch/training/docker/1.8/py3/Dockerfile.cpu#L94

Current behavior: Different scikit-learn versions

Expected behavior: Identical Python library versions in the training and inference images

tejaschumbalkar commented 1 year ago

Thanks for reporting the issue. I want to understand more about your use case: is it just a recommendation, or does it affect your workflow? In general, I agree with your suggestion. Checking our latest PyTorch DLC, v1.13, we do install the latest version of scikit-learn in both the training and inference DLCs.

luigift commented 1 year ago

Thank you for the reply. It was breaking my workflow, but that workflow is no longer in production. I believe this could still be affecting other users, since the image remains available.

Good to know that the new versions have the same scikit-learn version. Although I understand the advantages of having two Dockerfiles for inference and training, I believe there should be a better mechanism (e.g. unit/functional testing, or inheriting from a parent image) that ensures library compatibility.
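
As a rough illustration of the testing idea, here is a minimal sketch of a version-parity check. It assumes the installed packages of each image have been captured as `pip freeze`-style text (for example by running `pip freeze` inside each container). The function names and the package versions in the example are hypothetical and are not taken from the DLC repository or its Dockerfiles.

```python
from typing import Dict, Tuple


def parse_pins(freeze_output: str) -> Dict[str, str]:
    """Map package name -> pinned version from `pip freeze`-style text."""
    pins = {}
    for line in freeze_output.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries that are not "name==version".
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def version_mismatches(training: str, inference: str) -> Dict[str, Tuple[str, str]]:
    """Return packages present in both images whose pinned versions differ."""
    train_pins = parse_pins(training)
    infer_pins = parse_pins(inference)
    shared = train_pins.keys() & infer_pins.keys()
    return {
        name: (train_pins[name], infer_pins[name])
        for name in sorted(shared)
        if train_pins[name] != infer_pins[name]
    }


# Illustrative inputs only; these versions are made up for the example.
training_freeze = "numpy==1.19.1\nscikit-learn==0.24.2\n"
inference_freeze = "numpy==1.19.1\nscikit-learn==0.23.2\nrequests==2.25.1\n"

mismatches = version_mismatches(training_freeze, inference_freeze)
print(mismatches)
```

A check like this could run in CI and fail the build whenever `mismatches` is non-empty. Packages that exist in only one image are deliberately ignored, since an inference image may legitimately carry serving-only dependencies.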