Closed philschmid closed 2 months ago
Closing this issue, as the PyTorch Training DLC has already been released at `us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-pytorch-training-cu121.2-3.transformers.4-42.ubuntu2204.py310`. Some of the tasks defined in the bullet points are still missing; those will be tackled in separate issues / PRs.
As part of our collaboration with Google Cloud, we want to create dedicated first-party Hugging Face Deep Learning Containers, which make it easy and straightforward to train and deploy models on GKE and Vertex AI.
We want to create a dedicated PyTorch Training container that includes all important Hugging Face Ecosystem libraries, including `transformers`, `datasets`, `evaluate`, `diffusers`, `trl`, `peft`, `sentence-transformers`, and common ML libraries, like `torch`, `scikit-learn`, `tensorboard`, `bitsandbytes`, `deepspeed`, etc., to allow Google users to train models easily.

For this we have the following requirements:
`container.yaml`, which defines which versions should be built. (Read the version and build the correct container based on that information; see https://github.com/huggingface/Google-Cloud-Containers/blob/main/containers/container.yaml)

The implementation should be easily extendable. We aim to keep the Hugging Face DLC versions up to date, secure, and maintained.
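To illustrate the "read the version and build the correct container" step, here is a minimal sketch of how a set of versions could be turned into an image name. The naming pattern is only inferred from the one released URI quoted in the closing comment, and the `TrainingImage` class and its field names are hypothetical; the actual `container.yaml` schema may look quite different.

```python
from dataclasses import dataclass

# Registry prefix taken from the released image URI quoted in this issue.
REGISTRY = "us-docker.pkg.dev/deeplearning-platform-release/gcr.io"


@dataclass(frozen=True)
class TrainingImage:
    """Hypothetical version set for one PyTorch Training DLC build."""

    cuda: str          # e.g. "12.1"
    torch: str         # e.g. "2.3"
    transformers: str  # e.g. "4.42"
    ubuntu: str        # e.g. "22.04"
    python: str        # e.g. "3.10"

    def name(self) -> str:
        # Assumed pattern, inferred from the single released image name:
        # cu<cuda>.<torch>.transformers.<transformers>.ubuntu<ubuntu>.py<python>
        return ".".join(
            [
                "huggingface-pytorch-training-cu" + self.cuda.replace(".", ""),
                self.torch.replace(".", "-"),
                "transformers",
                self.transformers.replace(".", "-"),
                "ubuntu" + self.ubuntu.replace(".", ""),
                "py" + self.python.replace(".", ""),
            ]
        )

    def uri(self) -> str:
        # Full image URI: registry prefix plus the composed image name.
        return f"{REGISTRY}/{self.name()}"


if __name__ == "__main__":
    image = TrainingImage(
        cuda="12.1", torch="2.3", transformers="4.42", ubuntu="22.04", python="3.10"
    )
    print(image.uri())
```

Keeping the versions in one small structure like this means a new release would only need a new set of values, which is one way to read the "easily extendable" requirement.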