huggingface / Google-Cloud-Containers

Hugging Face Deep Learning Containers (DLCs) for Google Cloud
https://hf.co/docs/google-cloud
Apache License 2.0

[Pytorch][GPU][Training] Initial Release #2

Closed: philschmid closed this issue 2 months ago

philschmid commented 9 months ago

As part of our collaboration with Google Cloud, we want to create dedicated first-party Hugging Face Deep Learning Containers (DLCs) that make it straightforward to train and deploy models on GKE and Vertex AI.

We want to create a dedicated PyTorch Training container so that Google Cloud users can train models easily. It should include the key Hugging Face ecosystem libraries (transformers, datasets, evaluate, diffusers, trl, peft, sentence-transformers) as well as common ML libraries such as torch, scikit-learn, tensorboard, bitsandbytes, and deepspeed.
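One way to sanity-check such a container is to list which of the libraries named above are actually installed. The snippet below is only a minimal sketch: the helper `installed_versions` and the `EXPECTED_PACKAGES` list are hypothetical names introduced for illustration, not part of the container or this repository, and the distribution names are assumed to match the library names as written in this issue.

```python
from importlib import metadata

# Distribution names copied from the list in the issue text.
# Assumption: each library is installed under exactly this name.
EXPECTED_PACKAGES = [
    "transformers",
    "datasets",
    "evaluate",
    "diffusers",
    "trl",
    "peft",
    "sentence-transformers",
    "torch",
    "scikit-learn",
    "tensorboard",
    "bitsandbytes",
    "deepspeed",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(EXPECTED_PACKAGES).items():
        print(f"{name:25s} {version or 'MISSING'}")
```

Run inside the container, this prints one line per library with either its version or `MISSING`. It only inspects installed package metadata, so it does not confirm that GPU support works.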

For this we have the following requirements:

The implementation should be easy to extend. We aim to keep the Hugging Face DLC up to date, secure, and actively maintained.

alvarobartt commented 2 months ago

Closing this issue, as the PyTorch Training DLC has already been released at us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-pytorch-training-cu121.2-3.transformers.4-42.ubuntu2204.py310. Some of the tasks defined in the bullet points are still missing, but those will be tackled in separate issues / PRs.
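For scripts that need to pick a container by version, the image name in the URI above can be split into its parts. This is a rough sketch under a stated assumption: it reads the name as `cu<CUDA>.<torch major>-<torch minor>.transformers.<major>-<minor>.ubuntu<release>.py<version>`, which is inferred from this single example and is not a documented naming scheme. The function `parse_dlc_image` is a hypothetical helper.

```python
import re

IMAGE_URI = (
    "us-docker.pkg.dev/deeplearning-platform-release/gcr.io/"
    "huggingface-pytorch-training-cu121.2-3.transformers.4-42.ubuntu2204.py310"
)

# Assumed layout of the image name, inferred from the one URI in this issue.
_NAME_PATTERN = re.compile(
    r"^huggingface-pytorch-training-"
    r"(?P<cuda>cu\d+)\."
    r"(?P<torch_major>\d+)-(?P<torch_minor>\d+)\."
    r"transformers\.(?P<tf_major>\d+)-(?P<tf_minor>\d+)\."
    r"(?P<os>ubuntu\d+)\."
    r"(?P<python>py\d+)$"
)


def parse_dlc_image(uri):
    """Split a training DLC URI into registry path and assumed version fields."""
    *path, name = uri.split("/")
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized image name: {name!r}")
    return {
        "registry_path": "/".join(path),
        "cuda": match["cuda"],
        "torch": f"{match['torch_major']}.{match['torch_minor']}",
        "transformers": f"{match['tf_major']}.{match['tf_minor']}",
        "os": match["os"],
        "python": match["python"],
    }


if __name__ == "__main__":
    for key, value in parse_dlc_image(IMAGE_URI).items():
        print(f"{key}: {value}")
```

Names that do not fit the assumed layout raise `ValueError` rather than returning partial results, so a change in the naming convention fails loudly.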