[Pytorch][GPU][Training] Initial Release

As part of our collaboration with Google Cloud, we want to create dedicated first-party Hugging Face Deep Learning Containers (DLCs) that make it straightforward to train and deploy models on GKE and Vertex AI.

We want to create a dedicated PyTorch training container that includes the key Hugging Face ecosystem libraries (transformers, datasets, evaluate, diffusers, trl, peft, sentence-transformers) and common ML libraries (torch, scikit-learn, tensorboard, bitsandbytes, deepspeed, etc.), so that Google Cloud users can train models easily.

For this, we have the following requirements:

- [ ] #3
- [ ] #4
- [ ] Maintain a `container.yaml` that defines which versions should be built. The build should read the versions from this file and build the matching containers; see https://github.com/huggingface/Google-Cloud-Containers/blob/main/containers/container.yaml
- [ ] Have a "script" (can be Python) to build the containers.
  - [ ] Automatically generated Markdown table listing which containers are available
  - [ ] Documentation on how to add and release new containers
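
To make the last few requirements more concrete, here is a minimal sketch of what the build "script" could look like. It is only an illustration under stated assumptions: the `ContainerSpec` fields, the image naming scheme, and the table columns are all hypothetical and are not taken from the actual `container.yaml`. In a real script the specs would be parsed from `container.yaml`; here they are constructed in memory so that the example stays self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerSpec:
    """One buildable container. All field names are hypothetical."""

    framework: str    # e.g. "pytorch"
    accelerator: str  # e.g. "gpu"
    task: str         # e.g. "training"
    version: str      # framework version to build (placeholder below)


def image_tag(spec: ContainerSpec) -> str:
    """Derive an image tag from a spec. The naming scheme is an assumption."""
    return f"huggingface-{spec.framework}-{spec.task}-{spec.accelerator}:{spec.version}"


def build_command(spec: ContainerSpec, context_dir: str = ".") -> list[str]:
    """Return the `docker build` invocation for a spec as an argument list."""
    return ["docker", "build", "--tag", image_tag(spec), context_dir]


def markdown_table(specs: list[ContainerSpec]) -> str:
    """Render the available containers as a Markdown table."""
    rows = [
        "| Framework | Accelerator | Task | Version | Image tag |",
        "| --- | --- | --- | --- | --- |",
    ]
    for spec in specs:
        rows.append(
            f"| {spec.framework} | {spec.accelerator} | {spec.task} "
            f"| {spec.version} | `{image_tag(spec)}` |"
        )
    return "\n".join(rows)


if __name__ == "__main__":
    # Placeholder entry; a real script would load these from container.yaml.
    specs = [ContainerSpec("pytorch", "gpu", "training", "x.y.z")]
    for spec in specs:
        # Print the command instead of running it, so the sketch has no side effects.
        print(" ".join(build_command(spec)))
    print(markdown_table(specs))
```

Keeping the spec, the build command, and the table generation as separate functions means a new container only requires a new entry in the config, which fits the goal of an easily extendable implementation.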

The implementation should be easy to extend. We aim to provide up-to-date, secure, and actively maintained versions of the Hugging Face DLCs.

