GoogleCloudPlatform / ai-on-gke

AI on GKE is a collection of examples, best-practices, and prebuilt solutions to help build, deploy, and scale AI Platforms on Google Kubernetes Engine
Apache License 2.0

[WIP] Add HuggingFace support for automated inference checkpoint conversion #712

Open vivianrwu opened 2 weeks ago

vivianrwu commented 2 weeks ago

This PR adds the following support:

  1. Adds HuggingFace CLI support to the checkpoint conversion script
  2. Adds checkpoint conversion support for Llama-2, Llama-3, and Gemma models in JetStream PyTorch
  3. Enables checkpoint conversion with quantization
  4. Enables uploading tokenizer.model to the GSBucket
  5. Adds and renames arguments in the checkpoint conversion script

This PR is currently blocked until the JetStream PyTorch containers are updated to v0.2.3.
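
For illustration only, items 1 and 4 could be sketched roughly as below. This is not the code in this PR: the helper names, the `google/gemma-7b` repo ID, the local directory, and the bucket and prefix are all hypothetical. The sketch only builds the command lines (using the existing `huggingface-cli download` and `gcloud storage cp` commands) and prints them, so it runs without network access or credentials.

```python
import shlex
from typing import List


def build_download_cmd(repo_id: str, local_dir: str) -> List[str]:
    """Build a `huggingface-cli download` command for a model repo.

    Hypothetical helper; the actual conversion script may call the CLI differently.
    """
    return ["huggingface-cli", "download", repo_id, "--local-dir", local_dir]


def build_tokenizer_upload_cmd(local_dir: str, bucket: str, prefix: str) -> List[str]:
    """Build a `gcloud storage cp` command that copies tokenizer.model to a bucket.

    Assumes the tokenizer sits at <local_dir>/tokenizer.model after the download.
    """
    src = f"{local_dir.rstrip('/')}/tokenizer.model"
    dest = f"gs://{bucket}/{prefix.strip('/')}/tokenizer.model"
    return ["gcloud", "storage", "cp", src, dest]


if __name__ == "__main__":
    # Example values are placeholders, not defaults from the PR.
    commands = [
        build_download_cmd("google/gemma-7b", "/tmp/ckpt"),
        build_tokenizer_upload_cmd("/tmp/ckpt", "my-bucket", "gemma-7b/input"),
    ]
    for cmd in commands:
        # Print instead of executing so the sketch has no side effects.
        print(shlex.join(cmd))
```

To actually run the commands, each list could be passed to `subprocess.run(cmd, check=True)`, assuming both CLIs are installed and authenticated in the container.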

liurupeng commented 1 week ago

/gcbrun