huggingface / optimum-tpu

Google TPU optimizations for transformers models
Apache License 2.0

Update colab examples #86

Closed · wenxindongwork closed this 3 months ago

wenxindongwork commented 3 months ago

Use transformers' AutoModelForCausalLM instead of optimum-tpu's AutoModelForCausalLM for fine-tuning.

The optimum.tpu version of AutoModelForCausalLM imports models that are specifically optimized for inference. While the colab example works for smaller models, it fails with an HBM OOM error for llama3-70b (on a v4-256). Changing the import statement as sketched below solved the problem.
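A minimal sketch of the import change described above; the model id and the `from_pretrained` call are illustrative, not taken from the PR diff:

```python
# Before: optimum-tpu's wrapper, which loads an inference-optimized model
# from optimum.tpu import AutoModelForCausalLM

# After: the stock transformers class, which supports fine-tuning
from transformers import AutoModelForCausalLM

# Illustrative model id; the PR concerns llama3-70b on a v4-256
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-70B")
```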


wenxindongwork commented 3 months ago

Just updated the examples to load the models in bf16 instead; hope that works!
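For reference, a hedged sketch of what loading in bf16 looks like with transformers (the exact call in the updated notebooks may differ):

```python
import torch
from transformers import AutoModelForCausalLM

# Loading weights in bfloat16 halves HBM use relative to float32,
# which helps avoid the OOM seen with llama3-70b
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-70B",  # illustrative model id
    torch_dtype=torch.bfloat16,
)
```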

HuggingFaceDocBuilderDev commented 3 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

tengomucho commented 3 months ago

The code style workflow is failing. Can you run it locally (`make style`) and push again so we can merge this, please?

wenxindongwork commented 3 months ago

Just installed ruff and ran `make style`. Thanks!