huggingface / optimum-neuron

Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.
Apache License 2.0

Dealing with custom tokens using the provided example notebook #637

Closed BaiqingL closed 3 months ago

BaiqingL commented 3 months ago

System Info

Standard trn1n.32xlarge instance with the huggingface AMI image.

Who can help?

No response

Information

Tasks

Reproduction (minimal, reproducible, runnable)

Use the Llama 2 fine-tuning notebook provided in the repo, but with Llama 3 instead and 3 custom tokens added to the tokenizer. During the pre-compilation stage it runs into a division-by-8 error. Issue #175 seems related, but I'm not sure how to modify the provided notebook so that it will work here.
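For context, a sketch of why the error occurs (the token names below are hypothetical placeholders, not from the notebook): Llama 3 ships a 128256-token vocabulary, and adding 3 tokens produces an embedding table that can no longer be split evenly across a tensor-parallelism degree of 8.

```python
# Hedged sketch of the failure mode, assuming Llama 3's stock
# vocabulary size of 128256 (len(tokenizer) before any additions).
base_vocab = 128256

# Hypothetical custom tokens added for fine-tuning.
custom_tokens = ["<tok_a>", "<tok_b>", "<tok_c>"]

new_vocab = base_vocab + len(custom_tokens)  # 128259

# With tensor parallelism of degree 8, the embedding rows must be
# sharded evenly across 8 NeuronCores; 128259 leaves a remainder.
print(new_vocab % 8)  # non-zero -> cannot shard the embedding evenly
```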

Expected behavior

The notebook runs as intended with the custom tokens added.

BaiqingL commented 3 months ago

I get that it's a tensor parallelism setting and I can modify that, but what is the standard or proper way of dealing with token embedding sizes that aren't divisible by the TP degree?
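One common workaround (a sketch, not necessarily the approach the notebook's maintainers recommend) is to pad the embedding table up to the next multiple of the TP degree rather than lowering the TP degree itself. `transformers`' `resize_token_embeddings` accepts a `pad_to_multiple_of` argument for exactly this; the helper below shows the arithmetic:

```python
import math

def padded_vocab_size(vocab_size: int, added_tokens: int, tp_degree: int = 8) -> int:
    """Smallest embedding size >= vocab_size + added_tokens that is
    evenly divisible by the tensor-parallelism degree."""
    new_size = vocab_size + added_tokens
    return math.ceil(new_size / tp_degree) * tp_degree

# Llama 3: 128256 base tokens + 3 custom tokens -> pad 128259 up to 128264.
target = padded_vocab_size(128256, 3)
print(target, target % 8)  # 128264 0

# With a loaded model and tokenizer, the equivalent call would be:
#   tokenizer.add_tokens(["<tok_a>", "<tok_b>", "<tok_c>"])  # hypothetical tokens
#   model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=8)
```

The padding rows are never produced by the tokenizer, so they are inert at inference time; they only exist so the embedding can be sharded evenly across the NeuronCores.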