replit / ReplitLM

Inference code and configs for the ReplitLM model family
https://huggingface.co/replit
Apache License 2.0
923 stars, 77 forks

Fix triton example in readme #5

Closed: tanmay-bakshi closed this 1 year ago

tanmay-bakshi commented 1 year ago

The input token IDs should have dtype long (int64), not bfloat16, when using the Triton attention implementation: they are fed to an embedding layer, which looks up rows by integer index.
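To illustrate the point about the embedding layer, here is a minimal standard-library sketch rather than the repository's code. It assumes an embedding lookup can be modelled as row indexing into a table, and it simulates bfloat16 rounding by keeping the top 16 bits of a float32. The helpers `to_bfloat16` and `embed` are hypothetical names introduced only for this example. It suggests two reasons the IDs should stay integral: a float is not a valid index, and bfloat16 cannot represent most token IDs above 256 exactly.

```python
import struct


def to_bfloat16(value: float) -> float:
    """Round a number to the nearest bfloat16 value (ties to even).

    Hypothetical helper: bfloat16 is modelled as the top 16 bits of an
    IEEE-754 float32, and the result is returned as a Python float.
    """
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Add a bias so that dropping the low 16 bits rounds to nearest even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + rounding_bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def embed(table, token_ids):
    """Toy embedding lookup: each token ID selects one row of the table."""
    return [table[token_id] for token_id in token_ids]


# A toy embedding table with 4 rows (vocabulary size 4) and 2 columns.
table = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]

# Integer IDs index the table directly.
rows = embed(table, [2, 0])

# Float IDs are rejected by the lookup, mirroring the dtype requirement.
try:
    embed(table, [2.0, 0.0])
    float_ids_accepted = True
except TypeError:
    float_ids_accepted = False

# Even before the lookup, casting an ID to bfloat16 can change its value.
small_id = to_bfloat16(255.0)  # fits in 8 significant bits, unchanged
large_id = to_bfloat16(257.0)  # needs 9 significant bits, gets rounded
```

In PyTorch the same constraint shows up as `torch.nn.Embedding` expecting integer index tensors, so a reasonable approach is to cast only the model weights to bfloat16 and leave the token ID tensor as an integer type.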

pirroh commented 1 year ago

Thanks for catching this, Tanmay! Merging your fix here and also mirroring it to our HuggingFace repo.