google/maxtext
A simple, performant and scalable Jax LLM!
Apache License 2.0
Refactor permute and unpermute operations
#714
Closed
RissyRan closed 3 weeks ago
RissyRan commented 1 month ago
Description
Refactor the permute and unpermute operations to improve performance.
Update rope_max_timescale to match the HF config from Mistral AI for both Mistral & Mixtral (thanks @ZhiyuLi-goog for bringing it up).
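For context, in a mixture-of-experts layer "permute" usually means grouping tokens so that those routed to the same expert sit next to each other, and "unpermute" restores the original token order afterwards. The following is only a minimal sketch of that idea, not the code changed in this PR. It assumes top-1 routing and uses NumPy rather than JAX for brevity, and the function names and `num_experts` parameter are hypothetical.

```python
import numpy as np


def permute(tokens, expert_ids, num_experts):
    """Group token rows so tokens routed to the same expert are contiguous.

    Hypothetical helper for illustration; assumes one expert per token.
    """
    # A stable sort keeps the original relative order of tokens within each expert.
    sort_idx = np.argsort(expert_ids, kind="stable")
    permuted = tokens[sort_idx]
    # Number of tokens assigned to each expert, in expert order.
    group_sizes = np.bincount(expert_ids, minlength=num_experts)
    return permuted, sort_idx, group_sizes


def unpermute(expert_outputs, sort_idx):
    """Scatter rows back to the original token order."""
    # argsort of a permutation yields its inverse permutation.
    inverse_idx = np.argsort(sort_idx)
    return expert_outputs[inverse_idx]


# Usage: 6 tokens with hidden size 2, routed to 3 experts.
tokens = np.arange(12, dtype=np.float32).reshape(6, 2)
expert_ids = np.array([2, 0, 1, 0, 2, 1])
permuted, sort_idx, group_sizes = permute(tokens, expert_ids, num_experts=3)
restored = unpermute(permuted, sort_idx)
```

Sorting once and reusing `sort_idx` for the inverse step means the round trip is exact, so `restored` matches `tokens` row for row.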
Test
Test locally: link