speed1313/jax-llm
JAX implementation of large language models. You can train a GPT-2-like model with 青空文庫 (the aozora bunko-clean dataset) or any other text dataset.
https://speed1313.github.io/posts/llm-from-scratch/
MIT License
10 stars · 2 forks
distributed training #3 (Open)
speed1313 opened 6 months ago
speed1313 commented 6 months ago
Reference
https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/scaling/JAX/overview.html
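The linked tutorial introduces JAX's device-mesh sharding APIs for scaling training. As a minimal, illustrative sketch of the data-parallel pattern it covers (the shapes and the `forward` function here are hypothetical, not taken from this repo), you can replicate parameters across devices while sharding the batch dimension:

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# 1-D mesh over all available devices (a single device on a CPU-only host).
mesh = Mesh(np.array(jax.devices()), axis_names=("data",))

batch_sharding = NamedSharding(mesh, P("data"))  # split batch dim across devices
replicated = NamedSharding(mesh, P())            # full copy of params on each device

# Illustrative parameters and batch; a real model would use its own pytree.
params = jax.device_put(jnp.ones((4, 4)), replicated)
batch = jax.device_put(jnp.ones((8, 4)), batch_sharding)

@jax.jit
def forward(params, batch):
    # jit compiles a single program; XLA inserts the cross-device
    # communication implied by the input shardings.
    return batch @ params

out = forward(params, batch)
print(out.shape)  # (8, 4)
```

With sharded inputs, `jax.jit` partitions the computation automatically, so the same code runs unchanged on one device or many.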
speed1313
commented
5 months ago
Tensor parallelism, pipeline parallelism, FSDP
Reference
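Of the strategies listed above, FSDP maps most directly onto JAX's sharding API: instead of replicating parameters, each weight is itself split across the mesh axis. A minimal sketch (the weight shape here is illustrative, not from the repo):

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

mesh = Mesh(np.array(jax.devices()), axis_names=("data",))

# FSDP-style: shard the weight along its first axis across the "data"
# mesh axis, so each device holds only a slice of the parameters.
w = jax.device_put(jnp.ones((8, 16)), NamedSharding(mesh, P("data", None)))
print(w.shape)  # logical shape is unchanged; storage is partitioned
```

Tensor parallelism uses the same mechanism with a sharding spec over the feature axes instead, while pipeline parallelism splits the model by layers and typically needs explicit stage scheduling rather than a sharding annotation alone.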