google/maxtext
A simple, performant and scalable Jax LLM!
Apache License 2.0
1.44k stars
263 forks
issues
Enable expert parallelism for dropping strategy
#869
RissyRan
opened
9 hours ago
0
Unable to recover after checkpoint saving
#868
peregilk
opened
14 hours ago
0
Make running preflight optional in model scripts
#867
raymondzouu
closed
1 day ago
0
add logging statement
#866
bernardhan33
opened
2 days ago
0
Cannot see multiple GPUs when using Slurm (with proposed fix)
#865
gabeweisz
opened
2 days ago
0
Converting LLama3.1 405B checkpoint - Requesting multipass checkpoint conversion
#864
shivajid
opened
3 days ago
1
Add MaxText run name to TensorBoard file directory
#863
bvandermoon
closed
3 days ago
0
Improve tfds perf in multihost env
#862
aireenmei
opened
4 days ago
0
Fix circ storage check for delayed case
#861
gobbleturk
closed
1 day ago
0
Add load balance loss
#860
RissyRan
closed
3 days ago
0
RA update works for all axes orders
#859
patemotter
closed
1 week ago
0
Add simple MLP decoder block
#858
gobbleturk
closed
1 week ago
0
Delay Activation Forwarding
#857
gobbleturk
closed
1 week ago
1
added run_name_prefix to tensorboard
#856
kyle-google
opened
1 week ago
1
Temporarily pin google-cloud-aiplatform to 1.61.0
#855
bvandermoon
closed
1 week ago
0
[DRAFT] Add In Memory Changes for Pathways
#854
SujeethJinesh
opened
1 week ago
0
Fix kernel imports
#853
gobbleturk
closed
1 week ago
0
Add node attributes to the training benchmark
#852
bernardhan33
closed
1 week ago
0
Fix kernel imports
#851
gobbleturk
closed
1 day ago
1
Add node attributes; Fix GCS upload; Add checkpointID to checkpointing workload
#850
bernardhan33
closed
1 week ago
1
aqtp release 0.8.0 breaking dependencies
#849
bernardhan33
closed
1 week ago
1
documenting XLA flags used by MaxText
#848
nhira
closed
1 day ago
1
mlperf gpt3 ckpt permission issues
#847
gramesh-amd
opened
1 week ago
7
Add Llama2 config for v5p
#846
raymondzouu
closed
3 days ago
0
Adding Mixtral-8x22b
#845
rdyro
closed
1 day ago
1
How to load tfrecords from local file system for Mlperf training?
#844
gramesh-amd
closed
1 week ago
3
Add Gemma2-27b
#843
ZhaoyueCheng
closed
1 week ago
0
Optimize overhead right before the first train_step
#842
ZhiyuLi-goog
closed
1 week ago
0
Add dispatch and combine masks for dropping
#841
RissyRan
closed
1 week ago
1
Mlperf/4.1 grain
#840
aireenmei
opened
2 weeks ago
1
[mlperf/4.1] enable shard_in_read for large scaling training
#839
ZhiyuLi-goog
closed
2 weeks ago
1
Llama3.1 (8B,70B) 🦙
#838
khatwanimohit
opened
2 weeks ago
3
script to convert llama, mistral, mixtral checkpoints to huggingface format
#837
jwyang-google
opened
2 weeks ago
0
[gcs-team] GCS Checkpointing benchmark feature updates
#836
MattIrv
closed
2 weeks ago
1
Adds ragged attention.
#835
patemotter
closed
1 week ago
0
Integrate Badput monitoring with MaxText
#834
dipannita08
opened
2 weeks ago
0
Add dropping strategy
#833
RissyRan
closed
2 weeks ago
3
add kl divergence for forward_pass_logit_checker
#832
ZhaoyueCheng
closed
2 weeks ago
1
Standalone checkpoint write seems to have memory leak
#831
bernardhan33
opened
2 weeks ago
0
Add support for local sliding window attention in TPU splash_attention
#830
gagika
closed
2 weeks ago
0
converting Gemma maxtext compatible checkpoint to Hugging Face format
#829
salrowili
opened
3 weeks ago
1
Report hyperparameters from the distributed training benchmark workload
#828
bernardhan33
closed
3 weeks ago
0
<Do not merge> Update and rename 1024b.sh to v5p-12288.sh
#827
Obliviour
opened
3 weeks ago
0
Support AoT in 16-vm GPU Llama2 train script
#826
jonb377
closed
3 weeks ago
0
Removing the registration of the proxy backend used by Pathways.
#825
lukebaumann
closed
3 weeks ago
0
Update NCCL flags for A3 Mega with the network release of 6/27.
#824
yangyuwei
opened
3 weeks ago
0
[MLPerf][GPT3] Bypass setting eval_interval in using synthetic dataset
#823
ZhiyuLi-goog
closed
3 weeks ago
0
Add instruction for Mixtral
#822
RissyRan
closed
3 weeks ago
0
new features with distributed training framework
#821
bernardhan33
closed
3 weeks ago
0
chore: format the README table
#820
DemoYeti
opened
3 weeks ago
0