AI-Hypercomputer/maxtext
A simple, performant and scalable Jax LLM!
Apache License 2.0
1.47k stars
275 forks
issues
Enable expert parallelism for dropping strategy
#869
RissyRan
closed
2 weeks ago
0
Unable to recover after checkpoint saving
#868
peregilk
opened
2 weeks ago
2
Make running preflight optional in model scripts
#867
raymondzouu
closed
2 weeks ago
0
test code to produce Lab Notes - 2024-09-07.ipynb
#866
bernardhan33
opened
2 weeks ago
0
Cannot see multiple GPUs when using Slurm (with proposed fix)
#865
gabeweisz
opened
2 weeks ago
0
Converting LLama3.1 405B checkpoint - Requesting multipass checkpoint conversion
#864
shivajid
closed
1 week ago
3
Add MaxText run name to TensorBoard file directory
#863
bvandermoon
closed
3 weeks ago
0
Improve tfds perf in multihost env
#862
aireenmei
closed
2 weeks ago
0
Fix circ storage check for delayed case
#861
gobbleturk
closed
2 weeks ago
0
Add load balance loss
#860
RissyRan
closed
3 weeks ago
0
RA update works for all axes orders
#859
patemotter
closed
3 weeks ago
0
Add simple MLP decoder block
#858
gobbleturk
closed
3 weeks ago
0
Delay Activation Forwarding
#857
gobbleturk
closed
3 weeks ago
1
added run_name_prefix to tensorboard
#856
kyle-google
closed
2 weeks ago
1
Temporarily pin google-cloud-aiplatform to 1.61.0
#855
bvandermoon
closed
3 weeks ago
0
[DRAFT] Add In Memory Changes for Pathways
#854
SujeethJinesh
opened
3 weeks ago
0
Fix kernel imports
#853
gobbleturk
closed
3 weeks ago
0
Add node attributes to the training benchmark
#852
bernardhan33
closed
3 weeks ago
0
Fix kernel imports
#851
gobbleturk
closed
2 weeks ago
1
Add node attributes; Fix GCS upload; Add checkpointID to checkpointing workload
#850
bernardhan33
closed
4 weeks ago
1
aqtp release 0.8.0 breaking dependencies
#849
bernardhan33
closed
4 weeks ago
1
documenting XLA flags used by MaxText
#848
nhira
closed
2 weeks ago
1
mlperf gpt3 ckpt permission issues
#847
gramesh-amd
closed
2 weeks ago
11
Add Llama2 config for v5p
#846
raymondzouu
closed
3 weeks ago
0
Adding Mixtral-8x22b
#845
rdyro
closed
2 weeks ago
2
How to load tfrecords from local file system for Mlperf training?
#844
gramesh-amd
closed
4 weeks ago
3
Add Gemma2-27b
#843
ZhaoyueCheng
closed
3 weeks ago
0
Optimize overhead right before the first train_step
#842
ZhiyuLi-goog
closed
1 month ago
0
Add dispatch and combine masks for dropping
#841
RissyRan
closed
3 weeks ago
1
Mlperf/4.1 grain
#840
aireenmei
opened
1 month ago
1
[mlperf/4.1] enable shard_in_read for large scaling training
#839
ZhiyuLi-goog
closed
1 month ago
1
Llama3.1 (8B,70B,405B) 🦙
#838
khatwanimohit
opened
1 month ago
3
script to convert llama, mistral, mixtral checkpoints to huggingface format
#837
jwyang-google
closed
1 week ago
2
[gcs-team] GCS Checkpointing benchmark feature updates
#836
MattIrv
closed
1 month ago
1
Adds ragged attention.
#835
patemotter
closed
4 weeks ago
0
Integrate Badput monitoring with MaxText
#834
dipannita08
closed
6 days ago
0
Add dropping strategy
#833
RissyRan
closed
1 month ago
3
add kl divergence for forward_pass_logit_checker
#832
ZhaoyueCheng
closed
1 month ago
1
Standalone checkpoint write seems to have memory leak
#831
bernardhan33
opened
1 month ago
1
Add support for local sliding window attention in TPU splash_attention
#830
gagika
closed
1 month ago
0
converting Gemma maxtext compatible checkpoint to Hugging Face format
#829
salrowili
opened
1 month ago
3
Report hyperparameters from the distributed training benchmark workload
#828
bernardhan33
closed
1 month ago
0
<Do not merge> Update and rename 1024b.sh to v5p-12288.sh
#827
Obliviour
opened
1 month ago
0
Support AoT in 16-vm GPU Llama2 train script
#826
jonb377
closed
1 month ago
0
Removing the registration of the proxy backend used by Pathways.
#825
lukebaumann
closed
1 month ago
0
Update NCCL flags for A3 Mega with the network release of 6/27.
#824
yangyuwei
opened
1 month ago
0
[MLPerf][GPT3] Bypass setting eval_interval in using synthetic dataset
#823
ZhiyuLi-goog
closed
1 month ago
0
Add instruction for Mixtral
#822
RissyRan
closed
1 month ago
0
new features with distributed training framework
#821
bernardhan33
closed
1 month ago
0
chore: format the README table
#820
DemoYeti
opened
1 month ago
0