google/maxtext
A simple, performant and scalable Jax LLM!
Apache License 2.0
1.39k stars
247 forks
issues
Add convergence tests on A3 GPU
#746
michelle-yooh
closed
3 weeks ago
0
Fix simple test step count
#745
gobbleturk
closed
3 weeks ago
0
[DON'T MERGE] GCS Distributed Training Benchmark Infra + File-parallelism + Range-read Parquet files
#744
bernardhan33
opened
3 weeks ago
0
Install Transformer Engine from github for stable and nightly mode
#743
michelle-yooh
closed
6 days ago
0
Adding support for creating maxtext docker images with jax-ss
#742
parambole
closed
5 days ago
0
Clean up MoE brute force implementation
#741
RissyRan
closed
3 weeks ago
0
Implement restore with Orbax emergency checkpoint manager
#740
xuefgu
closed
3 weeks ago
0
Handle cases where memstats are not available for the device.
#739
lukebaumann
closed
2 weeks ago
0
Support eval dataset and refactor
#738
aireenmei
closed
3 weeks ago
0
Adding option for int4 quantization to kvcache.
#737
singh-mitali
closed
3 weeks ago
1
Support target masking (aka loss masking or label masking) for SFT datasets
#736
jmschndev
opened
3 weeks ago
0
Inconsistent code formatting
#735
jmschndev
opened
3 weeks ago
0
`hf_access_token` only effective for loading gated datasets, not gated tokenizers
#734
jmschndev
opened
3 weeks ago
0
Gemma 2 support
#733
borisdayma
opened
3 weeks ago
2
Fix and protect simple_layer
#732
gobbleturk
closed
3 weeks ago
0
Load/Save Aqt quantized checkpoint.
#731
singh-mitali
closed
3 weeks ago
0
Support partial overrides for logical_axis_rules.
#730
golechwierowicz
closed
2 weeks ago
2
Allow owners to have any approver
#729
gobbleturk
closed
3 weeks ago
0
LLama3-70B model config
#728
khatwanimohit
closed
1 week ago
0
Prefill return first token
#727
jwyang-google
closed
3 weeks ago
0
Add Llama2 7B, 13B high performance training configs
#726
raymondzouu
closed
3 weeks ago
0
Enable loading and saving quantized decode checkpoint.
#725
singh-mitali
closed
3 weeks ago
4
Update the dependencies to prepare for integration of emergency checkpointing
#724
xuefgu
closed
4 weeks ago
0
Fix Mesh setup for multiprocess CPUs.
#723
RoshaniN
closed
4 weeks ago
0
Allow owners to have any approver
#722
gobbleturk
closed
3 weeks ago
0
Use dedicated requirements for GPUs
#721
xuefgu
closed
4 weeks ago
0
Enable saving using Orbax's emergency checkpoint manager
#720
xuefgu
closed
3 weeks ago
0
Add mistral tokenizer to maxtext/assets
#719
vipannalla
closed
4 weeks ago
0
Add mixtral tokenizer to maxtext/assets
#718
vipannalla
closed
1 month ago
0
Move Stage to second axis
#717
gobbleturk
closed
1 month ago
0
fix data loading from HF hub
#716
aireenmei
closed
1 month ago
0
Try a token_path relative to the base config path if it can't be found
#715
obrienadam
closed
4 weeks ago
0
Refactor permute and unpermute operations
#714
RissyRan
closed
3 weeks ago
0
Add maxengine_server configs to base.yml
#713
gobbleturk
closed
1 month ago
0
change the sweeping to prefill and ar cache only
#712
morgandu
closed
1 month ago
0
Eval on C4?
#711
tjingrant
closed
3 weeks ago
1
Explicitly specify tokenizer_path for pipeline tests
#710
gobbleturk
closed
1 month ago
0
Allow Different Compute Layout for Attention
#709
morgandu
closed
1 month ago
0
Allow Quantize KV Cache over Multiple Dimensions
#708
morgandu
closed
4 weeks ago
1
MaxText package
#707
khatwanimohit
closed
1 month ago
0
Sharding the llama2 70b on v5e-16 more efficiently.
#706
zhihaoshan-google
closed
1 month ago
2
Add FSDP + Megablox
#705
RissyRan
closed
1 month ago
0
Update MaxText config for Llama2 7B on GPUs.
#704
yangyuwei
opened
1 month ago
0
Add compute_axis_order
#703
morgandu
closed
1 month ago
0
fix tfds instruction and some typos
#702
aireenmei
closed
1 month ago
0
Circular Pipelining
#701
gobbleturk
closed
1 month ago
0
initial commit
#700
Bslabe123
closed
1 month ago
0
Pipeline parallelism (Linear only)
#699
gobbleturk
closed
1 month ago
2
Save save linear
#698
gobbleturk
closed
1 month ago
0
Enable async checkpointing for GPU.
#697
yangyuwei
closed
4 weeks ago
1