google/maxtext
A simple, performant and scalable Jax LLM!
Apache License 2.0
1.37k stars
241 forks
issues
Newest
https://us-python.pkg.dev/gce-ai-infra/maxtext-build-support-packages/simple/ not public
#758
emergenz
opened
6 hours ago
0
Enable quantization for MoE
#757
RissyRan
opened
20 hours ago
0
Fix decode.py to also use first_token from prefill_call
#756
vipannalla
closed
20 hours ago
0
Fix decode.py to also use first_token from prefill_call
#755
vipannalla
closed
21 hours ago
0
Fix decode.py to also use first_token from prefill_call
#754
vipannalla
closed
21 hours ago
1
Revert "Prefill return first token"
#753
RissyRan
closed
23 hours ago
0
How to implement 1F1B pipeline parallelism in Jax?
#752
MoFHeka
opened
1 day ago
0
Add moe perf number
#751
RissyRan
closed
19 hours ago
0
Fix validation error for other models
#750
RissyRan
closed
5 days ago
0
Integrate goodput monitor
#749
dipannita08
opened
6 days ago
0
Fix broken ReadMe getting started links
#748
gobbleturk
opened
6 days ago
0
Update tile size
#747
RissyRan
closed
6 days ago
0
Add convergence tests on A3 GPU
#746
michelle-yooh
closed
6 days ago
0
Fix simple test step count
#745
gobbleturk
closed
6 days ago
0
[DON'T MERGE] GCS Distributed Training Benchmark Infra + File-parallelism + Range-read Parquet files
#744
bernardhan33
opened
1 week ago
0
Install Transformer Engine from github for stable and nightly mode
#743
michelle-yooh
opened
1 week ago
0
Adding support for creating maxtext docker images with jax-ss
#742
parambole
opened
1 week ago
0
Clean up MoE brute force implementation
#741
RissyRan
closed
6 days ago
0
Implement restore with Orbax emergency checkpoint manager
#740
xuefgu
closed
6 days ago
0
Handle cases where memstats are not available for the device.
#739
lukebaumann
closed
5 days ago
0
Support eval dataset and refactor
#738
aireenmei
closed
1 week ago
0
Adding option for int4 quantization to kvcache.
#737
singh-mitali
closed
1 week ago
1
Support target masking (aka loss masking or label masking) for SFT datasets
#736
jmschndev
opened
1 week ago
0
Inconsistent code formatting
#735
jmschndev
opened
1 week ago
0
`hf_access_token` only effective for loading gated datasets, not gated tokenizers
#734
jmschndev
opened
1 week ago
0
Gemma 2 support
#733
borisdayma
opened
1 week ago
1
Fix and protect simple_layer
#732
gobbleturk
closed
1 week ago
0
Load/Save Aqt quantized checkpoint.
#731
singh-mitali
closed
1 week ago
0
Support partial overrides for logical_axis_rules.
#730
golechwierowicz
closed
3 hours ago
2
Allow owners to have any approver
#729
gobbleturk
closed
1 week ago
0
LLama3-70B model config
#728
khatwanimohit
opened
1 week ago
0
Prefill return first token
#727
jwyang-google
closed
1 week ago
0
Add Llama2 7B, 13B high performance training configs
#726
raymondzouu
closed
1 week ago
0
Enable loading and saving quantized decode checkpoint.
#725
singh-mitali
closed
1 week ago
4
Update the dependencies to prepare for integration of emergency checkpointing
#724
xuefgu
closed
2 weeks ago
0
Fix Mesh setup for multiprocess CPUs.
#723
RoshaniN
closed
2 weeks ago
0
Allow owners to have any approver
#722
gobbleturk
closed
1 week ago
0
Use dedicated requirements for GPUs
#721
xuefgu
closed
2 weeks ago
0
Enable saving using Orbax's emergency checkpoint manager
#720
xuefgu
closed
1 week ago
0
Add mistral tokenizer to maxtext/assets
#719
vipannalla
closed
2 weeks ago
0
Add mixtral tokenizer to maxtext/assets
#718
vipannalla
closed
2 weeks ago
0
Move Stage to second axis
#717
gobbleturk
closed
2 weeks ago
0
fix data loading from HF hub
#716
aireenmei
closed
2 weeks ago
0
Try a token_path relative to the base config path if it can't be found
#715
obrienadam
closed
2 weeks ago
0
Refactor permute and unpermute operations
#714
RissyRan
closed
1 week ago
0
Add maxengine_server configs to base.yml
#713
gobbleturk
closed
3 weeks ago
0
change the sweeping to prefill and ar cache only
#712
morgandu
closed
3 weeks ago
0
Eval on C4?
#711
tjingrant
closed
1 week ago
1
Explicitly specify tokenizer_path for pipeline tests
#710
gobbleturk
closed
3 weeks ago
0
Allow Different Compute Layout for Attention
#709
morgandu
closed
3 weeks ago
0