apple/axlearn
An Extensible Deep Learning Library
Apache License 2.0
1.88k stars, 269 forks
issues
Version Conflict Between Torch and JAX for NVIDIA cuDNN-cu12
#858
apivovarov
opened
3 hours ago
1
[Q] Resolving CUDA OOM Errors When Running run_tests.sh on GPUs
#857
apivovarov
opened
8 hours ago
0
[Bug Fix] Fix a nit issue for Jax rollback
#856
kelvin-zou
closed
1 day ago
0
[Bug fix] Jax rollback for TPU
#855
kelvin-zou
closed
3 days ago
0
Rollback to Jax 0.4.33
#854
kelvin-zou
closed
4 days ago
0
Implement ConvXDTranspose
#853
ds-hwang
closed
14 hours ago
2
Fix health check after https://github.com/apple/axlearn/pull/848
#852
hanzhi713
opened
4 days ago
0
improve GCS perf: Change resource limit to request
#851
samos123
opened
6 days ago
0
Add llama 3 tokenizer
#850
sychen52
opened
6 days ago
0
Fix RLHF slowdown in attention multi steps extend_step.
#849
ds-hwang
closed
6 days ago
1
Generalizes pre- and post-spmd setup with init modules.
#848
markblee
closed
6 days ago
0
Conv1D supports paddings.
#847
ds-hwang
closed
1 week ago
1
Ensure iterators are saved in per-process dir.
#846
markblee
closed
1 week ago
0
Optimize TPU Flash Attention (400x speed-up on 32k long context)
#845
ds-hwang
opened
1 week ago
0
Hardcode metadata.google.internal ip address to avoid transient DNS resolution issue
#844
Ethanlm
closed
1 week ago
0
Minor cleanup.
#843
markblee
closed
1 week ago
0
Fix the changed GQA behavior by #837.
#842
ds-hwang
closed
1 week ago
1
CAUSAL padding=(dilate_window - stride, stride - 1), not (dilate_window - dilate_stride, dilate_stride - 1)
#841
ds-hwang
closed
1 week ago
2
bug fix issues;)
#840
zashab
opened
1 week ago
0
Add Mamba2 and its Jamba variant
#839
berlino
opened
1 week ago
0
Implement custom `max_data_shard_degree` and `shard_threshold_bytes`
#838
hanzhi713
closed
1 week ago
0
Optimize MQA computation.
#837
ds-hwang
closed
1 week ago
0
Transformer extend_step supports multi steps generation (2/2).
#836
ds-hwang
closed
1 week ago
2
Remove stale moveaxis optimization in attention.
#835
ds-hwang
closed
1 week ago
1
Updates orbax and adds support for max save/restore concurrent gb.
#834
markblee
closed
1 week ago
0
A skeleton of shuffling tensorstore_specs for serialization/deserialization.
#833
ruomingp
closed
1 week ago
1
Implement sequence_mask().
#832
ds-hwang
closed
1 week ago
1
Transformer extend_step supports multi steps generation.
#831
ds-hwang
closed
1 week ago
2
Skip dst dir creation if no tf savables.
#830
markblee
closed
1 week ago
0
Removes legacy bias check for flash attention.
#829
markblee
closed
1 week ago
0
[GKE]: support priority class
#828
zhzhyi
closed
2 weeks ago
0
Add bf16 test to subsampler.
#827
ds-hwang
closed
2 weeks ago
1
Quantizer returns ids as int32, not float32.
#826
ds-hwang
closed
2 weeks ago
1
Remove cleanup on save.
#825
markblee
closed
2 weeks ago
0
Introduce `model_analysis.txt` in trainer.
#824
ds-hwang
closed
2 weeks ago
0
Quantizer does not return one-hot vectors.
#823
ds-hwang
closed
2 weeks ago
0
StackOverTime's partial frame is treated as a valid frame, similar to convolution padding.
#822
ds-hwang
closed
2 weeks ago
0
Configurable data_merger for StackedTransformerLayer
#821
qdavid1
closed
2 weeks ago
0
Integrate Orbax's emergency checkpoint.
#820
hanzhi713
opened
2 weeks ago
0
fix: Instantiate logits_modifier config in sample_decode
#819
changlan
closed
2 weeks ago
0
Speed up FA Backward pass in GPU via parallelizing sequence dimension
#818
kelvin-zou
closed
2 weeks ago
0
upgrade to jax 0.4.34
#817
matthew-e-hopkins
closed
2 weeks ago
0
Support job in-place update
#816
Ethanlm
closed
2 weeks ago
0
`check_numerics` doesn't work inside `repeat.py`
#815
ds-hwang
opened
2 weeks ago
0
Strict job names
#814
matthew-e-hopkins
closed
2 weeks ago
0
Enable autotune ram for axlearn dataset
#813
WeersProductions
closed
2 weeks ago
0
Fix flash attention layer test
#812
rhodes73
closed
2 weeks ago
0
Fix tfds autotune
#811
WeersProductions
closed
3 weeks ago
0
Fix Regression from https://github.com/apple/axlearn/pull/737
#810
jiarui-lu2
closed
3 weeks ago
0
[Bug fix] Update gpu flash attention after syntax change and fixed unit tests for flash attention
#809
kelvin-zou
closed
2 weeks ago
0