stanford-crfm/levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
https://levanter.readthedocs.io/en/latest/
Apache License 2.0
519 stars
82 forks
issues
add cycle_length
#825
dlwh
closed
22 hours ago
0
Bump fsspec
#824
dlwh
closed
1 day ago
0
rename maybe_fused_next_token_loss
#823
dlwh
closed
1 day ago
0
bump jax version
#822
dlwh
closed
1 day ago
0
missed some renames?
#821
dlwh
closed
1 day ago
0
move logging and types to util to make python's module resolution hap…
#820
dlwh
closed
1 day ago
0
hijack HF's download so it works with gcs etc.
#819
dlwh
closed
1 day ago
0
Optim config drop stable and add decay
#818
blahBlahhhJ
closed
3 days ago
0
Rollback optim fix
#817
blahBlahhhJ
closed
4 days ago
0
jit and batch supervised data loading to speed it up (a lot)
#816
dlwh
closed
5 days ago
0
fix for token bug that skips EOS
#815
ahmeda14960
closed
5 days ago
0
Update test_optimizer_config.py
#814
blahBlahhhJ
closed
5 days ago
0
Fix lr schedule
#813
blahBlahhhJ
closed
5 days ago
1
Adds Short-Context Qwen Support
#812
Helw150
closed
5 days ago
0
document WSD-S stuff, add cycles
#811
dlwh
closed
6 days ago
0
cache fixes
#810
dlwh
closed
6 days ago
0
Wsds redo
#809
dlwh
closed
1 week ago
0
Update transformers requirement from <4.46.0,>=4.41.2 to >=4.41.2,<4.47.0
#808
dependabot[bot]
closed
1 day ago
1
remove unnecessary assertion in tokenization
#807
dlwh
closed
1 week ago
0
deprecate LMSupervisedDataConfig, but keep it working for now
#806
dlwh
closed
1 week ago
0
Use haliax state dict
#805
dlwh
closed
1 week ago
0
support auto hsdp
#804
blahBlahhhJ
closed
1 week ago
0
Support multiple supervised evals, some cleanup around that
#803
dlwh
closed
1 week ago
0
Nit: typing
#802
jennifgcrl
closed
1 week ago
1
tracker.finish to deal with subprocess stuff
#801
dlwh
closed
1 week ago
0
tweaks: truncate after pad for supervised
#800
dlwh
closed
1 week ago
0
bulk delete using STS
#799
dlwh
closed
1 week ago
1
Misc fixes from sweep (disable blocked CE by default)
#798
dlwh
closed
1 week ago
1
pretty sure we just don't need scipy
#797
dlwh
closed
2 weeks ago
0
Relax restriction on scipy to be compatible with more recent numpy
#796
jennifgcrl
closed
2 weeks ago
2
Fix transformer-engine attention import
#795
jennifgcrl
closed
2 weeks ago
1
fix internal_eval lengths
#794
dlwh
closed
2 weeks ago
0
Revise SFT File
#793
ahmeda14960
closed
1 week ago
1
fix epochs in type signature, fix type checker
#792
dlwh
closed
2 weeks ago
0
prepare lora for state dict change
#791
dlwh
closed
2 weeks ago
0
Add "blocked"/"flash" cross entropy
#790
dlwh
closed
2 weeks ago
0
correct total byte calculation for bpb
#789
dlwh
closed
2 weeks ago
0
Internal eval fixes
#788
TheQuantumFractal
closed
2 weeks ago
0
Internal eval fixes
#787
TheQuantumFractal
closed
2 weeks ago
0
Internal eval fixes
#786
TheQuantumFractal
closed
2 weeks ago
0
fix wandb key
#785
dlwh
closed
2 weeks ago
0
Fix hf datasets for new version
#784
dlwh
closed
3 weeks ago
0
backport tokenize-independently-then-copy from marin
#783
dlwh
closed
2 weeks ago
1
Fix Llama 3 Tests
#782
Helw150
closed
2 weeks ago
1
Model checkpoints should store their model configs
#781
dlwh
opened
3 weeks ago
0
Initializing of models from checkpoints/pretrained models has gotten a bit crazy
#780
dlwh
opened
3 weeks ago
0
Merging DiVA to Levanter Main
#779
Helw150
opened
3 weeks ago
4
Epochs
#778
dlwh
closed
2 weeks ago
0
Support Tied Weights in Llama Models
#777
Helw150
closed
1 month ago
0
Infra tweaks
#776
dlwh
closed
1 month ago
0