stanford-crfm / levanter

Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax

https://levanter.readthedocs.io/en/latest/

Apache License 2.0

519 stars 82 forks

issues

add cycle_length

#825 dlwh closed 22 hours ago
0
Bump fsspec

#824 dlwh closed 1 day ago
0
rename maybe_fused_next_token_loss

#823 dlwh closed 1 day ago
0
bump jax version

#822 dlwh closed 1 day ago
0
missed some renames?

#821 dlwh closed 1 day ago
0
move logging and types to util to make python's module resolution hap…

#820 dlwh closed 1 day ago
0
hijack HF's download so it works with gcs etc.

#819 dlwh closed 1 day ago
0
Optim config drop stable and add decay

#818 blahBlahhhJ closed 3 days ago
0
Rollback optim fix

#817 blahBlahhhJ closed 4 days ago
0
jit and batch supervised data loading to speed it up (a lot)

#816 dlwh closed 5 days ago
0
fix for token bug that skips EOS

#815 ahmeda14960 closed 5 days ago
0
Update test_optimizer_config.py

#814 blahBlahhhJ closed 5 days ago
0
Fix lr schedule

#813 blahBlahhhJ closed 5 days ago
1
Adds Short-Context Qwen Support

#812 Helw150 closed 5 days ago
0
document WSD-S stuff, add cycles

#811 dlwh closed 6 days ago
0
cache fixes

#810 dlwh closed 6 days ago
0
Wsds redo

#809 dlwh closed 1 week ago
0
Update transformers requirement from <4.46.0,>=4.41.2 to >=4.41.2,<4.47.0

#808 dependabot[bot] closed 1 day ago
1
remove unnecessary assertion in tokenization

#807 dlwh closed 1 week ago
0
deprecate LMSupervisedDataConfig, but keep it working for now

#806 dlwh closed 1 week ago
0
Use haliax state dict

#805 dlwh closed 1 week ago
0
support auto hsdp

#804 blahBlahhhJ closed 1 week ago
0
Support multiple supervised evals, some cleanup around that

#803 dlwh closed 1 week ago
0
Nit: typing

#802 jennifgcrl closed 1 week ago
1
tracker.finish to deal with subprocess stuff

#801 dlwh closed 1 week ago
0
tweaks: truncate after pad for supervised

#800 dlwh closed 1 week ago
0
bulk delete using STS

#799 dlwh closed 1 week ago
1
Misc fixes from sweep (disable blocked CE by default)

#798 dlwh closed 1 week ago
1
pretty sure we just don't need scipy

#797 dlwh closed 2 weeks ago
0
Relax restriction on scipy to be compatible with more recent numpy

#796 jennifgcrl closed 2 weeks ago
2
Fix transformer-engine attention import

#795 jennifgcrl closed 2 weeks ago
1
fix internal_eval lengths

#794 dlwh closed 2 weeks ago
0
Revise SFT File

#793 ahmeda14960 closed 1 week ago
1
fix epochs in type signature, fix type checker

#792 dlwh closed 2 weeks ago
0
prepare lora for state dict change

#791 dlwh closed 2 weeks ago
0
Add "blocked"/"flash" cross entropy

#790 dlwh closed 2 weeks ago
0
correct total byte calculation for bpb

#789 dlwh closed 2 weeks ago
0
Internal eval fixes

#788 TheQuantumFractal closed 2 weeks ago
0
Internal eval fixes

#787 TheQuantumFractal closed 2 weeks ago
0
Internal eval fixes

#786 TheQuantumFractal closed 2 weeks ago
0
fix wandb key

#785 dlwh closed 2 weeks ago
0
Fix hf datasets for new version

#784 dlwh closed 3 weeks ago
0
backport tokenize-independently-then-copy from marin

#783 dlwh closed 2 weeks ago
1
Fix Llama 3 Tests

#782 Helw150 closed 2 weeks ago
1
Model checkpoints should store their model configs

#781 dlwh opened 3 weeks ago
0
Initializing of models from checkpoints/pretrained models has gotten a bit crazy

#780 dlwh opened 3 weeks ago
0
Merging DiVA to Levanter Main

#779 Helw150 opened 3 weeks ago
4
Epochs

#778 dlwh closed 2 weeks ago
0
Support Tied Weights in Llama Models

#777 Helw150 closed 1 month ago
0
Infra tweaks

#776 dlwh closed 1 month ago
0