chrisociepa/allamo
Simple, hackable, and fast implementation for training/finetuning medium-sized LLaMA-based models
MIT License · 152 stars · 15 forks
Issues
#16 · Add support for FSDP2 and (experimental) TP · chrisociepa · closed 1 month ago · 0 comments
#15 · Major refactor · chrisociepa · closed 2 months ago · 0 comments
#14 · Add log-probability metrics to DPO training · djstrong · closed 3 months ago · 0 comments
#13 · more dpo logs · djstrong · closed 3 months ago · 0 comments
#12 · dpo fix · djstrong · closed 3 months ago · 0 comments
#11 · Optimize SFT dataset packing: correct RoPE encoding and without cross-contamination · chrisociepa · closed 3 months ago · 0 comments
#10 · Add support for weighted token-level loss and adaptive learning rate during masked token instruction training · chrisociepa · closed 7 months ago · 0 comments
#9 · Bulk update with new features and improvements · chrisociepa · closed 1 year ago · 0 comments
#8 · Consider adding support for openlm-research/open_llama_7b and _3b · RDouglasSharp · closed 8 months ago · 1 comment
#7 · Import_llama_weights.py code bug · RDouglasSharp · closed 1 year ago · 1 comment
#6 · Sample json file mentioned in README.md is not part of repository · RDouglasSharp · closed 1 year ago · 1 comment
#5 · [feature request] usage of trained model in python script · phineas-pta · closed 1 year ago · 2 comments
#4 · text classification? · arpitest · closed 1 year ago · 2 comments
#3 · finetune for chatbot · wallon-ai · closed 1 year ago · 2 comments
#2 · Where to find all files mentioned? · qizzzh · closed 1 year ago · 1 comment
#1 · train.txt format · AAnirudh07 · closed 1 year ago · 2 comments