janhq / ichigo

Local realtime voice AI
Apache License 2.0

Training run: LoRA Ichigo Qwen2.5 32B #134

Open bachvudinh opened 4 days ago

bachvudinh commented 4 days ago

Goal

Methodology

Experiments

| Run ID | Date | Model Config | Dataset | Learning Rate | Batch Size | Steps | Loss | Hardware | MMLU | MMLU pro | Notes |
|---|---|---|---|---|---|---|---|---|---|---|---|
| exp1-pretrain | 2024-11-23 | Lora-256-512 | Pretrain v0.1 | 1.5e-4 | 384 | 6302 | 1.9 | ~100 hours on 8xA6000 | - | - | old dataset |
| exp1-sft | 2024-11-27 | Lora-256-512 | SFT | 3e-4 | 384 | 2500 | 1.2 | 20 hours on 8xA6000 | - | - | stop early to prepare next run |
| exp2-pretrain | 2024-11-26 | Lora-256-512 | Pretrain v0.2 | 1.5e-4 | 384 | 6302 | Updated soon | ~100 hours on 8xA6000 | - | - | new dataset v0.2 |
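
For comparing runs, the table's numbers can be turned into a few derived quantities (samples seen, GPU-hours, seconds per step). The sketch below is only back-of-the-envelope arithmetic on the values listed above. It assumes that "Batch Size" is the global (effective) batch size and that the hour figures are wall-clock time on all 8 GPUs; neither is stated in the issue. The `Run` class and its field names are hypothetical and not part of the Ichigo codebase.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One row of the experiments table (hypothetical helper, not Ichigo code)."""
    run_id: str
    batch_size: int    # ASSUMPTION: global/effective batch size
    steps: int
    wall_hours: float  # ASSUMPTION: approximate wall-clock hours from the Hardware column
    gpus: int = 8      # 8xA6000 per the table

    @property
    def samples_seen(self) -> int:
        # Training samples consumed = batch size * optimizer steps
        return self.batch_size * self.steps

    @property
    def gpu_hours(self) -> float:
        # Total compute budget in GPU-hours
        return self.wall_hours * self.gpus

    @property
    def seconds_per_step(self) -> float:
        # Average wall-clock seconds per optimizer step
        return self.wall_hours * 3600 / self.steps


RUNS = [
    Run("exp1-pretrain", batch_size=384, steps=6302, wall_hours=100),
    Run("exp1-sft", batch_size=384, steps=2500, wall_hours=20),
    Run("exp2-pretrain", batch_size=384, steps=6302, wall_hours=100),
]

if __name__ == "__main__":
    for run in RUNS:
        print(
            f"{run.run_id}: {run.samples_seen:,} samples, "
            f"{run.gpu_hours:.0f} GPU-hours, "
            f"{run.seconds_per_step:.1f} s/step"
        )
```

Since the pretrain durations are given as "~100 hours", the GPU-hour and seconds-per-step figures for those runs are rough estimates only.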

Learnings

Quicklinks

dan-homebrew commented 3 days ago

Running on 8 x A6000 in Taiwan