-
## Issue
I keep getting `nan` loss when training Llama-3.2-vision
I tried:
- gradient clipping
- lower learning rate
- higher batch size, LoRA rank, and alpha
But with no success.
## …
-
I am using two machines with 4 GPUs each. The command I run: accelerate launch --dynamo_backend no --machine_rank 0 --main_process_ip 192.168.68.249 --main_process_port 27828 --mixed_precision no --multi_gpu --num_machines 2 --num_processe…
-
Hi,
Thank you very much for sharing your implementation!
I encountered an issue, and since I am new to the Diffuser library, I was hoping you could guide me on how to run this code and how to g…
-
ReConfig uses the **RankLib.jar** library to re-rank the original predicted ranking list output by the rank-based method.
However, the results show that the learning-to-rank model cannot improve …
-
# Welcome
Get a feel for how issue creation can work! Whether it be using default issues, issue creation through projects, or using issue templates (fun fact: that's what this is) GitHub has option…
-
Hello! Thank you for your work!
I fine-tuned a model on databricks-dolly-15k using the settings from your script. When I tried to evaluate MMLU on OpenCompass, I found tha…