karpathy / llama2.c

Inference Llama 2 in one file of pure C
MIT License
17.22k stars, 2.05k forks

Code/script to reproduce val loss using the shared models #475

Open Alexey234432 opened 8 months ago

Alexey234432 commented 8 months ago

Hi,

Does anyone know of a script or code to reproduce the val loss using the provided "*.bin" models? I've tried it myself and can't reproduce the reported numbers.

Thank you.
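For reference, "val loss" for a language model usually means the mean next-token cross-entropy on held-out data. The sketch below shows only that computation in plain Python. It is an illustration under assumptions, not the repository's evaluation script: `mean_cross_entropy` is a hypothetical helper, and how to get logits out of a "*.bin" model, which validation split to use, and how many batches were averaged for the reported numbers are all left open by this thread.

```python
import math

def mean_cross_entropy(logits, targets):
    """Mean next-token cross-entropy in nats (hypothetical helper).

    logits:  list of per-position score lists, one float per vocab entry
    targets: list of correct next-token ids, one per position
    """
    total = 0.0
    for row, target in zip(logits, targets):
        # Numerically stable log-sum-exp over the vocabulary.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        # Negative log-probability of the correct token.
        total += log_z - row[target]
    return total / len(targets)

# Toy example with a 3-token vocabulary and 2 positions.
toy_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
toy_targets = [0, 2]
print(mean_cross_entropy(toy_logits, toy_targets))
```

Small differences in the data split, sequence length, or number of evaluation batches change this average, so those settings need to match before two loss values can be compared.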

DavidHerel commented 7 months ago

Same issue here.

Alexey234432 commented 7 months ago

In my case the loss values are slightly higher. Is it the same for you, @DavidHerel? For example, the reported 1.072 for the 15M model is 1.0833 in my case, and the reported 0.760 for the 110M model jumps to 0.8725.

Thank you
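To put the two gaps quoted above side by side, here is a small arithmetic sketch. The loss values are the ones from this thread. The `gap` helper is hypothetical, and the perplexity ratio assumes the losses are mean cross-entropy in nats, which the thread does not state.

```python
import math

# Loss values quoted in this thread.
reported = {"15M": 1.072, "110M": 0.760}
observed = {"15M": 1.0833, "110M": 0.8725}

def gap(reported_loss, observed_loss):
    """Return (absolute gap, relative gap, perplexity ratio)."""
    abs_gap = observed_loss - reported_loss
    rel_gap = abs_gap / reported_loss
    # exp(loss) is perplexity if the loss is in nats (assumption).
    ppl_ratio = math.exp(abs_gap)
    return abs_gap, rel_gap, ppl_ratio

for name in reported:
    a, r, p = gap(reported[name], observed[name])
    print(f"{name}: +{a:.4f} ({r:.1%} higher), perplexity ratio {p:.3f}")
```

The absolute gap is about 0.011 for the 15M model but about 0.11 for the 110M model, so the two discrepancies differ by roughly an order of magnitude.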

DavidHerel commented 7 months ago

Yeah, I think my results were similar to yours.

I did not tune lr, warmup, or dropout, so maybe a more extensive hyperparameter search would get us the reported results?