deepseek-ai / DeepSeek-LLM

DeepSeek LLM: Let there be answers
https://chat.deepseek.com/
MIT License

Scaling laws data #42

Open borgr opened 7 months ago

borgr commented 7 months ago

I am researching scaling laws across models and architectures, among other things, and was wondering whether you could share the logs, training losses, and validation evaluations of the models you ran for the scaling law experiments in DeepSeek LLM. Any other similar losses or results would also be of interest. The data does not need to be well curated; anything would be helpful. Thanks!

borgr commented 7 months ago

Also, are the model losses shown in the figure available somewhere?