Following the official DeepSeek README, from the second round onward the loss is

datawhalechina / self-llm

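《开源大模型食用指南》 (Open-Source LLM Usage Guide): quickly deploy open-source large models on Linux; a deployment tutorial better suited to Chinese learners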

Apache License 2.0

6.51k stars, 798 forks

Following the official DeepSeek README, from the second round onward the loss is #63

Closed liuyongjie985 closed 3 months ago

liuyongjie985 commented 4 months ago

See the screenshot below.

liuyongjie985 commented 4 months ago

[Screenshot: 20240317-001054]

KMnO4-zx commented 4 months ago

Is this from a tutorial in this repository? If it is code from the official DeepSeek repository, please open an issue there instead.

liuyongjie985 commented 4 months ago

Yes, it is: https://github.com/datawhalechina/self-llm/blob/master/DeepSeek/04-DeepSeek-7B-chat%20Lora%20%E5%BE%AE%E8%B0%83.ipynb

liuyongjie985 commented 4 months ago

Bro, are you still up? If so, could you try running through the tutorial? I went over the code and couldn't find anything wrong.

liuyongjie985 commented 4 months ago

abc

liuyongjie985 commented 4 months ago

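Bro, I got it working kaf. For DeepSeek I had been using the official transformers version. I just remembered that the tutorial seems to mention reinstalling transformers, so I tried that and now training looks normal. My earlier inference runs used transformers 37, which worries me a bit, so I'll rerun inference tomorrow and check the results. Seeing how fast you replied really lifted my mood. Much respect, and all the best.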

KMnO4-zx commented 4 months ago

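OK. A different dataset, a different environment, or different package versions can all cause this, and the learning rate could be a factor too. It's best to use the same environment as the tutorial, since that makes problems much easier to pin down. Haha, good luck with your studies!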
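
For anyone hitting the same symptom, a quick way to rule out the version mismatch discussed above is to compare the installed packages against the versions the tutorial pins. The sketch below is only an illustration: the `EXPECTED` table and the `check_versions` helper are hypothetical, and the `0.0.0` values are placeholders, not the tutorial's real pins. Replace them with the versions listed in the notebook.

```python
from importlib.metadata import version, PackageNotFoundError

# Hypothetical pins: these are placeholders, NOT the tutorial's real versions.
# Replace them with the versions listed in the tutorial's install step.
EXPECTED = {"transformers": "0.0.0", "peft": "0.0.0"}


def check_versions(expected, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = get_version(name)
        except PackageNotFoundError:
            have = None  # package is not installed at all
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in check_versions(EXPECTED).items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

Running this in the training environment before fine-tuning, and again in the inference environment, should show whether the two differ from the tutorial or from each other.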