Loss stays at 0 when fine-tuning with train.sh from ptuning

mularo commented 11 months ago

Is there an existing issue for this?

[X] I have searched the existing issues

Current Behavior

Training starts fine with `bash train.sh`, but the loss stays at 0. (The attached screenshot did not finish uploading.)

Expected Behavior

No response

Steps To Reproduce

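Fine-tune on a single RTX 4090 using train.sh from the ptuning folder. I lowered the learning rate to 1e-6. Whether I set fp16 or quantization_bit 8, the loss stays at 0 throughout training.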
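For anyone trying to confirm the same symptom from a saved training log, here is a minimal sketch. It assumes the log contains dict-shaped lines such as `{'loss': 0.0, 'learning_rate': 1e-06, 'epoch': 0.01}`; that format and the helper names `extract_losses` and `diagnose` are illustrative assumptions, not taken from the report above.

```python
import ast


def extract_losses(log_lines):
    """Collect the 'loss' value from every dict-shaped log line (assumed format)."""
    losses = []
    for line in log_lines:
        line = line.strip()
        if not (line.startswith("{") and line.endswith("}")):
            continue  # skip progress bars and other non-dict lines
        try:
            record = ast.literal_eval(line)
        except (ValueError, SyntaxError):
            continue  # skip lines that are not valid Python literals
        if isinstance(record, dict) and "loss" in record:
            losses.append(float(record["loss"]))
    return losses


def diagnose(losses):
    """Return a short label describing the logged loss values."""
    if not losses:
        return "no loss values found"
    if all(value == 0.0 for value in losses):
        return "loss is always 0"
    return "ok"


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        "{'loss': 0.0, 'learning_rate': 1e-06, 'epoch': 0.01}",
        "  1%|          | 10/3000 [00:12<59:40,  1.20s/it]",
        "{'loss': 0.0, 'learning_rate': 9.9e-07, 'epoch': 0.02}",
    ]
    print(diagnose(extract_losses(sample)))
```

A result of "loss is always 0" only confirms the symptom; it does not say what causes it.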

Environment

- OS: Ubuntu 20.04
- Python: 3.9
- Transformers: 4.30.2
- PyTorch: 2.0.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True (CUDA version 11.8)
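Since the operating system turns out to matter in this thread, a small script can gather the fields above in one step so two setups can be compared. This is a sketch, not part of the repository; the function name `collect_env` is hypothetical, and it reports a package as "not installed" if the import fails.

```python
import platform


def collect_env():
    """Return the issue-template environment fields as a dict."""
    info = {
        "OS": f"{platform.system()} {platform.release()}",
        "Python": platform.python_version(),
    }
    try:
        import transformers
        info["Transformers"] = transformers.__version__
    except ImportError:
        info["Transformers"] = "not installed"
    try:
        import torch
        info["PyTorch"] = torch.__version__
        info["CUDA available"] = torch.cuda.is_available()
        info["CUDA version"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        info["PyTorch"] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"- {key}: {value}")
```

Running it on both machines and diffing the output is a quick way to see which field differs.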

Anything else?

No response

huilong-chen commented 9 months ago

Did you ever find the cause?

mularo commented 8 months ago

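Yes, it's solved. It was a while ago, so it took me a moment to remember how. I had originally set everything up and fine-tuned on Windows 11; after I reconfigured the environment on Ubuntu 20.04 and fine-tuned there, the problem went away.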

yuhp-zts commented 7 months ago

Could you share what the new environment looks like?

THUDM / ChatGLM-6B