weiyifan1023 / Neeko

Code and Data for EMNLP 2024 Paper "Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent"
https://aclanthology.org/2024.emnlp-main.697/
Apache License 2.0

The output of infer.py is <s><s><s><s><s><s><s><s>... #7

Open Zc0812 opened 1 month ago

Zc0812 commented 1 month ago

First, I ran the code following the process in the repo and obtained roleembd.pth and the ckpt. However, the loss during training looks rather strange: it changes very little for most of training, then increases sharply in the last stage and drops to 0. (Screenshot attached.)

Second, in the infer.py stage, the outputs tensor is all 1 (<s>). (Screenshots attached.)
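For anyone trying to reproduce this, the symptom can be checked programmatically. Below is a minimal sketch in plain Python, not code from the repo. It assumes the BOS token `<s>` has id 1 (which matches the all-1 tensor above) and that the generated ids and checkpoint weights have first been converted to Python lists, e.g. with `.tolist()`. The helper names `is_degenerate_output` and `count_non_finite` are hypothetical. The second helper is included because a loss that spikes and then collapses to 0 is sometimes a sign of NaN/Inf weights; that is only one possible cause and is not confirmed in this thread.

```python
import math
from typing import Iterable, Sequence


def is_degenerate_output(token_ids: Sequence[int], bos_id: int = 1) -> bool:
    """Return True when every generated token equals the BOS id.

    bos_id=1 is an assumption based on the all-1 tensor in this issue.
    """
    return len(token_ids) > 0 and all(t == bos_id for t in token_ids)


def count_non_finite(values: Iterable[float]) -> int:
    """Count NaN or infinite entries in a flat iterable of weights."""
    return sum(1 for v in values if not math.isfinite(v))


# Hypothetical generated ids, e.g. obtained from outputs[0].tolist().
sample_ids = [1, 1, 1, 1, 1, 1, 1, 1]
print(is_degenerate_output(sample_ids))

# Hypothetical flattened weights, e.g. from param.flatten().tolist().
sample_weights = [0.12, float("nan"), -0.5, float("inf")]
print(count_non_finite(sample_weights))
```

If the weight check reports non-finite values for the saved checkpoint, the problem would lie in training rather than in infer.py.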

So how should I solve this problem? Should I retrain the model?

Zc0812 commented 1 month ago

Or could you upload the ckpt used in the paper?