The outputs of the infer.py is <s><s><s><s><s><s><s><s>...

weiyifan1023 / Neeko

Code and Data for EMNLP 2024 Paper "Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent"

https://aclanthology.org/2024.emnlp-main.697/

Apache License 2.0

101 stars 5 forks

The outputs of the infer.py is <s><s><s><s><s><s><s><s>... #7

Open Zc0812 opened 1 month ago

Zc0812 commented 1 month ago

First, I ran the code following the process in the repo and obtained roleembd.pth and the ckpt. However, the loss during training looks rather strange: it changes very little for most of training, then increases sharply in the last stage and drops to 0.
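For anyone trying to reproduce this, here is a minimal sketch for spotting the pattern in a logged loss curve. It assumes the per-step losses have been collected into a plain Python list. The helper name `find_loss_anomalies`, the `spike_ratio` threshold, and the sample values are hypothetical and not part of the Neeko repo.

```python
import math


def find_loss_anomalies(losses, spike_ratio=5.0):
    """Return (step, reason) pairs for suspicious loss values.

    `losses` is a hypothetical list of per-step training losses.
    `spike_ratio` is an arbitrary illustrative threshold, not a tuned value.
    """
    anomalies = []
    for step, loss in enumerate(losses):
        prev = losses[step - 1] if step > 0 else None
        if not math.isfinite(loss):
            # NaN or inf, e.g. from numerical overflow
            anomalies.append((step, "non-finite"))
        elif loss == 0.0:
            # A loss of exactly 0 is rarely a genuine result
            anomalies.append((step, "zero"))
        elif prev is not None and math.isfinite(prev) and prev > 0 \
                and loss > spike_ratio * prev:
            # Sudden jump relative to the previous step
            anomalies.append((step, "spike"))
    return anomalies


# Made-up values mimicking the curve described above:
# nearly flat, a sharp rise, then 0.
example_losses = [2.31, 2.30, 2.30, 2.29, 40.0, 0.0, 0.0]
print(find_loss_anomalies(example_losses))
```

A spike followed by a loss of exactly 0 can be a sign of numerical problems during training (for example overflow in half precision), but that is only one possible cause and is not confirmed for this repo.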

Second, at the infer.py stage, the outputs tensor is all 1 (<s>).
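A minimal sketch of how this symptom could be checked programmatically. It assumes token ID 1 is the `<s>` (beginning-of-sequence) token, as in LLaMA-style tokenizers. The helper name `is_all_token` and the sample batch are hypothetical and not taken from infer.py.

```python
def is_all_token(token_ids, token_id=1):
    """Return True if every generated ID in the batch equals `token_id`.

    `token_ids` is a list of lists of ints (one list per sequence).
    ID 1 is assumed to be <s>; check the actual tokenizer to confirm.
    """
    flat = [t for row in token_ids for t in row]
    return len(flat) > 0 and all(t == token_id for t in flat)


# Made-up batch of generated IDs; a PyTorch tensor can be converted
# to this shape with outputs.tolist().
degenerate = [[1, 1, 1, 1, 1, 1, 1, 1]]
normal = [[1, 15043, 3186, 2]]

print(is_all_token(degenerate))
print(is_all_token(normal))
```

If the check returns True for every prompt, the generation has collapsed to a single token, which suggests the problem lies in the trained weights or the way they are loaded rather than in any particular prompt.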

~~So how should I solve this problem? Should I retrain the model?~~

Zc0812 commented 1 month ago

Or could you upload the ckpt used in the paper?
