Closed: YuXiangLin1234 closed 1 year ago
Hi
Thank you for your feedback. I found that it was somehow related to the optimizer, so I released a new version that enables configuring the optimizer settings.
Here is an example using flan-t5: https://colab.research.google.com/drive/1DYHt0mi6cyl8ZTMJEkMNpsSZCCvR4jM1?usp=sharing
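The optimizer settings mentioned above would typically be wired up along these lines. This is a minimal sketch, not the package's actual API: the `PPOTrainer(..., optimizer=...)` call in the comment is a hypothetical signature, and only the `torch.optim` part is standard.

```python
import torch

# Hypothetical stand-in for the policy LM's parameters; in practice
# these would come from the flan-t5 model being fine-tuned.
policy = torch.nn.Linear(8, 8)

# Build a custom optimizer with an explicit learning rate, so the
# PPO updates are not silently run with an unsuitable default.
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-5)

# Hypothetical: hand it to the trainer, e.g.
# trainer = PPOTrainer(model=policy_lm, optimizer=optimizer, ...)
print("lr:", optimizer.param_groups[0]["lr"])
```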
Thank you very much!
Hello,
I used this package to fine-tune a sequence-to-sequence LM, but the predictions after PPO training are always the same as the predictions before training.
What I tried was to modify the Colab sample code elon_musk_gpt.ipynb: change the model name and switch from AutoModelWithLMHead to AutoModelForSeq2SeqLM. When I print out the decoded sentences during training, I see that the predicted sentences change at each iteration, but the predictions after PPO training are always the same as the predictions before training. Is there anything I need to take care of? Or is this package not applicable to sequence-to-sequence LMs?
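One quick way to narrow this down is to check whether the PPO loop actually modified the policy weights at all. Below is a sketch of that check; the `torch.nn.Linear` and the single dummy gradient step are placeholders for the real seq2seq model and training loop, which are not shown in this thread.

```python
import copy
import torch

# Tiny stand-in for the seq2seq policy; in practice this would be the
# AutoModelForSeq2SeqLM instance handed to the PPO trainer.
model = torch.nn.Linear(4, 4)

# Snapshot the weights before training.
before = copy.deepcopy(model.state_dict())

# Placeholder for the PPO training loop: a single dummy gradient step
# so the comparison below has something to detect.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss = model(torch.ones(1, 4)).sum()
loss.backward()
optimizer.step()

# If every tensor is identical, the trainer never touched the policy
# weights, which would explain unchanged predictions after training.
changed = any(
    not torch.equal(before[name], param)
    for name, param in model.state_dict().items()
)
print("weights changed:", changed)
```

If this prints `False` against the real trainer, the optimizer is likely never stepping the policy, which matches the maintainer's finding that the problem was optimizer-related.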
Prediction before training:
Prediction during iteration:
Prediction after training: