kipgparker / soft-prompt-tuning

MIT License

question on training loss #9

Open zluw1117 opened 2 years ago

zluw1117 commented 2 years ago

Thank you for sharing this great work. I am using similar code and added soft prompt tuning to the encoder. However, my training loss is very strange: I keep getting a training loss > 40, while the regular training loss in my case is small, usually on the order of 0.001. Did you have the same issue? Thanks 🙏

albertbn commented 1 year ago

Hey, I was wondering if that's really all it takes to train a model with a soft prompt, i.e. what the example.ipynb in this repo shows. Once I have the SoftEmbedding set via model.set_input_embeddings(s_wte) and I pad the input_ids and attention_mask, is it just regular training from then on? In other words, in the training/eval loop I simply pad the inputs and that's all?
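For what it's worth, here is a minimal sketch of what such a loop could look like, assuming the SoftEmbedding class from this repo and GPT-2 as in example.ipynb. `train_loader` is a placeholder DataLoader, and masking the soft-prompt positions out of the labels with -100 is my own assumption, not something the notebook does:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer
from soft_embedding import SoftEmbedding

n_tokens = 20
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# wrap the original token embedding with the soft prompt
s_wte = SoftEmbedding(model.get_input_embeddings(),
                      n_tokens=n_tokens,
                      initialize_from_vocab=True)
model.set_input_embeddings(s_wte)

# freeze the whole model, then unfreeze only the learned prompt
for param in model.parameters():
    param.requires_grad = False
s_wte.learned_embedding.requires_grad = True

optimizer = torch.optim.AdamW([s_wte.learned_embedding], lr=1e-3)

model.train()
for input_ids, attention_mask in train_loader:  # placeholder DataLoader
    bs = input_ids.size(0)
    # pad by n_tokens on the left: SoftEmbedding ignores the ids in that
    # region, so any valid token id works, but the attention_mask must be 1
    input_ids = torch.cat(
        [torch.full((bs, n_tokens), tokenizer.eos_token_id), input_ids], dim=1)
    attention_mask = torch.cat(
        [torch.ones(bs, n_tokens, dtype=attention_mask.dtype), attention_mask], dim=1)

    # don't compute the LM loss on the soft-prompt positions
    labels = input_ids.clone()
    labels[:, :n_tokens] = -100

    loss = model(input_ids=input_ids,
                 attention_mask=attention_mask,
                 labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```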

Can someone provide a brief example of the prep needed for fine-tuning (with the base model frozen), and of saving and reloading the trained SoftEmbedding?
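One way to persist just the learned prompt, again assuming the SoftEmbedding from this repo; saving the raw tensor with torch.save and the file name "soft_prompt.pt" are arbitrary choices for the sketch:

```python
import torch
from transformers import GPT2LMHeadModel
from soft_embedding import SoftEmbedding

# after training: save only the learned prompt, a single [n_tokens, n_embd] tensor
torch.save(s_wte.learned_embedding.detach().cpu(), "soft_prompt.pt")

# later / elsewhere: rebuild the wrapper around a fresh base model and restore the prompt
n_tokens = 20
model = GPT2LMHeadModel.from_pretrained("gpt2")
s_wte = SoftEmbedding(model.get_input_embeddings(),
                      n_tokens=n_tokens,
                      initialize_from_vocab=True)
with torch.no_grad():
    s_wte.learned_embedding.copy_(torch.load("soft_prompt.pt"))
model.set_input_embeddings(s_wte)
```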

If it's that simple, how come Hugging Face's PEFT implementation, even just for soft prompts, is so large and overwhelming?

BTW, skipping the padding (mentioned by someone else) doesn't work for me in example.ipynb. My HF transformers version is transformers==4.25.1.