ai-forever / ru-prompts

https://ai-forever.github.io/ru-prompts
Apache License 2.0

Controlling Text Generation with Sentiment as Attribute #2

Open z00logist opened 1 year ago

z00logist commented 1 year ago

Hello! Thank you for such an amazing tool.

I am trying to apply it to my task: I want the model to generate texts conditioned on one of three sentiments: positive, neutral, or negative. I am not sure this is a text2text task, but it is not simple style transfer either... I tried preprocessing my DataFrame with a custom preprocessing function: [screenshot 2023-01-30 09 56 39]

The target field is a text that corresponds to the text column in my dataset (the dataset also has a target column, which holds the sentiment). The prompt has the format "<P15>{target}<P15>". I used the ruGPT3large model, but the results were not very good in terms of either language quality or attribute usage (although the text does differ across sentiments). Importantly, the loss does not decrease; it stays flat throughout training: [screenshot 2023-01-30 10 04 59]

What could be the problem here?
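A minimal sketch of the preprocessing described above, using plain dictionaries in place of a DataFrame. The sample rows, the `preprocess` and `fill_format` helpers, and the extra `sentiment` key are hypothetical; this is not the ru-prompts preprocessing API, only an illustration of how the dataset's text column ends up in the `target` field of the format string.

```python
# Hypothetical rows mirroring the dataset described in the post:
# "text" holds the text, "target" holds the sentiment label.
rows = [
    {"text": "The film was wonderful.", "target": "positive"},
    {"text": "The film was released in 2020.", "target": "neutral"},
]

PROMPT_FORMAT = "<P15>{target}<P15>"  # format string as quoted in the post


def preprocess(row):
    """Move the dataset's text column into the `target` key used by the format."""
    return {"target": row["text"], "sentiment": row["target"]}


def fill_format(prompt_format, item):
    """Substitute the item's text into the format; <P15> markers stay untouched."""
    return prompt_format.replace("{target}", item["target"])


items = [preprocess(row) for row in rows]
filled = [fill_format(PROMPT_FORMAT, item) for item in items]
print(filled[0])
```

Note that in this sketch the sentiment label is carried along in the item but never appears in the filled format, because the format string references only `{target}`.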

So my controllable text generation task probably does not suit text2text. However, your notebooks contain no preprocessing pipeline other than the one for text2text. I have read the article on Habr, where you used a 'text-generation-with-prompt' generation pipeline. How should I preprocess data for this task? Or should I use a completely different strategy?

konodyuk commented 1 year ago

Your preprocessing and prompt format are correct for the text-generation-with-prompt task. Try adding more trainable tokens or using the LSTM prompt provider (in that case, don't forget to decrease the learning rate to 1e-3 or 1e-4).
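To make the "more trainable tokens" suggestion concrete, here is a small sketch that counts the trainable tokens in a format string and rewrites it with a larger count. It assumes the `<P15>` marker syntax exactly as quoted in the question; `count_trainable_tokens` and `scale_trainable_tokens` are hypothetical helpers, not part of ru-prompts.

```python
import re

# Matches markers such as <P15>, capturing the token count.
TOKEN_MARKER = re.compile(r"<P(\d+)>")


def count_trainable_tokens(prompt_format):
    """Sum the counts of all <PN> markers in the format string."""
    return sum(int(n) for n in TOKEN_MARKER.findall(prompt_format))


def scale_trainable_tokens(prompt_format, factor):
    """Return a new format string with every <PN> count multiplied by `factor`."""
    return TOKEN_MARKER.sub(
        lambda m: f"<P{int(m.group(1)) * factor}>", prompt_format
    )


original = "<P15>{target}<P15>"
larger = scale_trainable_tokens(original, 2)

print(count_trainable_tokens(original))  # 30
print(larger)                            # <P30>{target}<P30>
print(count_trainable_tokens(larger))    # 60
```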

z00logist commented 1 year ago

Thank you for the reply, I will try your suggestions! I also want to ask: if the preprocessing and prompt format are correct, why do I have this issue? [screenshot 2023-01-30 14 41 22] With the text2text pipeline everything is okay and I can generate sequences. However, with basic text generation it is not possible (I tried passing plain text, and two keys, target and text, but it does not work).