The data in 'data/prompt-gpt2-vocab' folder seems meaningless?

Thank you for your interest! These are placeholders for the prompt generator, and will be removed in future updates.

Specifically, the "LABEL_1" in train.source.positive is only used for text style transfer. It is the target style label, which means positive sentiment for Yelp and modern English for Shakespeare. In train.source.negative, you will see "LABEL_0", which serves the same purpose.

The repeated "issues" tokens in train.target.negative are likewise just placeholders, used for generating prompts with 5 tokens.
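If you want to confirm for yourself that a file only holds placeholders, a rough check like the one below may help. This is only a sketch, not part of the repo: `looks_like_placeholder` is a hypothetical helper, and the sample lines are made up to mirror the description above rather than copied from the actual files.

```python
def looks_like_placeholder(lines):
    """Return True if every whitespace-separated token in `lines` is the same token.

    Hypothetical helper for illustration; it is not part of rl-prompt.
    """
    tokens = {tok for line in lines for tok in line.split()}
    return len(tokens) == 1


# Made-up samples shaped like the files described above (assumed contents).
source_positive = ["LABEL_1", "LABEL_1", "LABEL_1"]
target_negative = ["issues issues issues issues issues"] * 3
real_text = ["the food was great", "service was slow"]

print(looks_like_placeholder(source_positive))
print(looks_like_placeholder(target_negative))
print(looks_like_placeholder(real_text))
```

To run it on an actual file, you could pass `open(path).read().splitlines()` as `lines`, where `path` points at whichever file you want to inspect.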

I hope this answers your question. I'm closing this now because it's a clarification question. If you have any specific questions about your use case, please feel free to create another issue with the details so we can assist.

mingkaid / rl-prompt, issue #4