Thank you for your interest. Even without supervised data (i.e., zero-shot setting), our framework can learn prompts to achieve good performance using weak supervision signals as the reward function.
For example, we applied our framework to unsupervised text style transfer and achieved superior or competitive performance compared to a variety of training and prompting baselines (please see the screenshot below, which shows Table 4 of our arXiv PDF as of Sep 2nd, 2022).
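To make the "weak supervision signals as the reward function" idea concrete, here is a minimal sketch of what such a reward might look like for unsupervised style transfer. Both scorers below are toy stand-ins chosen for illustration, not the paper's actual components: a real setup would use a pretrained style classifier and a learned content-similarity metric.

```python
# Hypothetical sketch: a weak-supervision reward for unsupervised style
# transfer, combining target-style confidence with content preservation.
# No labeled (source, target) pairs are needed.

def style_score(text: str, target_style: str) -> float:
    # Stand-in for a pretrained style classifier's probability of the
    # target style; here, a simple keyword heuristic for illustration.
    positive_words = {"good", "great", "delicious"}
    hits = sum(w in positive_words for w in text.lower().split())
    return min(1.0, hits / 2) if target_style == "positive" else 0.0

def content_score(source: str, output: str) -> float:
    # Stand-in for a content-preservation metric (token Jaccard overlap).
    src, out = set(source.lower().split()), set(output.lower().split())
    return len(src & out) / max(1, len(src | out))

def reward(source: str, output: str, target_style: str,
           style_weight: float = 0.5) -> float:
    # Weighted combination of the two weak signals.
    return (style_weight * style_score(output, target_style)
            + (1 - style_weight) * content_score(source, output))
```

A prompt-learning loop would then optimize prompts so that the generations they induce maximize this reward, with no supervised pairs involved.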
Let us know if you have other questions. I'm closing this issue now because it's a clarification question.
Thanks for your reply; I can see the zero-shot performance on text style transfer. Sorry for being unclear: I'm curious about the zero-shot performance on text classification, which seems to be illustrated in the paper.
I'm not sure where we illustrated "zero-shot performance on text classification". Could you point to any specific section or figure?
In general, prompts learned by our method do transfer well across related datasets. For instance, we provide a learned prompt, `Absolutely VERY absolute VERY absolute`, in examples/few-shot-classification/README.md. This prompt was learned on the SST-2 dataset, but it transfers well to other binary sentiment classification datasets without seeing more training examples, e.g., see our test performance below:
Dataset | Accuracy (%) | Best Baseline Accuracy (%)
---|---|---
sst-2 | 92.7 | 89.1 |
yelp-2 | 95.0 | 93.2 |
mr | 88.0 | 86.6 |
cr | 89.6 | 87.4 |
Let us know if you have other questions.
Besides, for this type of transferred prompt, our framework also supports further training with hand-written examples, e.g., in the same setting as 16-shot. Also, in our preliminary 1-shot experiments, we did find several promising prompts (though possibly not better than 16-shot ones), as well as biased prompts that mostly overfit the training examples.
Could you please talk about the performance in the zero-shot setting?