timoschick / pet

This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"
https://arxiv.org/abs/2001.07676
Apache License 2.0

Code for paper "Few-Shot Text Generation with Pattern-Exploiting Training" #34

Closed: nguyentthong closed this issue 3 years ago

nguyentthong commented 3 years ago

Dear @timoschick

I have read your paper "Few-Shot Text Generation with Pattern-Exploiting Training" and found it really interesting. Can you share with us the code you used to conduct the experiments in your paper?

timoschick commented 3 years ago

> Can you share with us the code you used to conduct the experiments in your paper?

We have used the exact code in this repository to conduct all of our experiments. Instructions on how to use the code to reproduce our results can be found in the README file. Let me know if you have any further questions on how to use the code or on how to reproduce a specific experiment :)

timoschick commented 3 years ago

Oh, I just realized that you were asking for a different paper - sorry! The text generation paper is currently under review; we'll release the source code once review is complete.

ghost commented 3 years ago

By when can I expect the source code? :)


timoschick commented 3 years ago

I can't make any promises because I'm still waiting for the results of the review and I'm really busy right now, but I guess "sometime next month" is a reasonable estimate :)

ghost commented 3 years ago

Great! Thanks. But is text generation with PET really that different from text classification? For classification, say review sentiment analysis, you use a pattern like this: [Review here] I found it <MASK>. Here the verbalizer would be {'+1': ['good'], '-1': ['bad']}, and the <MASK> is predicted as 'good' or 'bad'. Wouldn't text generation be similar? If you want to complete a sentence, your pattern would be even simpler: [Incomplete sentence] <MASK>. The verbalizer would be every word in the English language, each mapped to a label, and the model would pick the correct word from the verbalizer for the <MASK> token to complete the sentence... right?
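For concreteness, here is a minimal sketch of that pattern/verbalizer setup, written directly against Hugging Face transformers rather than pet's own classes; the model name, review text, and label words are illustrative assumptions:

```python
# Minimal pattern/verbalizer sketch -- NOT the pet library's API.
# Model, review, and label words are illustrative assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

review = "The plot was thin but the acting carried it."
# Pattern: [Review here] I found it <MASK>.
text = f"{review} I found it {tokenizer.mask_token}."
# Verbalizer: one label word per class.
verbalizer = {"+1": "good", "-1": "bad"}

inputs = tokenizer(text, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# Score each class by the logit of its verbalized word at the mask position.
scores = {
    label: logits[tokenizer.encode(" " + word, add_special_tokens=False)[0]].item()
    for label, word in verbalizer.items()
}
print(max(scores, key=scores.get))  # predicted label, e.g. '+1' or '-1'
```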

npbeck commented 3 years ago

> Great! Thanks. But is text generation with PET really that different from text classification? For classification, say review sentiment analysis, you use a pattern like this: [Review here] I found it <MASK>. Here the verbalizer would be {'+1': ['good'], '-1': ['bad']}, and the <MASK> is predicted as 'good' or 'bad'. Wouldn't text generation be similar? If you want to complete a sentence, your pattern would be even simpler: [Incomplete sentence] <MASK>. The verbalizer would be every word in the English language, each mapped to a label, and the model would pick the correct word from the verbalizer for the <MASK> token to complete the sentence... right?

@BleepLogger I am not sure whether you are aware of this, but Schick & Schütze discuss the differences between using PET for classification vs. text generation in a new paper, for which the code has not yet been released (but the paper is available): "Few-Shot Text Generation with Pattern-Exploiting Training"

It is probably best to just read it yourself (it's only 6 pages), but let me point out some misconceptions in the idea that the two are almost the same:

> If you want to complete a sentence, your pattern would be even simpler: [Incomplete sentence] <MASK>.

Text generation involves much more than filling a slot with one word. Summarizing a text, for example, does not work that way.

> The verbalizer would be every word in the English language, each mapped to a label

Every word in the English language? There is no finite set of words, so you would have to work with tokens, which do form a fixed vocabulary. But predicting a single token is not what text generation is about. Also, PET for text generation has no verbalizer at all.
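To make the contrast concrete, here is a minimal sketch of sequence generation with a pretrained encoder-decoder (PEGASUS, which the GenPET paper builds on); this is not GenPET's actual implementation, and the model name and input text are illustrative assumptions:

```python
# Minimal generation sketch -- NOT GenPET's implementation.
# Uses a Hugging Face summarization model as an assumed stand-in.
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-xsum")
model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")

document = "PET reformulates text classification tasks as cloze questions ..."
inputs = tokenizer(document, return_tensors="pt", truncation=True)

# The summary is an open-ended token sequence found by beam search over the
# full vocabulary at every step; there is no fixed label-to-word mapping.
summary_ids = model.generate(**inputs, num_beams=4, max_length=32)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```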

Hope you found that helpful! :) I encourage you to look at the paper!

ghost commented 3 years ago

@npbeck I found that VERY helpful! Thank you so much, I just finished reading the paper. Still waiting for the code to be released though...

ghost commented 3 years ago

@timoschick Been waiting long... Please release the code for text generation with PET...

timoschick commented 3 years ago

@thongnguyen050999 @BleepLogger @npbeck : You can now find the code for GenPET at this feature branch. It also contains a couple of new features that are not part of the arXiv preprint. Unfortunately, I haven't yet had time to update the README to explain how to use PET for text generation, and I'll only be able to do so in the first week of August (I'm on vacation next week), but feel free to experiment with the code already. Looking at the changes in cli.py should give you an idea of how to use GenPET. Note that this feature is still experimental, so let me know if anything does not work as expected.

timoschick commented 3 years ago

I'm closing this issue for now. If you encounter any specific problems with the implementation of GenPET, please create a new issue for it.