Open yxuansu opened 2 years ago
Super cool!
Did you have to do any contrastive training for it to work? Or only the contrastive search?
Hi @stephenroller,
No contrastive training is required; contrastive search can be applied directly to off-the-shelf OPT models of all sizes!
@yxuansu to action on this issue from our end - would you be willing to add some documentation for how to get started with this within metaseq? Ideally we would include a script here that imports simctg to run through the above as an easy-to-get-started example.
Hi @suchenzang,
Sure! I will look into how to adapt contrastive search with the codebase of metaseq and get back to you ASAP.
Hi @suchenzang and @stephenroller,
Thanks for your effort on this incredible project!
1. Motivation:
I would just like to share that, for open-ended text generation, OPT models can produce very high-quality text with contrastive search [1].
2. Example from the original paper:
Take the conversation generation task from your paper (Figure 9 in Appendix E) as an example. Given the prefix text,
the text generated by OPT-175B, shown in Figure 9 of Appendix E, is as follows:
The model gets stuck in some simple and linguistically repetitive generations.
3. Generation with contrastive search:
Below, we show the result generated by OPT-6.7B (a much smaller model compared with OPT-175B) with contrastive search.
We see that the generated text with contrastive search is much more diverse and interesting.
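For readers who have not seen contrastive search before, here is a minimal toy sketch of a single decoding step. This thread does not spell out the scoring rule, so the sketch follows the formulation in [1] as I understand it: each top-k candidate is scored by its model probability minus a penalty for being too similar to what has already been generated. The function names (`cosine`, `contrastive_step`), the two-dimensional vectors, and the probabilities are all made up for illustration; this is not the code from the repo.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def contrastive_step(candidates, context_states, alpha=0.6):
    """Pick one token from the top-k candidates (hypothetical helper).

    candidates: list of (token, probability, hidden_state) tuples.
    context_states: hidden states of the tokens generated so far.
    alpha: weight of the degeneration penalty; alpha=0 reduces to greedy search.
    """
    best_token, best_score = None, -math.inf
    for token, prob, state in candidates:
        # Degeneration penalty: highest similarity to any earlier token state.
        penalty = max((cosine(state, h) for h in context_states), default=0.0)
        score = (1 - alpha) * prob - alpha * penalty
        if score > best_score:
            best_token, best_score = token, score
    return best_token


# Toy data (assumed, not from any model): two context states, two candidates.
context = [[1.0, 0.0], [0.9, 0.1]]
candidates = [
    ("America", 0.6, [1.0, 0.0]),  # more probable, but repeats the context
    ("freedom", 0.3, [0.0, 1.0]),  # less probable, but dissimilar to the context
]

print(contrastive_step(candidates, context, alpha=0.6))  # freedom
print(contrastive_step(candidates, context, alpha=0.0))  # America
```

With the penalty switched off the step picks the most probable token, which is what greedy search does; with the penalty on, a candidate that merely repeats the context loses to a less probable but novel one.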
[Comparison] For a more direct comparison, the generated results with the same OPT-6.7B using greedy search and nucleus sampling (p=0.95) are:
(1) Text generated by greedy search:
```
----------------------------------------------------------------------------------------------------
A chat between a curious human and the Statue of Liberty.
Human: What is your name?
Statue: I am the Statue of Liberty.
Human: Where do you live?
Statue: New York City.
Human: How long have you lived there?
Statue: I have lived here for over 100 years.
Human: What do you do?
Statue: I welcome people from all over the world to come to America.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human...
----------------------------------------------------------------------------------------------------
```
(2) Text generated by nucleus sampling (p=0.95):
```
----------------------------------------------------------------------------------------------------
A chat between a curious human and the Statue of Liberty.
Human: What is your name?
Statue: I am the Statue of Liberty.
Human: Where do you live?
Statue: New York City.
Human: How long have you lived there?
Statue: Since 1876.
Human: Why is the Statue of Liberty guarded?
Statue: Because there are many people trying to steal her.
a comparison about an unexpressed thought I would also share the story of “A Humble Fear.” At a conference in New York the Dalai Lama gave a speech to the International Thinkers Congress in New York. The whole thing was recorded, and the video is quite interesting. (on a side note, I love the fact that there were some people who laughed when he described himself as a humble being… I think the video is hilarious, there is a reason why I put up the video. Because if you cannot find the humor in this you’re sadly lacking…) In the speech, the Dalai Lama compares the search for truth to searching for treasure. He says: “However there is a huge difference between being a thief and a collector. A thief simply takes things, whereas a collector looks for the beauty, even if it is just a single object.” The above quote is perhaps the most cliched Buddhist philosophy of our times. However the comparison between a collector and a thief is quite interesting. I like to think that the Buddha...
----------------------------------------------------------------------------------------------------
```
4. Resources for replication:
For easy replication, we have open-sourced all the related code and made it easy to use. If you are interested, please find more details in our [repo] and [tutorial].
Acknowledgements:
Thank you again for making OPT accessible to the public!
Reference: