facebookresearch / metaseq

Repo for external large-scale work
MIT License

OPT Can Generate Human-level Text With Contrastive Search #332

Open · yxuansu opened this issue 2 years ago

yxuansu commented 2 years ago

Hi @suchenzang and @stephenroller,

Thanks for your effort on this incredible project!

1. Motivation:

Just would like to share that, for open-ended text generation, OPT models can produce very high-quality text with contrastive search [1].

2. Example from the original paper:

Take the conversation generation task from your paper (Figure 9 in Appendix E) as an example. Given the prefix text,

A chat between a curious human and the Statue of Liberty.

Human: What is your name?
Statue: I am the Statue of Liberty.
Human: Where do you live?
Statue: New York City.
Human: How long have you lived there?

the text generated by OPT-175B, as shown in Figure 9 of Appendix E, is as follows:

A chat between a curious human and the Statue of Liberty.

Human: What is your name?
Statue: I am the Statue of Liberty.
Human: Where do you live?
Statue: New York City.
Human: How long have you lived there?
Statue: I have been there since 1886.
Human: What is your job?
Statue: I was built to welcome immigrants to the United States.
Human: What is your favorite food?
Statue: I like ice cream and apples.
Human: What is your favorite color?
Statue: I like red, white, and blue.
Human: What is your favorite animal?
Statue: I like dogs.
Human: What is your favorite song?

The model gets stuck in simple and linguistically repetitive generations.

3. Generation with contrastive search:

Below, we show the result generated by OPT-6.7B (a much smaller model than OPT-175B) with contrastive search.

A chat between a curious human and the Statue of Liberty.

Human: What is your name?
Statue: I am the Statue of Liberty.
Human: Where do you live?
Statue: New York City.
Human: How long have you lived there?
Statue: Since 1884.
Human: Why did you come to America?
Statue: I was given to the United States by France as a gift for helping the French during the Franco-Prussian War.
Human: What do you think of America?
Statue: I love it. It is the greatest country in the world.
Human: What’s the weather like in New York?
Statue: It is cold.
Human: Is it safe to walk around at night?
Statue: Yes. There are policemen everywhere.
Human: Do you have any children?
Statue: Not yet. My pedestal is empty.
Human: What would you like to say to people who want to immigrate to America?
Statue: Come on over. You will be happy here. We have everything you need.

We see that the generated text with contrastive search is much more diverse and interesting.
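
For reference, the snippet below is a minimal sketch of how contrastive search can be run on an off-the-shelf OPT checkpoint using the Hugging Face transformers implementation of contrastive search (available since v4.24 via `penalty_alpha`/`top_k` in `generate()`). This is not necessarily the exact script or hyperparameter setup used to produce the transcript above; k=4 and alpha=0.6 are illustrative values.

```python
# Minimal sketch: contrastive search with an off-the-shelf OPT checkpoint via
# Hugging Face transformers (>= 4.24). Swap in a smaller checkpoint such as
# "facebook/opt-1.3b" if GPU memory is limited.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "facebook/opt-6.7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16).to("cuda")
model.eval()

prefix = (
    "A chat between a curious human and the Statue of Liberty.\n\n"
    "Human: What is your name?\n"
    "Statue: I am the Statue of Liberty.\n"
    "Human: Where do you live?\n"
    "Statue: New York City.\n"
    "Human: How long have you lived there?\n"
)
inputs = tokenizer(prefix, return_tensors="pt").to("cuda")

# penalty_alpha > 0 together with top_k > 1 enables contrastive search in generate();
# k=4 and alpha=0.6 here are illustrative hyperparameters.
output = model.generate(**inputs, max_new_tokens=256, penalty_alpha=0.6, top_k=4)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```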

[Comparison] For a more direct comparison, here are the results generated by the same OPT-6.7B model with greedy search and with nucleus sampling (p=0.95):

(1) Text generated by greedy search:

```
A chat between a curious human and the Statue of Liberty.

Human: What is your name?
Statue: I am the Statue of Liberty.
Human: Where do you live?
Statue: New York City.
Human: How long have you lived there?
Statue: I have lived here for over 100 years.
Human: What do you do?
Statue: I welcome people from all over the world to come to America.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human: What do you think of America?
Statue: I love America.
Human: What do you think of immigrants?
Statue: I love immigrants.
Human...
```
(2) Text generated by nucleus sampling (p=0.95):

```
A chat between a curious human and the Statue of Liberty.

Human: What is your name?
Statue: I am the Statue of Liberty.
Human: Where do you live?
Statue: New York City.
Human: How long have you lived there?
Statue: Since 1876.
Human: Why is the Statue of Liberty guarded?
Statue: Because there are many people trying to steal her.

a comparison about an unexpressed thought

I would also share the story of “A Humble Fear.” At a conference in New York the Dalai Lama gave a speech to the International Thinkers Congress in New York. The whole thing was recorded, and the video is quite interesting. (on a side note, I love the fact that there were some people who laughed when he described himself as a humble being… I think the video is hilarious, there is a reason why I put up the video. Because if you cannot find the humor in this you’re sadly lacking…)

In the speech, the Dalai Lama compares the search for truth to searching for treasure. He says: “However there is a huge difference between being a thief and a collector. A thief simply takes things, whereas a collector looks for the beauty, even if it is just a single object.”

The above quote is perhaps the most cliched Buddhist philosophy of our times. However the comparison between a collector and a thief is quite interesting. I like to think that the Buddha...
```
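
The two baselines above can be reproduced (up to sampling noise) with the same setup as the contrastive search sketch; the snippet below reuses the `model`, `tokenizer`, and `inputs` defined there and is likewise only a sketch, not the exact script used for these transcripts.

```python
# Baselines for comparison, reusing `model`, `tokenizer`, and `inputs` from the
# contrastive search sketch above. Sampling output will of course vary per run.

# (1) Greedy search: deterministic, and prone to the repetition loop shown above.
greedy = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(greedy[0], skip_special_tokens=True))

# (2) Nucleus sampling with p=0.95 (top_k=0 disables top-k filtering).
nucleus = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.95, top_k=0)
print(tokenizer.decode(nucleus[0], skip_special_tokens=True))
```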

4. Resources for replication:

For easy replication, we have open-sourced all the related code and made it easy to use. If you are interested, please find more details in our [repo] and [tutorial].

Acknowledgements:

Thank you again for making OPT accessible to the public!

Reference:

[1] Su et al., "A Contrastive Framework for Neural Text Generation," NeurIPS 2022. [paper link]

stephenroller commented 2 years ago

Super cool!

stephenroller commented 2 years ago

Did you have to do any contrastive training for it to work? Or only the contrastive search?

yxuansu commented 2 years ago

Did you have to do any contrastive training for it to work? Or only the contrastive search?

Hi @stephenroller,

No contrastive training is required; contrastive search can be applied directly to off-the-shelf OPT models of all sizes!

suchenzang commented 2 years ago

@yxuansu To take action on this issue from our end: would you be willing to add some documentation on how to get started with this within metaseq? Ideally we would include a script here that imports simctg to run through the above as an easy-to-get-started example.

yxuansu commented 2 years ago

Hi @suchenzang,

Sure! I will look into how to adapt contrastive search to the metaseq codebase and get back to you ASAP.