kohjingyu / fromage

🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
https://jykoh.com/fromage
Apache License 2.0

How does generate work? #16

Closed zhaoshitian closed 1 year ago

zhaoshitian commented 1 year ago

In the model's generate method, I notice that you remove some tokens according to the top_p value. I don't understand why you remove the tokens with high logits. Could you point me to some material about this? I appreciate it so much!

kohjingyu commented 1 year ago

Hi, we are actually removing the tokens whose cumulative probability (after sorting tokens by probability in descending order) is > top_p, so it is the low-probability tail that gets dropped, not the high-logit tokens. This means that we only keep the top tokens whose probabilities sum to top_p. For example, if we set top_p = 0.9 and the first 10 tokens sum to 0.9, we discard every token after the 10th and sample only from the first 10.

This is standard nucleus sampling. Hope that makes sense!
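To make the idea concrete, here is a minimal sketch of nucleus (top-p) sampling in plain Python. It is not the code from this repository: the function names (softmax, top_p_filter, sample_top_p) are made up for illustration, and the choice to keep the token that first pushes the cumulative probability to top_p is an assumption about the boundary case.

```python
import math
import random


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Return the indices of the kept tokens (the "nucleus").

    Tokens are visited from most to least probable, and we stop as soon
    as the running sum of probabilities reaches top_p. Everything after
    that point (the low-probability tail) is discarded.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:  # assumption: the crossing token is kept
            break
    return kept


def sample_top_p(logits, top_p=0.9, rng=random):
    """Sample one token index from the nucleus of the distribution."""
    probs = softmax(logits)
    kept = top_p_filter(probs, top_p)
    # random.choices treats weights as relative, so the kept
    # probabilities are implicitly renormalized to sum to 1.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


# Toy distribution over 4 tokens: with top_p = 0.9 the last token
# (probability 0.05) falls outside the nucleus and is never sampled.
toy_probs = [0.5, 0.3, 0.15, 0.05]
toy_logits = [math.log(p) for p in toy_probs]
print(top_p_filter(toy_probs, 0.9))
print(sample_top_p(toy_logits, top_p=0.9))
```

Note that the tokens with the highest logits are exactly the ones that survive the filter; only the tail is removed before sampling.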

zhaoshitian commented 1 year ago

I understand! Thank you so much!!