Open xiaohy9 opened 3 years ago
The max_length and min_length are in terms of tokens, not words. As some words consist of multiple tokens, this results in fewer words being generated than you might expect.
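To make the token/word distinction concrete, here is a minimal sketch of subword splitting. The vocabulary and the greedy longest-match rule are assumptions made up for illustration; real tokenizers (BPE, SentencePiece, etc.) learn their vocabulary from data and will split words differently.

```python
# Toy vocabulary (hypothetical; not taken from any real model).
TOY_VOCAB = {"token", "ization", "un", "believ", "able", "the", "cat"}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Greedy longest-match split of one word into subword pieces."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # Nothing in the vocabulary matches: emit a single character.
            pieces.append(word[start])
            start += 1
    return pieces


def count_words_and_tokens(text):
    """Return (whitespace-separated word count, toy token count)."""
    words = text.split()
    tokens = [piece for word in words for piece in toy_tokenize(word)]
    return len(words), len(tokens)


sentence = "the cat tokenization unbelievable"
print(toy_tokenize("unbelievable"))      # ['un', 'believ', 'able']
print(count_words_and_tokens(sentence))  # (4, 7)
```

Four words become seven tokens here, so a limit of seven tokens would only buy four words.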
@NielsRogge Thanks for the answer. It makes sense. But when do words consist of multiple tokens? Can you give me some examples?
Also, would it be better for the arguments (max_length, min_length) to refer to the number of words instead of tokens, to better control the outputs, which are natural language meant for humans?
Running into a similar issue when using generator = pipeline('text-generation', model='EleutherAI/gpt-neo-2.7B'). I can get better control when using min_length=.., max_length=.. but I have no ultimate control when e.g. querying for "Below is the code for a react app with a blue button that says 'click me'":
[{'generated_text': "Below is the code for a react app with a blue button that says 'click me' that is to be used by react-router. \nimport React, { Component } from 'react';\n\nimport { Link } from 'react"}]
My result is "cut off" and I would be very happy to set a desired length of resulting words.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Stalebots are so much an anti-quality thing :-/
Same issue for me, anyone found a solution regarding this?
Stalebots are so much an anti-quality measure and have not been fixed
cc @patil-suraj @patrickvonplaten
@chris-aeviator - do you want to have exactly max_length words? In this case you have to disable the eos_token_id => you should be able to just do model.generate(...., eos_token_id=None)
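A rough sketch of why output can stop short of max_length, and why disabling the end-of-sequence token changes that. This is a simplified toy loop, not the library's actual generate() implementation; the function names, token ids, and the fake model are all made up for illustration, and whether eos_token_id=None behaves this way in a given transformers version is not verified here.

```python
EOS = 0  # hypothetical end-of-sequence token id for this toy example


def toy_generate(next_token, max_length, min_length=0, eos_token_id=EOS):
    """Simplified stopping logic for a token-by-token decoding loop."""
    tokens = []
    while len(tokens) < max_length:
        token = next_token(len(tokens))
        if token == eos_token_id and len(tokens) < min_length:
            # Below min_length the end-of-sequence token is suppressed;
            # this toy simply substitutes an arbitrary non-EOS token.
            token = 1
        tokens.append(token)
        if eos_token_id is not None and token == eos_token_id:
            break  # the model ended the sequence before max_length
    return tokens


def fake_model(step):
    """Hypothetical model that wants to stop from step 5 onwards."""
    return EOS if step >= 5 else 7


print(len(toy_generate(fake_model, max_length=20)))                     # 6
print(len(toy_generate(fake_model, max_length=20, min_length=8)))       # 9
print(len(toy_generate(fake_model, max_length=20, eos_token_id=None)))  # 20
```

In this toy, only the last call fills the whole budget, and that budget is still counted in tokens, not words.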
"When do words consist of multiple tokens? Can you give me some examples?"
Unsure in English. In French, some words are grammatically contractions of several words. For instance: "La fierté du pays" = "The pride of the country", where "du" is the contraction of "de le" (which literally means "of the"). So, one word ("du") for two tokens ("de le"). I guess you have a similar thing in English with: "I wanna go" = "I want to go" ("wanna" => 2 tokens)
I am using the pretrained ctrlsum-cnndm model from transformers. I noticed that the length of the summary is not exactly controlled by the max_length and min_length arguments of model.generate(). Not sure why. It appears that whitespace is counted, but I'm not sure. Please help. Thanks.
Results: max_length=100, min_length=50, actually 36 words
</s> The Eiffel Tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building. It is the tallest structure in Paris and the second tallest free-standing structure in France after the Millau Viaduct.</s>
max_length=200, min_length=100, actually 83 words
</s> The Eiffel Tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. It was the tallest man-made structure in the world for 41 years until the Chrysler Building in New York City was finished in 1930. It is the second tallest free-standing structure in France after the Millau Viaduct, which measures 125 metres (410 ft) on each side. The tower is now taller than the Chrysler building by 5.2 metres (17 ft)</s>
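For reference, the word count reported for the first result can be reproduced by stripping the special-token markers and splitting on whitespace. This is only a sketch: count_words is a hypothetical helper, and comparing against the token count would additionally require the model's own tokenizer, which is not shown here.

```python
def count_words(decoded, special_tokens=("<s>", "</s>")):
    """Count whitespace-separated words, ignoring special-token markers."""
    for marker in special_tokens:
        decoded = decoded.replace(marker, " ")
    return len(decoded.split())


# First result from above (max_length=100, min_length=50).
first_summary = (
    "</s> The Eiffel Tower is 324 metres (1,063 ft) tall, about the same "
    "height as an 81-storey building. It is the tallest structure in Paris "
    "and the second tallest free-standing structure in France after the "
    "Millau Viaduct.</s>"
)

print(count_words(first_summary))  # 36
```

The 36 words fall below min_length=50 because that limit is enforced on tokens, and numbers, punctuation, and hyphenated words typically cost more than one token each.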