Open qsunyuan opened 4 weeks ago
Sounds interesting, let's see whether this is requested by the community! We usually gauge activity here 🚀 cc @Rocketknight1
Hmm, a simple solution would be to replicate the input n times:
output = pipeline([input_chat] * n)
However, the text generation pipeline only handles a single input at a time, so this is basically the same as using a for loop. We'd need to refactor the pipeline significantly to make this efficient, although I think you can do it efficiently with lower-level generate() calls!
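A minimal sketch of the lower-level approach mentioned above, assuming any causal LM checkpoint works; the tiny test model name here is an assumption chosen only to keep the example lightweight:

```python
# Sketch: generate n completions in ONE batched generate() call instead of
# looping over the pipeline n times. "sshleifer/tiny-gpt2" is an assumption,
# picked only because it is a tiny test checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sshleifer/tiny-gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello, world", return_tensors="pt")

# num_return_sequences=n samples n sequences from a single call; sampling
# must be enabled, otherwise greedy decoding would return n identical outputs.
outputs = model.generate(
    **inputs,
    do_sample=True,
    num_return_sequences=3,
    max_new_tokens=10,
    pad_token_id=tokenizer.eos_token_id,
)
texts = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
print(len(texts))  # 3 completions from one generate() call
```

This batches the n samples through the model together, which is closer to what the OpenAI API's `n` parameter does server-side than a Python-level loop.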
I am so utterly confused right now. Isn't the solution just
pipeline([inputs], num_return_sequences=n)
or am I missing something?
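For reference, a hedged sketch of that suggestion; the tiny model name is an assumption used only to keep the example small:

```python
# Sketch: the text-generation pipeline forwards generation kwargs, so
# num_return_sequences can request n completions per prompt directly.
# "sshleifer/tiny-gpt2" is an assumption (a tiny test checkpoint).
from transformers import pipeline

generator = pipeline("text-generation", model="sshleifer/tiny-gpt2")
results = generator(
    "Hello, world",
    num_return_sequences=3,
    do_sample=True,  # sampling is required for multiple distinct sequences
    max_new_tokens=10,
)
print(len(results))  # a list of 3 dicts, each with a "generated_text" key
```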
Feature request
I would like to ask whether there is a way to perform iterative generation (n times per input) within the pipeline, specifically for models like LLMs. If this feature is not available, are there any plans to implement it in the future?
Example: similar to the `n` parameter in the GPT API
I am aware that iterative generation can be done with a for loop, but I am wondering whether there is a more efficient or optimized way to generate multiple completions (n times) within the pipeline.
https://community.openai.com/t/how-does-n-parameter-work-in-chat-completions/288725
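For concreteness, a sketch of the for-loop workaround the request refers to, assuming a tiny test checkpoint (the model name is an assumption):

```python
# Sketch: the naive workaround -- call the pipeline n times and collect the
# outputs. Correct, but it runs n separate generation passes rather than one
# batched call. "sshleifer/tiny-gpt2" is an assumption (tiny test model).
from transformers import pipeline

generator = pipeline("text-generation", model="sshleifer/tiny-gpt2")
n = 3
completions = [
    generator("Hello, world", do_sample=True, max_new_tokens=10)[0]["generated_text"]
    for _ in range(n)
]
print(len(completions))  # 3 completions, one per pipeline call
```

The feature request is essentially asking for the pipeline to absorb this loop into a single batched call, the way the OpenAI API's `n` parameter does.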
Motivation
Build a connection between the LLM API and the transformers pipeline.
Your contribution
Request