Closed by helleuch 10 months ago
`TextStreamer` is a very simple class from transformers with only one goal: printing out tokens as they are being generated. It is essentially separate from the generation process itself. You can run generation with `model.generate`, just like in plain transformers.
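As a minimal sketch of what this looks like (the tiny `sshleifer/tiny-gpt2` checkpoint here is just an assumption for illustration; any causal LM works):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_name = "sshleifer/tiny-gpt2"  # hypothetical choice; swap in your model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello, world", return_tensors="pt")

# The streamer prints each token to stdout as soon as it is generated;
# generate() itself runs exactly as it would without a streamer.
streamer = TextStreamer(tokenizer)
output_ids = model.generate(**inputs, streamer=streamer, max_new_tokens=10)
```

The only change versus ordinary generation is passing `streamer=` to `model.generate`.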
Only if you want multi-step generation (i.e. where the previous outputs are added as history to new prompts) should you use a manual generation loop that calls `model(input_ids, use_cache=True, past_key_values=past_key_values)`, extracts the `past_key_values` for the next inputs, and reads the logits to determine the generated token.
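A greedy-decoding sketch of that manual loop (model name and loop length are assumptions for illustration):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sshleifer/tiny-gpt2"  # hypothetical tiny model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

input_ids = tokenizer("Hello", return_tensors="pt").input_ids
past_key_values = None
generated = input_ids

with torch.no_grad():
    for _ in range(10):
        # First step feeds the full prompt; afterwards only the newest token,
        # since past_key_values already caches the earlier positions.
        step_input = generated if past_key_values is None else generated[:, -1:]
        out = model(step_input, use_cache=True, past_key_values=past_key_values)
        past_key_values = out.past_key_values
        # Greedy pick from the logits of the last position.
        next_token = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
        generated = torch.cat([generated, next_token], dim=-1)
```

Reusing `past_key_values` means each step only runs the model over the newest token rather than the whole history.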
Thank you very much !
Hello, thank you very much for making this work available. I would like to ask whether this works with `transformers.pipeline` or `model.generate`? Or do we have to use a `TextStreamer` as per the example?