Closed sayli-ds closed 10 months ago
Hi sayli-ds: We do not support this type of behavior by default. If you would like to output just the generated text, you can do something like the following:
```python
generated_sequences = [tokenizer.decode(seq[input_ids.shape[1]:]) for seq in generated_sequences]
```
Please let us know if this addresses the behavior you are looking for, and feel free to re-open the issue if you need more help.
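To make the suggestion above concrete, here is a minimal, self-contained sketch of how slicing off the prompt tokens works. `FakeTokenizer` and its vocabulary are hypothetical stand-ins for a real Hugging Face tokenizer, and token IDs are plain Python ints rather than tensors; with real tensors, `input_ids.shape[1]` plays the role of `prompt_len` here.

```python
class FakeTokenizer:
    """Toy stand-in for a Hugging Face tokenizer (hypothetical)."""
    vocab = {0: "cheese", 1: "=>", 2: "fromage"}

    def decode(self, ids):
        return " ".join(self.vocab[i] for i in ids)

tokenizer = FakeTokenizer()

prompt_ids = [0, 1]                # the encoded prompt: "cheese =>"
generated_sequences = [[0, 1, 2]]  # the model echoes the prompt, then continues

# Slice off the prompt so only the newly generated tokens are decoded.
prompt_len = len(prompt_ids)       # input_ids.shape[1] in the real code
completions = [tokenizer.decode(seq[prompt_len:]) for seq in generated_sequences]
print(completions)  # ['fromage']
```

The key point is that `sample` returns the full sequence (prompt plus continuation), so the prompt must be stripped in post-processing.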
```python
prompt = '''Translate English to French:
starfish => étoile de mer
campfire => feu de camp
snowflake => flocon de neige
dragonfly => libellule
maple tree => érable
thunderstorm => orage
seashell => coquillage
waterfall => cascade
hummingbird => colibri
pine cone => pomme de pin
lighthouse => phare
dandelion => pissenlit
cheese => '''
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# run inference
with torch.inference_mode():
    start = time.time()
    generated_sequences = neuron_model.sample(input_ids, temperature=0.1, sequence_length=200, top_p=0.9)
    elapsed = time.time() - start

generated_sequences = [tokenizer.decode(seq) for seq in generated_sequences]
print(f'generated sequences {generated_sequences} in {elapsed} seconds')
```
I am expecting the translation of "cheese" (fromage) as the output, but instead I am getting the entire prompt echoed back.

What is the Neuron equivalent of a parameter like `return_full_text=False`? This prompt works well in the Llama playground but not on Neuron. I don't want to generate paragraphs of output; I'm looking to use this for a text-extraction task.
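For an extraction task like this, one workable approach (a sketch, not an official Neuron feature) is to combine the prompt-slicing shown earlier in this thread with simple truncation at the first newline, so that any multi-line continuation the model produces is discarded. The decoded string below is simulated; a real run would come from `tokenizer.decode` on the sliced token IDs.

```python
def extract_completion(decoded: str) -> str:
    """Keep only the first line of the decoded continuation."""
    return decoded.split("\n", 1)[0].strip()

# Simulated decoder output: the answer, then the model rambling on.
decoded = " fromage\nstarfish => étoile de mer"
print(extract_completion(decoded))  # fromage
```

Lowering `sequence_length` (or stopping at an end-of-sequence token, if the model emits one) would also reduce how much extra text is generated in the first place.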