Closed by jav-ed 1 year ago
This is expected. Since OpenLLaMA is a base model, you'll need to fine-tune it yourself to make it a chatbot that answers your questions. This is called instruction fine-tuning and is exactly what recent works like Alpaca, Vicuna, and Koala did.
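To make the distinction concrete, instruction-tuned models are trained on prompts wrapped in a fixed template. A minimal sketch of the Alpaca-style template (an illustration, not code from this repo) shows what such fine-tuning teaches the model to expect; a base model has never seen this structure, so it just continues the text instead of answering:

```python
# Sketch of the Alpaca-style instruction template (hypothetical helper,
# not part of this repository). Instruction fine-tuning trains the model
# to produce an answer after the "### Response:" marker.
def build_alpaca_prompt(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_alpaca_prompt("What is the capital of France?")
print(prompt)
```

A plain base model given this prompt will still only do next-token prediction; fine-tuning on many such (instruction, response) pairs is what turns completion into answering.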
It gives me 3 different questions, and only one of them is correct. Changing the code to the following gives the correct answer, but still 3 times. So maybe fine-tuning is not the only option?
```python
generation_output = model.generate(
    input_ids=input_ids,
    max_new_tokens=32,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(generation_output[0]))
```
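One reason you see several Q&A pairs is that a base model rarely emits EOS after the first answer; it keeps completing the pattern until `max_new_tokens` is exhausted. A workaround (my own sketch, not an official fix) is to post-process the decoded text and keep only the first line of the completion:

```python
# Hedged sketch: keep only the first generated answer by cutting the
# decoded text at the first newline after the prompt. `first_answer`
# is a hypothetical helper, not part of the transformers API.
def first_answer(decoded: str, prompt: str) -> str:
    completion = decoded[len(prompt):]           # drop the echoed prompt
    return completion.split("\n", 1)[0].strip()  # keep the first line only

# Example with a mocked decoder output in the Q/A pattern a base model produces:
decoded = "Q: What is 2+2?\nA: 4\nQ: What is 3+3?\nA: 6"
prompt = "Q: What is 2+2?\nA:"
print(first_answer(decoded, prompt))  # → 4
```

This only hides the extra generations; the model still spends tokens producing them, which is why instruction fine-tuning (or a custom `StoppingCriteria`) is the more principled fix.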
Take a look at the code that was used:
This is the result:
Why do I get 3 responses? Unfortunately, this keeps happening. How can I get only one answer?