Prompt is getting repeated in response

I tried to retrain Llama-2 model. I just followed the steps you have mentioned. But when I am generating the text with the following code snippet -

prompt = "What is a large language model?"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f"[INST] {prompt} [/INST]")
print(result[0]['generated_text'])

I am getting a weird response as below -

[INST] What is a large language model? [/INST]
[INST] What is a large language model? [/INST]
[INST] What is a large language model? [INST]
[INST] What is a large language model? [INST]
[INST] What is a large language model? [INST] [INST] What is a large language model? [/INST]
[INST] What is a large language model? [INST] [INST] What is a large language model? [/INST] [INST] What is a large language model? [INST] [INST] What is a large language model? [/INST] [INST] What is a large language model? [INST] [INST] What is a large language model? [/INST] [INST] What is a large language model? [INST] [INST] What is a large language model? [/INST] [INST] What is a

What could be the issue?

mlabonne / llm-course

Prompt is getting repeated in response #2