Open DeekshithaDPrakash opened 1 week ago
I fine-tuned llama2-ko-7b with LoRA to answer questions based on a given context.
My training data was a JSONL file with multiple texts, each following this template:
`### Instruction:\n{question}\n\n### Input:\n{context}\n\n### Response:\n{answer}.`
The model was trained for 20 epochs, and I am trying to run inference on a Triton server.
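For reference, the inference prompt would normally follow the same template as the training texts, stopping right after the response header so the model produces the answer. A minimal sketch, assuming the template quoted above; `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names used only for illustration, and the sample question and context are made up:

```python
# Template taken from the training-data description; it ends at the
# response header because the answer is what the model should generate.
PROMPT_TEMPLATE = (
    "### Instruction:\n{question}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n"
)


def build_prompt(question: str, context: str) -> str:
    """Fill in the question and context (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(question=question, context=context)


# Illustrative inputs only, not taken from the real dataset.
prompt = build_prompt("What is LoRA?", "LoRA is a low-rank adaptation method.")
print(prompt)
```

Any mismatch between this string and the training texts (extra spaces, a missing newline) can change what the model generates, so it is worth checking the two character by character.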
I am facing an issue with the output text! ⬇️
After the first sentence, the output always continues with `\n\n\n`, `###`, or `Input`.
I tried the following request parameters: `"max_tokens": 30, "bad_words": ["\n\n###", "###"], "stop_words": ["\n\n###", ".", "!"], "pad_id": 2, "end_id": 2, "streaming": 1, "early_stopping": true, "temperature": 1.0, "top_k": 50, "top_p": 0.92, "no_repeat_ngram_size": 3, "eos_token_id": 2, "num_beams": 1, "do_sample": true`
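Example: `"text_output":"경관계획은 실시설계를 완료하기 전에 수립해야 합니다. \n\n##\n\n \t\n\n \t\n\n \t"` (the Korean sentence means "The landscape plan must be established before the detailed design is completed.")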
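To make the desired result concrete: the answer should end after the first sentence, without the trailing markers. A client-side stopgap, which only trims the returned text and does not change what the model generates, could look like the sketch below. The helper name `truncate_at_markers` and the marker list are assumptions chosen to match the example output above.

```python
def truncate_at_markers(text: str, markers=("\n\n", "###", "##")) -> str:
    """Cut text at the earliest marker and strip trailing whitespace.

    Hypothetical helper; the default markers are an assumption based on
    the unwanted tail seen in the example output.
    """
    cut = len(text)
    for marker in markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


# The example output from the server, with escapes decoded.
raw = "경관계획은 실시설계를 완료하기 전에 수립해야 합니다. \n\n##\n\n \t\n\n \t\n\n \t"
clean = truncate_at_markers(raw)
print(clean)
```

This still spends generation time on tokens that get thrown away, so it is a workaround rather than a fix.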
Q: How can I prevent this issue during inference?