Open Nandakishore-Thekkadathu opened 3 weeks ago
Hey @Nandakishore-Thekkadathu, thank you for raising this issue 🙂
This is a hard one to debug because Llama 3 8B is a smaller model that likely doesn't have enough parameters to give useful results here. Any chance you can try a larger model, such as GPT-4 or Claude?
The testset that I generated using the Llama 3 8B Instruct model has problematic output: unnecessary phrases appear in the output alongside the questions. I have given an example below.
0: 'Here is a question that can be fully answered from the given context using the keyphrase "Employee Self Service":
1: 'Here is a rewritten version of the question:
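One possible stopgap, if switching models is not an option, is to post-process the generated questions and drop the leading filler sentence. The snippet below is only a rough sketch: `strip_preamble` is a hypothetical helper (not part of any library), and it assumes the filler is a single leading sentence that starts with "Here is" (or "Here's" / "Here are") and ends with a colon, with the actual question following it, as in the two examples above. Output that deviates from that shape is returned unchanged.

```python
import re

# Assumption: the unwanted preamble is one leading sentence that starts with
# "Here is" / "Here's" / "Here are" and ends at the first colon on that line.
_PREAMBLE = re.compile(r"^\s*here(?:'s| is| are)\b[^\n:]*:\s*", re.IGNORECASE)


def strip_preamble(text: str) -> str:
    """Remove one leading 'Here is ...:' preamble, if present.

    If nothing is left after stripping (the model emitted only the preamble),
    the original text is returned so that no row silently becomes empty.
    """
    cleaned = _PREAMBLE.sub("", text, count=1).strip()
    return cleaned or text.strip()
```

This would be applied to each generated question string before the testset is saved. It treats the symptom rather than the cause, so it is worth spot-checking the cleaned rows by hand.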