explodinggradients / ragas

Supercharge Your LLM Application Evaluations 🚀
https://docs.ragas.io
Apache License 2.0

Testset generation feedback #1568

Open tuan3w opened 1 month ago

tuan3w commented 1 month ago

Hello,

I would like to open this issue to discuss tips and guidance related to test set generation. If you don't think this is the right place, feel free to close this issue.

Recently, I tested Ragas 0.2 with some modifications and generated test sets for Vietnamese. My setup is very simple: I use PyPDFLoader to load PDF documents.

Here are some observations:

  1. It generates questions even for short, uninformative contexts. Perhaps some rule-based filters could help address this issue.


  2. I noticed that short, simple questions were generated even for long contexts. Tweaking the prompt slightly might improve the quality of the questions.


  3. I feel that the generated questions are too generic. Perhaps adding a global theme or topic could enhance the question generation process.
  4. Ultimately, I believe that having the ability to use custom prompts to control the question generation would be very beneficial.
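As a rough illustration of the rule-based filtering suggested in the first observation, here is a minimal sketch in plain Python. It is not part of Ragas; the function names and both thresholds are hypothetical and would need tuning for the documents at hand.

```python
# Hypothetical thresholds, chosen only for illustration.
MIN_WORDS = 40          # drop chunks with fewer whitespace-separated words
MIN_ALPHA_RATIO = 0.6   # drop chunks that are mostly digits or punctuation


def is_usable_context(text, min_words=MIN_WORDS, min_alpha_ratio=MIN_ALPHA_RATIO):
    """Return True if a chunk looks substantial enough to ask questions about."""
    stripped = text.strip()
    if not stripped:
        return False
    if len(stripped.split()) < min_words:
        return False
    # str.isalpha is Unicode-aware, so accented letters count as letters.
    non_space = [ch for ch in stripped if not ch.isspace()]
    alpha = sum(ch.isalpha() for ch in non_space)
    return alpha / len(non_space) >= min_alpha_ratio


def filter_contexts(chunks, **kwargs):
    """Keep only the chunks that pass is_usable_context."""
    return [chunk for chunk in chunks if is_usable_context(chunk, **kwargs)]
```

A filter like this would run on the text of the loaded pages or chunks before they are handed to test set generation, so headers, page numbers, and near-empty pages never reach the question prompt.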

Thanks

shahules786 commented 1 month ago

Hey @tuan3w, thanks for your feedback. Can you also share how many documents you used and describe the nature of the data? Some of these are documentation issues. For example, for item 4 you can already customise prompts using the set_prompts and get_prompts methods, as shown here, but this is not documented for testset generation. The others need a bit more work. Test generation v3 is very capable, but it still needs polishing.
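The get_prompts / set_prompts pattern mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: `StubPrompt`, `StubSynthesizer`, the prompt name `query_prompt`, and the helper `override_instruction` are all hypothetical stand-ins so the snippet runs on its own, and the keyword-argument form of `set_prompts` is assumed rather than taken from this thread. The actual prompt names and classes in Ragas may differ.

```python
from dataclasses import dataclass, field


@dataclass
class StubPrompt:
    """Hypothetical stand-in for a prompt object with an editable instruction."""
    instruction: str
    examples: list = field(default_factory=list)


class StubSynthesizer:
    """Hypothetical stand-in exposing only get_prompts and set_prompts."""

    def __init__(self):
        self._prompts = {
            "query_prompt": StubPrompt("Generate a question from the context.")
        }

    def get_prompts(self):
        # Return a name -> prompt mapping.
        return dict(self._prompts)

    def set_prompts(self, **prompts):
        # Assumed behaviour: reject names the component does not know.
        for name, prompt in prompts.items():
            if name not in self._prompts:
                raise ValueError(f"unknown prompt: {name}")
            self._prompts[name] = prompt


def override_instruction(component, prompt_name, new_instruction):
    """Read a prompt, change its instruction, write it back, and re-read it."""
    prompt = component.get_prompts()[prompt_name]
    prompt.instruction = new_instruction
    component.set_prompts(**{prompt_name: prompt})
    # Re-read so the caller can confirm the change was stored.
    return component.get_prompts()[prompt_name]


synth = StubSynthesizer()
updated = override_instruction(
    synth, "query_prompt", "Ask a detailed, specific question about the context."
)
print(updated.instruction)
```

With a real component, listing the keys returned by get_prompts first is a safer way to find the prompt names than guessing them.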

Also, I just created #1577 to track feedback on it. Feel free to join the conversation and post your feedback there.

ableiweiss commented 1 week ago

@shahules786 can you share an example of how this would be done for testset? I use set_prompts to modify the prompt for SpecificQuerySynthesizer and have verified that it changes the instruction and examples, but when I generate the testset, the change has no effect.