[R-297] 🚨 Migration Help from v0.1 to v0.2

jjmachan commented 1 month ago

With v0.2 there have been some breaking changes. We will use this forum to help everyone have a smooth transition and clear up any questions 🙂.

This will also help us improve the migration docs.


alexander-zuev commented 1 month ago

@jjmachan again, awesome job, and congrats on the 0.2.0 release!

Couple of things:

There are missing md files under Customizations -> Testset Generation.
As a user of the previous testset generator, I couldn't find any guide on how to migrate testset generation from 0.1.* to 0.2.

jjmachan commented 1 month ago

Hey @Twist333d, thanks a lot for pointing out the missing link, will fix that 🙂. Right now we don't have it well documented: the link points to a 'talk-to-us' page because we were not sure how users want to customize it and wanted to help them on a one-off basis.

As for migration, could you share a bit more about how you're using testset generation today? I'll help you migrate it and then add it to the docs so that it helps others as well.

alexander-zuev commented 1 month ago

Thanks for the prompt reply @jjmachan! Got it :)

With regards to my own testset generation needs: I currently use it to generate smaller (10-50 question) datasets for my open-source RAG app. I am still in the initial stages of implementation and often re-create or re-run the dataset creation process, which is why I was curious about migration in general.

My previous flow:

generate a dataset
convert it to HF dataset
feed it sample by sample into evaluation loop

I've already migrated to 0.2, so it seems the current docs are sufficient, but having more guidance never hurts, I would say :)

wandana commented 3 weeks ago

Hi all, the TestsetGenerator works really well for our project, but we were impacted by the big changes in v0.2. In particular, we were using the distribution {simple: 1.0, reasoning: 0.0, multi-context: 0.0} for our dataset generation. It produced very good output and finished fast. I was unable to work out from the docs what the equivalent would be with the new v0.2 synthesizers. Can someone please help? My goal is to generate a dataset quickly and without a lot of LLM calls, since we need a fairly big testset (in the 1000s). The reasoning and multi-context samples took a long time to generate, and with the v0.1 API I was able to leave them out. What's the equivalent of that in v0.2?

A related question: is it possible to switch back to the v0.1 API temporarily? I tried pip uninstall followed by pip install ragas==0.1, but I get the import error "ImportError: cannot import name 'TesetGenerationEvent' from 'ragas._analytics'" on the line "from ragas.testset.generator import TestsetGenerator".
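When a downgrade ends in an import error like this, one quick check is to confirm which ragas version the interpreter actually sees. The snippet below is a minimal sketch that uses only the standard library; the helper names installed_version and is_v01 are made up for illustration and are not part of ragas.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_v01(version: Optional[str]) -> bool:
    """True when the version string belongs to the 0.1.x series."""
    return version is not None and version.split(".")[:2] == ["0", "1"]


if __name__ == "__main__":
    version = installed_version("ragas")
    print(f"ragas version seen by this interpreter: {version}")
    print(f"is 0.1.x: {is_v01(version)}")
```

If the reported version is not the one that was just installed, the install and the script are probably using different environments.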

jjmachan commented 3 weeks ago

Hey @wandana, for your use case I would recommend using SingleHopSpecificQuerySynthesizer(llm=llm), which is the equivalent in v0.2.

I would love to hop on a call with you to help with the transition and clear up any questions you might have. I would also recommend that you migrate, since we have a few new features (like state and caching) coming in the following weeks 🙂
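A rough sketch of how the v0.1 distribution could map onto the recommendation above, not an official migration recipe. Several parts are assumptions to verify against the installed v0.2.x release: that SingleHopSpecificQuerySynthesizer can be imported from ragas.testset.synthesizers, that generate_with_langchain_docs accepts a query_distribution list of (synthesizer, weight) pairs, and that generator is an already-configured TestsetGenerator. The helper build_query_distribution is hypothetical and only shows how zero-weight entries drop out.

```python
from typing import Any, Callable, Dict, List, Tuple


def build_query_distribution(
    weights: Dict[str, float],
    factories: Dict[str, Callable[[], Any]],
) -> List[Tuple[Any, float]]:
    """Turn name -> weight into (synthesizer, weight) pairs.

    Zero-weight entries are dropped and the rest are normalized to sum to 1.
    """
    active = {name: w for name, w in weights.items() if w > 0}
    total = sum(active.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [(factories[name](), w / total) for name, w in active.items()]


def generate_single_hop_testset(generator, docs, llm, testset_size: int = 10):
    """Generate a testset using only single-hop specific queries."""
    # Local import so the helper above works without ragas installed.
    # Assumption: this import path matches the installed ragas v0.2.x.
    from ragas.testset.synthesizers import SingleHopSpecificQuerySynthesizer

    distribution = build_query_distribution(
        {"single_hop_specific": 1.0},
        {"single_hop_specific": lambda: SingleHopSpecificQuerySynthesizer(llm=llm)},
    )
    # Assumption: `query_distribution` is the keyword used by the installed version.
    return generator.generate_with_langchain_docs(
        docs,
        testset_size=testset_size,
        query_distribution=distribution,
    )
```

Giving all of the weight to one synthesizer mirrors the old {simple: 1.0, reasoning: 0.0, multi-context: 0.0} setting. This sketch does not measure whether it also reduces the number of LLM calls.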

explodinggradients / ragas

[R-297] 🚨 Migration Help from v0.1 to v0.2 #1486