h2oai / h2ogpt

Private chat with local GPT with documents, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
http://h2o.ai
Apache License 2.0
11.29k stars 1.24k forks

A way to implement RAG evaluation using Ragas or TruLens #1443

Open SkaDS23 opened 7 months ago

SkaDS23 commented 7 months ago

Hello,

I'm currently looking for a way to evaluate the RAG pipeline. Is there any suggested approach using Ragas or TruLens (directly in the user interface, or even as a test in the CLI), especially for someone not very familiar with the codebase? Is there any pre-built solution like this in h2oGPT?

Thank you

pseudotensor commented 7 months ago

There's the main gen.py, but also cli.py for CLI use and eval.py for evaluation (--gradio=False) against a given collection (pre-defined, or created at runtime the first time with --user_path=user_path --langchain_mode=UserData).

For eval, --verifier_model can be set to host a Prometheus model or another model: https://github.com/kaistAI/prometheus. This can be used to get truthfulness etc. with the prompts in src/prompter.py. However, those prompts are not hooked up. After reviewing, Prometheus is less than optimal, since it can't be used with larger-context models.

Best is to use the same or a larger model for --verifier_model with these same prompts, but again they aren't hooked up to any UI or CLI options; one would need to modify h2oGPT a bit for that. One would need to tweak the eval.py code to (say) run over all the prompts for the verifier model and collect results.
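That "run over all prompts and collect results" step could be sketched roughly like this. Note this is a hypothetical harness, not actual eval.py code: `score_with_verifier` would be whatever call hits the --verifier_model endpoint, and is stubbed out here.

```python
def collect_verifier_results(records, score_fn):
    """Run a scoring function over each (question, answer, context) record
    and collect the results alongside the inputs."""
    results = []
    for rec in records:
        results.append({**rec, "score": score_fn(rec)})
    return results

def fake_score(rec):
    """Stub standing in for a real verifier-model call (e.g. truthfulness)."""
    return 1.0 if rec["answer"] else 0.0

rows = [{"question": "q1", "answer": "a1", "context": "c1"}]
print(collect_verifier_results(rows, fake_score))
```

A real version would replace `fake_score` with a request to the verifier model using the verification prompts from src/prompter.py.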

Lastly, RAGAS or other tools could be hooked up to the eval.py code, but again it's not done.

The simplest way to handle this is to set up h2oGPT as an OpenAI API server over a collection of documents, and have RAGAS talk to it.
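Since the proxy speaks the standard chat-completions protocol, talking to it needs no h2oGPT-specific client. A minimal stdlib sketch follows; the base URL, port, model name, and API key are placeholders, not values taken from h2oGPT's docs:

```python
import json
import urllib.request

def build_chat_payload(model, question):
    """Build a standard OpenAI chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }

def ask_h2ogpt(base_url, api_key, payload):
    """POST the payload to an OpenAI-compatible /v1/chat/completions endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_payload("h2ogpt-model", "What does the doc say about X?")
# Hypothetical host/port; check your h2oGPT launch logs for the real endpoint:
# ask_h2ogpt("http://localhost:5000", "EMPTY", payload)
```

The official `openai` Python client works the same way if you point its `base_url` at the proxy.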

SkaDS23 commented 7 months ago

Thank you for this detailed reply,

I'm currently testing h2oGPT with the OpenAI API setup (not yet with local models). By "RAGAS to talk to it", can you clarify this detail further, please? Could you also point me to where the RAG pipeline lives in the code, so I can understand it?

pseudotensor commented 7 months ago

I mean that by default h2oGPT runs an OpenAI proxy server, so you can talk to h2oGPT as if it were OpenAI by setting the right base_url and API key. See

https://github.com/h2oai/h2ogpt/blob/main/docs/README_CLIENT.md#openai-proxy-client-api

This includes "langchain_mode" to control which collection to talk to.
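In request terms, that means adding a "langchain_mode" field to the otherwise-standard chat-completions body. A small sketch, assuming the proxy reads this extra field to select the collection (see the README linked above for the authoritative details):

```python
def with_collection(payload, langchain_mode):
    """Return a copy of an OpenAI-style request body with a langchain_mode
    field added; assumed here to tell the h2oGPT proxy which document
    collection (e.g. "UserData") to query."""
    return {**payload, "langchain_mode": langchain_mode}

body = with_collection(
    {"model": "h2ogpt-model", "messages": [{"role": "user", "content": "hi"}]},
    "UserData",
)
```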

RAGAS also measures retrieval accuracy etc., so through that API alone it probably can't do everything it would want.
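For reference, RAGAS-style metrics typically consume records of question, generated answer, retrieved contexts, and ground truth, which is why the retrieved contexts matter. A stdlib sketch of assembling that structure (the sample data is invented, and the commented ragas calls are indicative only; the exact API varies by version):

```python
def make_ragas_record(question, answer, contexts, ground_truth):
    """Assemble one evaluation record in the shape RAGAS-style metrics expect."""
    return {
        "question": question,
        "answer": answer,            # answer returned by h2oGPT
        "contexts": contexts,        # retrieved chunks (list of strings)
        "ground_truth": ground_truth,
    }

records = [
    make_ragas_record(
        "What license is h2oGPT under?",
        "Apache 2.0.",
        ["h2oGPT is released under the Apache License 2.0."],
        "Apache License 2.0",
    )
]

# With ragas installed, evaluation looks roughly like (check your version's docs):
# from datasets import Dataset
# from ragas import evaluate
# from ragas.metrics import faithfulness, answer_relevancy
# scores = evaluate(Dataset.from_list(records),
#                   metrics=[faithfulness, answer_relevancy])
```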

SkanderBS23 commented 7 months ago

Thank you !

I'm already using h2oGPT that way, and will try to figure out how to implement RAGAS after getting a good grasp of the codebase (which is pretty long). Any idea where I can find the retrieved context / questions / answers in the code (I assume it's built with LangChain?), and the RAG pipeline as well?