explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0
6.56k stars 643 forks

Integrating third-party LLMs for Evaluating Chinese-native RAGs #1188

Open hurenjun opened 1 month ago

hurenjun commented 1 month ago

Hi there,

Thank you for bringing the elegant RAG Assessment framework to the community.

I am an AI engineer from Alibaba Cloud, and our team has been fine-tuning LLM-as-a-Judge models based on the Qwen foundation LLMs. Through extensive optimizations, our latest model has achieved GPT-4 level alignment with human preferences (indeed, it's performing approximately 5% better on our benchmarks) and it is particularly optimized for Chinese language support.

We are very interested in integrating our model as an evaluation LLM within RAGAS. Additionally, we would be happy to add support for LLMs hosted on Alibaba Cloud's LLM serving platform, EAS, as an extension to the current support for AWS, Azure, and Google Vertex AI.

Please let me know if these contributions could be included in RAGAS.

I look forward to your response.

Best regards, Renjun

jjmachan commented 1 month ago

@hurenjun would love to explore this further! What would be the best way forward you have in mind?

Would the models be open-sourced, by the way? I think that would be really helpful for the Chinese userbase, because we had a few models that were not supported/working as expected.

hurenjun commented 1 month ago

Hi @jjmachan, we could offer two methods, following the existing practices of ragas, for integrating our models:

  1. We have supported users in accessing our judge models via an API key. We can integrate this approach into RAGAS to facilitate quick access for users, similar to the current OpenAI/Together models.
  2. We will enable users to deploy our latest judge models on their own resources within Alibaba Cloud. We can assist in implementing support for Alibaba Cloud's EAS as an addition to the currently integrated LLM services from Azure, AWS, and Vertex.

In addition, we will continue to iterate on our models and will open-source them as part of our contribution to the community. Currently, we have a paper under double-blind review. I will figure out the best way to do this without violating the anonymity requirement.

hurenjun commented 3 weeks ago

@jjmachan Hi, is there any feedback on the above proposal?

landhu commented 3 weeks ago

@hurenjun I think you can check https://docs.ragas.io/en/stable/howtos/customisations/bring-your-own-llm-or-embs.html Also, the Aliyun API is compatible with OpenAI's, so you could just replace the base_url.
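To illustrate the base_url swap: an OpenAI-compatible endpoint accepts the same `/chat/completions` path and JSON payload as OpenAI, so only the host portion changes. Below is a minimal stdlib sketch of how such a request is assembled; the DashScope compatible-mode URL and the `qwen-plus` model name are assumptions for illustration — check Alibaba Cloud's documentation for the exact values.

```python
import json
from urllib import request

# Assumed OpenAI-compatible base URL for Alibaba Cloud's DashScope service;
# verify the exact URL for your account/region in the official docs.
BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1"


def build_chat_request(base_url, api_key, model, messages):
    """Build an OpenAI-style chat-completions request against any
    compatible base_url. Only the host changes between providers;
    the path, headers, and payload shape stay the same as OpenAI's."""
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same call works for OpenAI or Aliyun by changing BASE_URL only.
req = build_chat_request(
    BASE_URL, "sk-...", "qwen-plus",
    [{"role": "user", "content": "hello"}],
)
```

In practice you would not build requests by hand — an OpenAI-style client (or ragas's bring-your-own-LLM wrapper linked above) takes a `base_url`/`api_key` pair and handles the rest — but this shows why compatibility makes the swap a one-line change.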

hurenjun commented 3 weeks ago

@landhu Thank you for your comment. That should work. I am also thinking that ragas could maintain a set of third-party LLMs for evaluation and other purposes, like those on Hugging Face, in addition to the OpenAI models.

jjmachan commented 1 week ago

hey @hurenjun sorry about the delay, but we are working on #1237, which will add tabs for popular LLM providers to the getting-started page. This should make it easier to use.

In the meantime we could maybe write a dedicated notebook in the how-to guides as well, if that would help users too - what do you think?

hurenjun commented 1 week ago

@jjmachan That's gonna be very helpful, especially for developers in regions without direct access to OpenAI models. Looking forward to the new feature.

jjmachan commented 1 week ago

will keep you posted 🙂