Closed mujhenahiata closed 7 months ago
@mujhenahiata You can use LiteLLM to select a model from Hugging Face for evals as well. See the usage here: https://docs.arize.com/phoenix/api/evaluation-models#phoenix.experimental.evals.litellmmodel
and the LiteLLM Hugging Face provider docs here: https://docs.litellm.ai/docs/providers/huggingface
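As a small sketch of how the pieces fit together: LiteLLM addresses Hugging Face models with a `huggingface/` prefix on the model string (per the linked provider docs). The helper below just builds that string; the Phoenix side is shown in comments because it needs the `phoenix` package installed, and the import path and model repo used are assumptions based on the linked docs.

```python
# Sketch: selecting a Hugging Face model for Phoenix evals via LiteLLM.
# LiteLLM routes model strings of the form "huggingface/<repo_id>" to the
# Hugging Face inference backend (see the linked LiteLLM provider docs).

def hf_model_string(repo_id: str) -> str:
    """Build the LiteLLM model identifier for a Hugging Face repo."""
    return f"huggingface/{repo_id}"

# Usage with Phoenix (assumption: import path per the linked Phoenix docs,
# and the model repo is only an example):
#   from phoenix.experimental.evals import LiteLLMModel
#   model = LiteLLMModel(
#       model_name=hf_model_string("mistralai/Mistral-7B-Instruct-v0.1")
#   )
```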
Here is a notebook with an example Hugging Face model: https://github.com/Arize-ai/phoenix/blob/ff6df1fc01f0986357a9e20e0441a3c15697a5fa/examples/using_llamaindex_with_huggingface_models.ipynb#L4
🤖
Hey @mujhenahiata! Good to see you back here. How's your day going?
Yes, the Arize Phoenix repository does provide support for evaluating Hugging Face models, specifically in the context of Retrieval-Augmented Generation (RAG). The repository includes a tutorial that demonstrates how to build a RAG pipeline and evaluate it with Phoenix Evals.
The evaluation process includes generating question-context pairs, performing retrieval evaluations, and computing retrieval metrics such as Normalized Discounted Cumulative Gain (NDCG) and precision. The results of the evaluations are then sent to Phoenix for visualization.
The tutorial includes a code snippet that installs the necessary packages and sets up the environment for the evaluation.
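The tutorial's setup snippet is not reproduced here; a minimal sketch might look like the following. Assumptions: the package names are taken from the Phoenix RAG-evaluation tutorial, and OpenAI is used as the eval model (a Hugging Face model can be swapped in via LiteLLM as noted above).

```python
# Environment setup sketch.
# Install, in a notebook cell (assumption: package names per the tutorial):
#   %pip install "arize-phoenix[evals]" llama-index

import os
from getpass import getpass

def ensure_api_key(var: str = "OPENAI_API_KEY") -> str:
    """Return the eval model's API key, prompting for it if unset."""
    key = os.environ.get(var)
    if not key:
        key = getpass(f"Enter {var}: ")
        os.environ[var] = key
    return key
```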
And it includes a snippet that evaluates the RAG system's retrieval performance.
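That snippet is likewise not reproduced here. As a self-contained illustration of the retrieval metrics mentioned above (NDCG and precision at a cutoff k), computed from per-query binary relevance labels — note this stands in for the tutorial's Phoenix calls rather than showing Phoenix's API:

```python
import math

def ndcg_at_k(relevances: list[float], k: int) -> float:
    """NDCG@k for one query, given relevance labels in retrieved order."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
    # The ideal DCG ranks the most relevant contexts first.
    ideal = sorted(relevances, reverse=True)[:k]
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

def precision_at_k(relevances: list[float], k: int) -> float:
    """Fraction of the top-k retrieved contexts labeled relevant."""
    return sum(1 for r in relevances[:k] if r > 0) / k if k > 0 else 0.0
```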
You can find more details in the evaluate_rag.ipynb tutorial in the Arize Phoenix repository.
Please let me know if you need further assistance or have any other questions.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.