
Migrate to OpenInference LangChain Instrumentation #2130

Closed mikeldking closed 1 month ago

dosubot[bot] commented 5 months ago

🤖

Based on the context provided, here's how you can migrate Arize Phoenix to use OpenInference LangChain Instrumentation:

  1. First, install the necessary dependencies by running the following command:
!pip install "langchain>=0.0.334" "openai>=1" arize-phoenix tiktoken nest-asyncio
  2. Next, launch Phoenix in the background to collect trace data emitted by any LangChain application that has been instrumented with the OpenInferenceTracer:

import phoenix as px

session = px.launch_app()
  3. Set your OpenAI API key if it is not already set as an environment variable, for example as sketched below.
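A minimal sketch (the prompt string is illustrative):

import os
from getpass import getpass

# read the key from the environment, prompting for it only if it is unset
if not os.getenv("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")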

  4. Now, build your LangChain application. This example uses a RetrievalQA chain over a pre-built index of the Arize documentation, but you can use whatever LangChain application you like; a minimal sketch follows.
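The sketch below is illustrative only: the in-memory Chroma store (which additionally requires the chromadb package) and the single stand-in document substitute for the pre-built Arize documentation index:

from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# stand-in corpus; in practice this would be the Arize documentation index
texts = ["Arize Phoenix is an open-source AI observability and evaluation tool."]
vectorstore = Chroma.from_texts(texts, embedding=OpenAIEmbeddings())

# a retrieval-augmented QA chain over the toy index
chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo"),
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
)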

  5. Instantiate an OpenInferenceTracer to store your data in OpenInference format. OpenInference is an open standard for capturing and storing LLM application traces that enables production LLM app servers to seamlessly integrate with LLM observability solutions such as Phoenix. Note that this example uses the LangChainInstrumentor to add the tracer to every chain in the application, but you can also add the tracer to individual chains.

from phoenix.trace.langchain import LangChainInstrumentor, OpenInferenceTracer

# create a tracer that records spans in OpenInference format
tracer = OpenInferenceTracer()
# instrument every LangChain chain in the application with the tracer
LangChainInstrumentor(tracer=tracer).instrument()
  6. You can now run your query engine and view your traces in Phoenix. Download a sample of queries commonly asked of the Arize documentation and run a few, as sketched below.
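Continuing the sketch above (the queries here are invented stand-ins for the downloadable sample):

# run a few illustrative queries through the instrumented chain
for query in [
    "How do I log predictions to Arize?",
    "What is embedding drift?",
]:
    chain.run(query)

# the resulting traces can be inspected in the Phoenix UI
print(f"Phoenix UI: {session.url}")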

  7. You can export your trace data as a pandas dataframe for further analysis and evaluation, as sketched below.
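A minimal sketch, assuming the session object from step 2 is still running:

# pull the recorded spans out of Phoenix as a pandas DataFrame
trace_df = session.get_spans_dataframe()
print(trace_df.head())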

  8. Finally, define your evaluation model and your evaluators. Evaluators are built on top of language models and prompt the LLM to assess the quality of responses, the relevance of retrieved documents, etc., providing a quality signal even in the absence of human-labeled data; a sketch follows below.
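As one hypothetical setup (the module path phoenix.experimental.evals, the evaluator name, and the choice of judge model are assumptions tied to the Phoenix version of this era and may differ in yours):

from phoenix.experimental.evals import (
    OpenAIModel,
    RelevanceEvaluator,
    run_evals,
)

# the judge model that scores each trace; gpt-4 is an illustrative choice
eval_model = OpenAIModel(model_name="gpt-4")

# prompts the judge to decide whether each retrieved document is relevant
relevance_evaluator = RelevanceEvaluator(eval_model)

# trace_df stands in for a dataframe shaped the way the evaluator expects
eval_dfs = run_evals(
    dataframe=trace_df,
    evaluators=[relevance_evaluator],
    provide_explanation=True,
)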

For more details on Phoenix, LLM Tracing, and LLM Evals, you can check out the documentation.

As for the specific dependencies and libraries required for OpenInference LangChain Instrumentation integration with Arize Phoenix, they are:

  1. arize
  2. langchain>=0.0.334
  3. litellm>=1.0.3
  4. llama-index>=0.9.14
  5. openai>=1.0.0
  6. nltk==3.8.1
  7. sentence-transformers==2.2.2
  8. pydantic<2
  9. requests
  10. protobuf==3.20
  11. responses
  12. tiktoken
  13. typing-extensions<4.6.0
  14. httpx
  15. respx
  16. nest-asyncio

These dependencies are listed under the [tool.hatch.envs.default] section in the pyproject.toml file.

Regarding Phoenix's current method of data collection and storage: it uses a combination of Python dataclasses and pandas DataFrames, with data persisted to Parquet files, a columnar storage format optimized for big-data processing frameworks. The OpenInference format is not specified in the provided context, so a direct comparison cannot be made.

I hope this helps! If you have any further questions, feel free to ask.


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.