Based on the context provided, here's how you can migrate Arize Phoenix to use OpenInference LangChain Instrumentation. First, launch Phoenix in the background to collect trace data emitted by any LangChain application that has been instrumented with the OpenInferenceTracer. Here's how you can do it:
import phoenix as px

session = px.launch_app()
You need to set your OpenAI API key if it is not already set as an environment variable.
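For illustration, this check can be wrapped in a small helper that only sets the key when the environment variable is absent. The helper name `ensure_openai_key` is ours for this sketch, not part of Phoenix or OpenAI's SDK:

```python
import os

def ensure_openai_key(key=None):
    """Set OPENAI_API_KEY from `key` only if it is not already present in the
    environment; return True when a key is available afterwards.
    (Hypothetical helper for illustration, not a Phoenix API.)"""
    if "OPENAI_API_KEY" not in os.environ and key:
        os.environ["OPENAI_API_KEY"] = key
    return "OPENAI_API_KEY" in os.environ
```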
Now, you can build your LangChain application. This example uses a RetrievalQA chain over a pre-built index of the Arize documentation, but you can use whatever LangChain application you like.
Instantiate an OpenInferenceTracer to store your data in OpenInference format. OpenInference is an open standard for capturing and storing LLM application traces, and it enables production LLM app servers to integrate seamlessly with LLM observability solutions such as Phoenix. Note that we are using the LangChainInstrumentor to add the tracer to every chain in our application, but you can also add the tracer to individual chains.
You can now run your query engine and view your traces in Phoenix. Download a sample of queries commonly asked of the Arize documentation and run a few queries.
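The query loop can be sketched as follows; `chain` is a stand-in for whatever LangChain chain you built above, and the sample questions here are invented for illustration rather than taken from the Arize query sample:

```python
def run_queries(chain, queries):
    """Run each query through a LangChain-style chain and collect responses.
    `chain` is any object exposing a `run(query)` method."""
    return [chain.run(q) for q in queries]

sample_queries = [
    "How do I instrument a LangChain application?",
    "What format does Phoenix use to store traces?",
]
```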
You can export your trace data as a pandas dataframe for further analysis and evaluation.
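Once exported (for example via the running session's span-export API), the trace data is an ordinary pandas DataFrame; the column names below are assumptions for illustration, not Phoenix's actual schema:

```python
import pandas as pd

# Toy DataFrame with a span-like shape; a real export would come from the
# Phoenix session rather than being constructed by hand.
trace_df = pd.DataFrame(
    {
        "span_kind": ["chain", "retriever", "llm"],
        "latency_ms": [1200.0, 150.0, 900.0],
    }
)

# Quick look at where time is spent, by span kind.
latency_by_kind = trace_df.groupby("span_kind")["latency_ms"].mean()
```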
Finally, define your evaluation model and your evaluators. Evaluators are built on top of language models and prompt the LLM to assess the quality of responses, the relevance of retrieved documents, etc., and provide a quality signal even in the absence of human-labeled data.
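The idea can be sketched with a library-agnostic, LLM-as-judge relevance evaluator; the prompt template and labels below are ours, not Phoenix's, and `model` is any callable mapping a prompt string to a completion string:

```python
# Illustrative prompt template for judging document relevance.
RELEVANCE_TEMPLATE = (
    "You are comparing a reference document to a question.\n"
    "Question: {query}\n"
    "Document: {document}\n"
    "Answer with a single word, 'relevant' or 'irrelevant'."
)

def evaluate_relevance(model, query, document):
    """Ask the model to judge relevance; return True iff it answers 'relevant'.
    `model` is any prompt -> completion callable (an LLM in practice)."""
    prompt = RELEVANCE_TEMPLATE.format(query=query, document=document)
    label = model(prompt).strip().lower()
    return label == "relevant"
```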
For more details on Phoenix, LLM Tracing, and LLM Evals, you can check out the documentation.
The specific dependencies and libraries required for the OpenInference LangChain Instrumentation integration with Arize Phoenix are:
arize
langchain>=0.0.334
litellm>=1.0.3
llama-index>=0.9.14
openai>=1.0.0
nltk==3.8.1
sentence-transformers==2.2.2
pydantic<2
requests
protobuf==3.20
responses
tiktoken
typing-extensions<4.6.0
httpx
respx
nest-asyncio
These dependencies are listed under the [tool.hatch.envs.default] section in the pyproject.toml file.
Regarding the current method of data collection and storage in Arize Phoenix, it uses a combination of Python dataclasses and pandas DataFrames, with the data persisted as parquet files, a columnar storage format optimized for big-data processing frameworks. The OpenInference format itself is not specified in the provided context, so a direct comparison cannot be made.
I hope this helps! If you have any further questions, feel free to ask.