epam / ai-dial-analytics-realtime

Realtime analytics server for AI DIAL. The service consumes the logs stream AI DIAL Core, analyzes the conversation and writes the analytics to InfluxDB
https://epam-rail.com
Apache License 2.0
8 stars 3 forks source link
ai-dial llm

Overview

Realtime analytics server for AI DIAL. The service consumes the logs stream AI DIAL Core, analyzes the conversation and writes the analytics to the InfluxDB.

Usage

Check the AI DIAL Core documentation to configure the way to send the logs to the instance of the realtime analytics server.

The realtime analytics server analyzes the logs stream in the realtime and writes the metric analytics to the InfluxDB with the following data:

Tag Description
model The model name for the completion request.
deployment The deployment name of the model or application for the completion request.
project_id The project ID for the completion request
language The language of the conversation detected by the messages content.
topic The topic of the conversation detected by the messages content.
title The title of the person making the request.
response_id Unique ID of this response.
Field Description
user_hash The unique hash for the user.
price The calculated price of the request.
number_request_messages The total number of messages in history for this request.
chat_id The unique ID of this convestation.
prompt_tokens The number of tokens in the prompt including conversation history and the current message
completion_tokens The number of completion tokens generated for this request

Configuration

Copy .env.example to .env and customize it for your environment.

Connection to the InfluxDB

You need to specify the connection options to the InfluxDB instance using the environment variables: Variable Description
INFLUX_URL Url to the InfluxDB to write the analytics data
INFLUX_ORG Name of the InfluxDB organization to write the analytics data
INFLUX_BUCKET Name of the bucket to write the analytics data
INFLUX_API_TOKEN InfluxDB API Token

You can follow the InfluxDB documentation to setup InfluxDB locally and acquire the required configuration parameters.

Other configuration

Also, following environment valuables can be used to configure the service behavior:

Variable Default Description
MODEL_RATES {} Specifies per-token price rates for models in JSON format
TOPIC_MODEL ./topic_model Specifies the name or path for the topic model. If the model is specified by name, it will be downloaded from, the Huggingface.
TOPIC_EMBEDDINGS_MODEL None Specifies the name or path for the embeddings model used with the topic model. If the model is specified by name, it will be downloaded from, the Huggingface. If None, the name will be used from the topic model config.

Example of the MODEL_RATES configuration:

{
    "gpt-4": {
        "unit":"token",
        "prompt_price":"0.00003",
        "completion_price":"0.00006"
    },
    "gpt-35-turbo": {
        "unit":"token",
        "prompt_price":"0.0000015",
        "completion_price":"0.000002"
    },
    "gpt-4-32k": {
        "unit":"token",
        "prompt_price":"0.00006",
        "completion_price":"0.00012"
    },
    "text-embedding-ada-002": {
        "unit":"token",
        "prompt_price":"0.0000001"
    },
    "chat-bison@001": {
        "unit":"char_without_whitespace",
        "prompt_price":"0.0000005",
        "completion_price":"0.0000005"
    }
}

Developer environment

This project uses Python>=3.11 and Poetry>=1.6.1 as a dependency manager. Check out Poetry's documentation on how to install it on your system before proceeding.

To install requirements:

poetry install

This will install all requirements for running the package, linting, formatting and tests.

Build

To build the wheel packages run:

make build

Run

To run the development server locally run:

make serve

The server will be running as http://localhost:5001

Docker

To build the docker image run:

make docker_build

To run the server locally from the docker image run:

make docker_serve

The server will be running as http://localhost:5001

Lint

Run the linting before committing:

make lint

To auto-fix formatting issues run:

make format

Test

Run unit tests locally:

make test

Clean

To remove the virtual environment and build artifacts:

make clean