alkem-io / virtual-contributor-engine-guidance

Guidance on usage of the Alkemio platform based on GenAI
European Union Public License 1.2

Proof of Concept: Alkemio GenAI-driven Chatbot

Introduction

The purpose of this proof of concept is to assess what is required to create a versatile, reliable and intuitive chatbot with which users can engage on Alkemio-related topics. The project is not deployable as-is, but should serve as valuable input for demonstrating generative AI capabilities and for assessing what is required to embed this functionality in the platform.

Approach

Large Language Models (LLMs) have improved significantly in recent years and are now ubiquitous and performant. This opens up many possibilities for their use in different areas. OpenAI is the best-known commercial provider of LLMs, but there is ample choice of models, both commercial and open source. Whilst this provides options, it also creates the risk of provider lock-in.

LLMs are just one component required for the practical implementation of generative AI solutions; many other 'building blocks' are necessary too. LangChain is a popular open source library that provides these building blocks and adds an abstraction layer on top of the individual providers, creating provider independence.
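
As a minimal sketch of this abstraction layer (assuming the classic LangChain Python API; exact imports and class names vary between library versions, and the providers and model names shown are illustrative), the same chain can be backed by different LLM providers:

# Provider-independence sketch: the chain is defined once and the backing
# LLM can be swapped without touching the rest of the code.
from langchain.llms import OpenAI, HuggingFaceHub
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["question"],
    template="Answer the following question about Alkemio:\n{question}",
)

llm = OpenAI(temperature=0)  # or: HuggingFaceHub(repo_id="google/flan-t5-large")
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(question="What are the key Alkemio concepts?"))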

Training an LLM is prohibitively expensive for most organisations, but for most practical implementations there is a need to incorporate organisation specific data. A common approach is to add specific context to a user question in the prompt that is submitted to the LLM. This poses a challenge, as LLMs generally only allow prompts of a finite size (typically around 4k tokens). It is therefore important that only the most relevant contextual information is included in the prompt, which means the organisation specific data needs to be identified, retrieved and added to the user question before it is submitted to the LLM, as sketched below.
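
To illustrate the prompt-size constraint, the sketch below fills a prompt with the most relevant context chunks until a token budget is reached. This is a hypothetical illustration, not the project's actual implementation: the ranked chunks, the token estimate and the 4,000 token budget are all assumptions.

# Fit the most relevant context chunks into a bounded prompt.
# Assumes chunks are already ranked by relevance (e.g. via embedding similarity).
def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 tokens per 3 words); a real tokenizer would be used in practice.
    return int(len(text.split()) * 4 / 3)

def build_prompt(question: str, ranked_chunks: list[str], budget: int = 4000) -> str:
    header = "Answer the question using only the context below.\n\nContext:\n"
    footer = "\n\nQuestion: " + question + "\nAnswer:"
    used = estimate_tokens(header + footer)
    selected = []
    for chunk in ranked_chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break  # stop before exceeding the model's prompt limit
        selected.append(chunk)
        used += cost
    return header + "\n---\n".join(selected) + footer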

Implementation

The project has been implemented as a container-based micro-service using a RabbitMQ RPC pattern. There is one RabbitMQ request queue:

The request payload consists of JSON with the following structure (example for a query):

{
    "data": {
        "userId": "userID",
        "question": "What are the key Alkemio concepts?",
        "language": "UK"
    },
    "pattern": {
        "cmd": "query"
    }
}

The operation types are:

The response is published in an auto-generated, exclusive, unnamed queue.
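
As a minimal sketch of this request/response flow (assumptions: the broker runs on localhost and REQUEST_QUEUE is a placeholder, as the actual request queue name is not listed here), a Python client using pika could look like this:

# RPC-style client sketch: publish a request and wait for the reply on an
# exclusive, auto-generated queue, matching the behaviour described above.
import json
import uuid
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
reply_queue = channel.queue_declare(queue="", exclusive=True).method.queue

payload = {
    "data": {
        "userId": "userID",
        "question": "What are the key Alkemio concepts?",
        "language": "UK",
    },
    "pattern": {"cmd": "query"},
}

channel.basic_publish(
    exchange="",
    routing_key="REQUEST_QUEUE",  # placeholder for the actual request queue name
    properties=pika.BasicProperties(reply_to=reply_queue, correlation_id=str(uuid.uuid4())),
    body=json.dumps(payload),
)

def on_response(ch, method, properties, body):
    print(json.loads(body))
    ch.stop_consuming()

channel.basic_consume(queue=reply_queue, on_message_callback=on_response, auto_ack=True)
channel.start_consuming()
connection.close()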

There is a draft implementation for the interaction language of the model (this needs significant improvement). If no language code is specified, English will be assumed. The choices are: 'EN': 'English', 'US': 'English', 'UK': 'English', 'FR': 'French', 'DE': 'German', 'ES': 'Spanish', 'NL': 'Dutch', 'BG': 'Bulgarian', 'UA': 'Ukrainian'.

Note: there is an earlier (outdated) RESTful implementation available at https://github.com/alkem-io/virtual-contributor-engine-guidance/tree/http-api

Docker

The following command can be used to build the container from the Docker CLI (the default architecture is amd64, so add --build-arg ARCHITECTURE=arm64 for arm64 builds):

docker build --build-arg ARCHITECTURE=arm64 --no-cache -t alkemio/virtual-contributor-engine-guidance:v0.4.0 .
docker build --no-cache -t alkemio/virtual-contributor-engine-guidance:v0.2.0 .

The Dockerfile has some self-explanatory configuration arguments.

The following command can be used to start the container from the Docker CLI:

docker run --name virtual-contributor-engine-guidance -v /dev/shm:/dev/shm --env-file .env virtual-contributor-engine-guidance

where .env is based on .azure-template.env. Alternatively, use docker-compose up -d.

You can find sample values for the required environment variables in .azure-template.env. Configure them and create a .env file with the updated settings.

Python & Poetry

The project requires Python & Poetry to be installed. The minimum version dependencies can be found in pyproject.toml. After installing Python & Poetry, the typical workflow is sketched below.
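
As a sketch of the usual Poetry workflow (the exact entry point of the service is not listed here, so <entry point> is a placeholder): install the dependencies with poetry install and start the service with poetry run python <entry point>.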

Linux

The project requires Python 3.11 as a minimum and needs Go and Hugo installed for creating a local version of the website. See the Go and Hugo documentation for installation instructions (only needed when running outside the container).

Outstanding

The following tasks are still outstanding: