timescale / pgai

Bring AI models closer to your PostgreSQL data
PostgreSQL License

add use case document to better present how to use pgai to combine the advantages of pg and ai #37

Open Wang-Yong2018 opened 1 week ago

Wang-Yong2018 commented 1 week ago
### Tasks
- [ ] 1. RAG for customer dataset
Wang-Yong2018 commented 1 week ago

Asking the question and answering it myself.

Use cases:

1. RAG for customer dataset

Purpose

Apart from simply chatting with a large language model, customers can use an LLM to build a question-and-answer system based on three components.

Scenario 1: find out why a pizza shop's orders dropped by looking at customer comments.

1 Final business report delivery

With customer feedback (private data) embedded from text into numeric vectors, generate a business report with Google Gemini (a large language model). A nice business report was produced; the demo report screen captures were attached in the original issue.

Note: the large language model may generate many formats; these are just two chat_completion outputs with pretty formatting.

2 tools

- Cloud large language model: we use Google Gemini Pro 1.5.

- Embedding function, called in pgai. This function asks the model to convert text into an embedding. Note: because the Ollama API is not compatible with OpenAI's, we must use the Ollama embedding API; the ollama Python library therefore has to be installed in the pgai image in advance.

```sql
CREATE OR REPLACE FUNCTION public.ollama_embedding(
    _model text,
    _input text,
    _user text DEFAULT NULL::text,
    _dimensions integer DEFAULT NULL::integer)
RETURNS vector
LANGUAGE plpython3u
PARALLEL SAFE
SET search_path TO 'pg_catalog', 'pg_temp'
AS $function$
from ollama import Client

# Call the Ollama embedding API on the local Ollama server.
client = Client(host='http://172.16.14.46:11434')
embedding_dict = client.embeddings(model=_model, prompt=_input)

# PL/Python converts the returned list of floats into the vector type.
return embedding_dict['embedding']
$function$;
```
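As a side note, PL/Python hands the returned Python list back through its text form, which pgvector then parses. A minimal sketch of that text representation (the helper name is hypothetical, not part of pgai or pgvector):

```python
# Hypothetical helper illustrating the pgvector text literal that a Python
# list of floats corresponds to when returned as a `vector` value.
def to_pgvector_literal(values):
    """Format floats as a pgvector text literal, e.g. '[0.1,-0.25,0.5]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

embedding = [0.1, -0.25, 0.5]
print(to_pgvector_literal(embedding))  # [0.1,-0.25,0.5]
```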

- Simple completion function, called in pgai. This function helps generate the final report; it was written up in another OpenRouter API issue, #33.

3 data

3.1 fake data - customer feedback


```sql
CREATE TABLE public.t_embeddings_dmeta (
    id bigserial NOT NULL,
    title text NOT NULL,
    "content" text NOT NULL,
    url text NULL,
    model_name text NOT NULL,
    ntoken int4 NULL,
    nlength int4 NULL,
    embedding public.vector NOT NULL,
    CONSTRAINT t_embeddings_dmeta_pkey PRIMARY KEY (id)
);

INSERT INTO public.t_embeddings_dmeta (title,"content",url,model_name) VALUES
     ('Review pizza','The best pizza I''ve ever eaten. The sauce was so tangy!',NULL,'shaw/dmeta-embedding-zh:latest'),
     ('Review pizza','The pizza was disgusting. I think the pepperoni was made from rats.',NULL,'shaw/dmeta-embedding-zh:latest'),
     ('Review pizza','I ordered a hot-dog and was given a pizza, but I ate it anyway.',NULL,'shaw/dmeta-embedding-zh:latest'),
     ('Review pizza','I hate pineapple on pizza. It is a disgrace. Somehow, it worked well on this pizza though.',NULL,'shaw/dmeta-embedding-zh:latest'),
     ('Review pizza','I ate 11 slices and threw up. The pizza was tasty in both directions.',NULL,'shaw/dmeta-embedding-zh:latest');
```

A screen capture of the inserted data was attached in the original issue.

4 Process

4.1 Retrieve embeddings for all demo data

The code below converts the text of customer feedback about pizza into embedded vectors and creates a pgvector index to speed up the question query. A screen capture of one embedded record was attached in the original issue.

The SQL code:

```sql
WITH tmp AS (
    SELECT
        tt.title,
        tt.url,
        tt."content",
        'shaw/dmeta-embedding-zh:latest'::text AS model_name,
        ollama_embedding('shaw/dmeta-embedding-zh:latest'::text, tt."content") AS embedding
    FROM t_demo_text AS tt
)
INSERT INTO t_embeddings_dmeta (title, "content", url, model_name, embedding)
SELECT title, content, url, model_name, embedding
FROM tmp;

-- Index for cosine-distance nearest-neighbour search.
CREATE INDEX t_embeddings_dmeta_embedding_idx
    ON public.t_embeddings_dmeta
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = '5');
```
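The back-fill step above can be sketched outside the database. Here `fake_embed` is a deterministic stand-in for the `ollama_embedding` call, an assumption purely for illustration; a real pipeline would call the Ollama server instead:

```python
# Stand-in for ollama_embedding(): a deterministic toy embedding built from a
# character-code histogram folded into `dim` buckets (illustration only).
def fake_embed(text, dim=8):
    vec = [0.0] * dim
    for ch in text:
        vec[ord(ch) % dim] += 1.0
    return vec

reviews = [
    "The best pizza I've ever eaten. The sauce was so tangy!",
    "The pizza was disgusting. I think the pepperoni was made from rats.",
]

# Mirror the columns inserted into t_embeddings_dmeta.
rows = [
    {"title": "Review pizza",
     "content": text,
     "model_name": "shaw/dmeta-embedding-zh:latest",
     "embedding": fake_embed(text)}
    for text in reviews
]

print(len(rows))  # 2
```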

4.2 Ask a question and find the top-3 most similar customer feedback

Here is the magic part: the question is converted to an embedding and compared with the stored feedback embeddings, listing the top 3. A screen capture of the result was attached in the original issue. The SQL code:

```sql
WITH question AS (
    SELECT
        'why customer sells dropped?' AS question,
        'shaw/dmeta-embedding-zh:latest' AS model
),
emb_question AS (
    SELECT question, ollama_embedding(model, question) AS embedding
    FROM question
)
-- <=> is cosine distance, matching the vector_cosine_ops index above
-- (<-> would be Euclidean distance, which that index cannot serve).
SELECT question, content, te1.embedding <=> te2.embedding AS distance
FROM t_embeddings_dmeta te1
CROSS JOIN emb_question te2
ORDER BY te1.embedding <=> te2.embedding
LIMIT 3;
```
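For intuition, the ranking this query performs can be reproduced in plain Python. The 3-dimensional vectors below are made up for illustration; real embeddings have hundreds of dimensions:

```python
import math

def cosine_distance(a, b):
    """pgvector's <=> operator: 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

question = [1.0, 0.0, 1.0]          # toy embedding of the user question
feedback = {                        # toy embeddings of stored feedback
    "hot-dog given a pizza": [1.0, 0.1, 0.9],
    "ate 11 slices": [0.2, 1.0, 0.1],
    "pineapple on pizza": [0.9, 0.0, 1.1],
}

# ORDER BY distance, LIMIT 3: smallest cosine distance = most similar.
top = sorted(feedback, key=lambda k: cosine_distance(question, feedback[k]))[:3]
print(top[0])  # hot-dog given a pizza
```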

4.3 final report

A screen capture of the final report was attached in the original issue.

Send the user question and the most similar customer feedback to the large language model (Google Gemini 1.5) with the prompt embedded in the SQL below.


The SQL code:

```sql
WITH response AS (
    SELECT simple_chat_complete(
        'https://openrouter.ai/api/v1',
        'sk-or-v1-replace_with_your_api_key',
        'google/gemini-pro-1.5',
        '//1. requirements:
1.1 generate a business report to answer user question with provided data.
1.2 The report should be markdown format and less than 300 words
// 2. data
I ordered a hot-dog and was given a pizza, but I ate it anyway.
I ate 11 slices and threw up. The pizza was tasty in both directions.
I hate pineapple on pizza. It is a disgrace. Somehow, it worked well on this pizza though.
// 3. user question
why pizza sells order dropped? list customer claim?
',
        FALSE) AS response_json
)
-- ->> extracts the message content as text (without surrounding JSON quotes),
-- then literal "\n" sequences are replaced with real newlines.
SELECT replace(response_json -> 'choices' -> 0 -> 'message' ->> 'content',
               '\n', chr(10)) AS response_content
FROM response;
```
WangYong 2024-06-26