ClemensGruber / climart-gptree


Make test / add logging to identify bottleneck regarding reaction time #10

Status: Open · opened 7 months ago by ClemensGruber

ClemensGruber commented 7 months ago

Currently (2024-01-10) we have a reaction time of 30-40 seconds between the end of ASR / STT and ChatGPT's text output (i.e. without / before TTS!). This is very long! For comparison: the OpenAI Android app reacts in < 5 seconds on the same request, sometimes even including ASR and TTS!! So what can we do to get close to this time, and what is the bottleneck in our implementation?
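
One way to find the bottleneck is to log the wall-clock time of each pipeline stage separately. A minimal sketch; the stage functions transcribe, fetch_sensor_data, and ask_chatgpt are hypothetical placeholders, not our actual code:

import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("gptree-timing")

@contextmanager
def timed(stage):
    # Log the wall-clock duration of one pipeline stage.
    start = time.perf_counter()
    try:
        yield
    finally:
        log.info("%s took %.2f s", stage, time.perf_counter() - start)

# Hypothetical stage functions -- replace with the real ones:
# with timed("ASR / STT"):
#     text = transcribe(audio)
# with timed("sensor data from hiveeyes"):
#     sensor_context = fetch_sensor_data()
# with timed("ChatGPT API call"):
#     reply = ask_chatgpt(text, sensor_context)

With that in place, the log should show directly which stage eats the 30-40 seconds.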

ClemensGruber commented 7 months ago

Initially I thought that the reaction time of the OpenAI ChatGPT API itself was the problem. To verify this, I tested a very "clean" ChatGPT API script on the RasPi (see below, via https://pimylifeup.com/raspberry-pi-chatgpt/) to measure the reaction time.

I got reaction times of about 5-7 seconds, both with the short prompt below and with our long Quitti prompt! So it seems that our bottleneck is not the OpenAI API.

For embedding the sensor data we read values from hiveeyes, and (I think) we do some math to calculate the amount of rain in the last week. So I think we should check this step before we optimize streaming (see #2); a timing sketch for it is below.
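
To rule this step in or out, we could time the hiveeyes fetch and the rain calculation in isolation. A minimal sketch; the URL and the "rain" field name are hypothetical placeholders, not our actual code:

import time
import requests

# Hypothetical endpoint -- replace with the real hiveeyes data URL.
SENSOR_URL = "https://example.org/hiveeyes/data.json"

start = time.perf_counter()
records = requests.get(SENSOR_URL, timeout=10).json()
print(f"sensor fetch took {time.perf_counter() - start:.2f} s")

start = time.perf_counter()
# Hypothetical rain aggregation over the returned records.
rain_last_week = sum(r.get("rain", 0) for r in records)
print(f"rain calculation took {time.perf_counter() - start:.2f} s")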

Plain testing code for OpenAI ChatGPT API:

from openai import OpenAI

# Create the API client with your secret key.
client = OpenAI(
    api_key = "your-secret-key"
)

# System prompt (in German): "You are playing the role of a funny quince
# tree standing in the Moabiter Stadtgarten. You never reveal that you
# are a chatbot; you always say that you are a quince tree."
messages = [
    {
        "role": "system",
        "content": "Du bist in der Rolle eines lustigen Quittenbaums, der im Moabiter Stadtgarten steht. Du verrätst niemals, dass Du ein Chatbot bist, sondern sagst immer, dass Du ein Quittenbaum bist."
    }
]

while True:
    # Read the user's next message from the terminal.
    message = input("du: ")

    messages.append(
        {
            "role": "user",
            "content": message
        },
    )

    # Send the whole conversation so far to the ChatGPT API.
    chat = client.chat.completions.create(
        messages=messages,
        model="gpt-3.5-turbo"
    )

    reply = chat.choices[0].message

    print("Quitti: ", reply.content)

    # Append the assistant reply so the context is kept for the next turn.
    messages.append(reply)
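
For comparison with #2, the same test script could also measure how streaming changes the perceived latency: with stream=True, the first token usually arrives well before the full completion. A minimal sketch, reusing client and messages from the script above:

import time

start = time.perf_counter()
stream = client.chat.completions.create(
    messages=messages,
    model="gpt-3.5-turbo",
    stream=True
)

first_token = None
parts = []
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        if first_token is None:
            # Time until the first token arrives (perceived latency).
            first_token = time.perf_counter() - start
        parts.append(delta)

print(f"first token after {first_token:.2f} s, "
      f"full reply after {time.perf_counter() - start:.2f} s")
print("Quitti: ", "".join(parts))

If the first token arrives after 1-2 seconds, streaming alone would already make the tree feel much more responsive, independent of the total completion time.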

ping @birdMarcus