grafana / grafana-openai-monitoring

Python and Node.js packages that monitor OpenAI API calls and send OpenAI usage metrics and logs to Grafana Cloud
GNU General Public License v3.0

OpenAI async support #15

Open bcordo opened 8 months ago

bcordo commented 8 months ago

@ishanjainn

What happened? I'm following this guide: https://grafana.com/blog/2023/11/02/monitor-your-openai-usage-with-grafana-cloud/ and it works great with the normal synchronous OpenAI client (`from openai import OpenAI; client = OpenAI(api_key=OPENAI_API_KEY)`), but it doesn't work with the asynchronous client (`from openai import AsyncOpenAI; aclient = AsyncOpenAI(api_key=OPENAI_API_KEY)`).

I get the error: `AttributeError: 'coroutine' object has no attribute 'usage'`, raised from `response = await aclient.chat.completions.create(...)` in `async_call_chatgpt`.

It has to do with the fact that internally the wrapper doesn't use await, so it isn't expecting a coroutine.
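The failure mode can be reproduced without the library at all. Below is a minimal sketch (the `monitor` and `create` names here are hypothetical stand-ins, not the library's actual code): wrapping a coroutine function in a plain synchronous wrapper hands the wrapper a coroutine object, which has no `usage` attribute.

```python
async def create(**kwargs):
    """Stand-in for AsyncCompletions.create: a coroutine function."""
    class Usage:
        prompt_tokens = 3

    class Response:
        usage = Usage()

    return Response()

def monitor(func):
    """A sync wrapper in the style of the pre-0.0.8 monitor: it never awaits func."""
    def wrapper(*args, **kwargs):
        response = func(*args, **kwargs)      # this is a coroutine object, not a response
        return response.usage.prompt_tokens   # AttributeError: no .usage on a coroutine
    return wrapper

wrapped = monitor(create)
try:
    wrapped(model="gpt-3.5-turbo")
    msg = None
except AttributeError as exc:
    msg = str(exc)

print(msg)  # 'coroutine' object has no attribute 'usage'
```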

What was expected to happen? I was expecting it to also work, or to have access to an async version of the monitoring code.

Steps to reproduce the problem:

  1. Run:

         from openai import AsyncOpenAI
         aclient = AsyncOpenAI(api_key=OPENAI_API_KEY)
         aclient.chat.completions.create = chat_v2.monitor(
             aclient.chat.completions.create,
             metrics_url="[redacted]",
             metrics_username=[redacted],
             logs_username=[redacted],
             access_token="[redacted]",
         )

  2. Run:

         response = await aclient.chat.completions.create(
             model=model,
             temperature=temperature,
             max_tokens=max_tokens,
             n=max_responses,
             top_p=top_p,
             frequency_penalty=frequency_penalty,
             presence_penalty=presence_penalty,
             messages=messages,
             stream=stream,
         )

  3. Get the error: `AttributeError: 'coroutine' object has no attribute 'usage'`, raised from `response = await aclient.chat.completions.create(...)` in `async_call_chatgpt`.

Version numbers (grafana, prometheus, graphite, plugins, operating system, etc.): Mac, grafana cloud, OpenAI plugin

ishanjainn commented 8 months ago

Hey @bcordo, thanks for raising this issue. I have added support for AsyncOpenAI; it should be available in Python library version 0.0.8. To use it with AsyncOpenAI, you'll now need to pass a flag to the function, use_async, and set its value to True. Here's a snippet:

from openai import OpenAI
import asyncio
from grafana_openai_monitoring import chat_v2

client = OpenAI(
    api_key="sk-***",
)

# Apply the custom decorator to the OpenAI API function
client.chat.completions.create = chat_v2.monitor(
    client.chat.completions.create,
    metrics_url="https://prometheus.grafana.net/api/prom",  # Example: "https://prometheus.grafana.net/api/prom"
    logs_url="https://logs.grafana.net/loki/api/v1/push",  # Example: "https://logs.example.com/loki/api/v1/push/"
    metrics_username=123456,  # Example: "123456"
    logs_username=987654,  # Example: "987654"
    access_token="glc_ey....",
    use_async=True,  # Set to True if the function is async
)

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Say this is a test",
            }
        ],
        model="gpt-3.5-turbo",
    )

    print(chat_completion)

asyncio.run(main())

Please feel free to reopen the issue if you still encounter issues

bcordo commented 8 months ago

Wow @ishanjainn that was super fast! I appreciate it, you're awesome. Thanks.

bcordo commented 8 months ago

Hi @ishanjainn. Thanks for your fast response. I ran your sample code:

import os
from openai import OpenAI
import asyncio
from grafana_openai_monitoring import chat_v2
from dotenv import load_dotenv, find_dotenv

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
DEEPL_AUTH_KEY = os.getenv("DEEPL_AUTH_KEY")
GRAFANA_API_KEY = os.getenv("GRAFANA_API_KEY")
GRAFANA_METRICS_URL = os.getenv("GRAFANA_METRICS_URL")
GRAFANA_LOGS_URL = os.getenv("GRAFANA_LOGS_URL")
GRAFANA_METRICS_USERNAME = os.getenv("GRAFANA_METRICS_USERNAME")
GRAFANA_LOGS_USERNAME = os.getenv("GRAFANA_LOGS_USERNAME")

client = OpenAI(api_key=OPENAI_API_KEY)

# Apply the custom decorator to the OpenAI API client functions to measure usage
client.chat.completions.create = chat_v2.monitor(
    client.chat.completions.create,
    metrics_url=GRAFANA_METRICS_URL,
    logs_url=GRAFANA_LOGS_URL,
    metrics_username=GRAFANA_METRICS_USERNAME,
    logs_username=GRAFANA_LOGS_USERNAME,
    access_token=GRAFANA_API_KEY,
)

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Say this is a test",
            }
        ],
        model="gpt-3.5-turbo",
    )

    print(chat_completion)

asyncio.run(main())

But I get the error:

Traceback (most recent call last):
  File "test.py", line 48, in <module>
    asyncio.run(main())
  File ".conda/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "test.py", line 36, in main
    chat_completion = await client.chat.completions.create(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: object ChatCompletion can't be used in 'await' expression

I thought there may have been a typo so I reran using AsyncOpenAI instead of OpenAI:

client = AsyncOpenAI(
    api_key=OPENAI_API_KEY,
)

And I get the error:

Traceback (most recent call last):
  File "test.py", line 48, in <module>
    asyncio.run(main())
  File ".conda/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "test.py", line 36, in main
    chat_completion = await client.chat.completions.create(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/site-packages/grafana_openai_monitoring/chat_v2.py", line 161, in wrapper
    response.usage.prompt_tokens,
    ^^^^^^^^^^^^^^
AttributeError: 'coroutine' object has no attribute 'usage'
sys:1: RuntimeWarning: coroutine 'AsyncCompletions.create' was never awaited

And just to make sure I was using the updated version, which seems right:

pip show grafana_openai_monitoring 
Name: grafana-openai-monitoring
Version: 0.0.8
Summary: Library to monitor your OpenAI usage and send metrics and logs to Grafana Cloud
Home-page: https://github.com/grafana/grafana-openai-monitoring
Author: Ishan Jain
Author-email: ishan.jain@grafana.com
License: 
Location: .conda/lib/python3.11/site-packages
Requires: requests

Maybe I am just missing something here.

ishanjainn commented 8 months ago

use_async=True

This seems to be missing.

bcordo commented 8 months ago

@ishanjainn Thanks again for the quick response. Yes, that worked, thanks. The problem was that I had added use_async=True in my main script, which still had the error, and now I see what the difference is: I need to use the streaming API, and when you add stream=True you get the error. In detail:

import os
import asyncio

from openai import OpenAI
from grafana_openai_monitoring import chat_v2
from dotenv import load_dotenv

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
DEEPL_AUTH_KEY = os.getenv("DEEPL_AUTH_KEY")
GRAFANA_API_KEY = os.getenv("GRAFANA_API_KEY")
GRAFANA_METRICS_URL = os.getenv("GRAFANA_METRICS_URL")
GRAFANA_LOGS_URL = os.getenv("GRAFANA_LOGS_URL")
GRAFANA_METRICS_USERNAME = os.getenv("GRAFANA_METRICS_USERNAME")
GRAFANA_LOGS_USERNAME = os.getenv("GRAFANA_LOGS_USERNAME")

client = OpenAI(api_key=OPENAI_API_KEY)

# Apply the custom decorator to the OpenAI API client functions to measure usage
client.chat.completions.create = chat_v2.monitor(
    client.chat.completions.create,
    metrics_url=GRAFANA_METRICS_URL,
    logs_url=GRAFANA_LOGS_URL,
    metrics_username=GRAFANA_METRICS_USERNAME,
    logs_username=GRAFANA_LOGS_USERNAME,
    access_token=GRAFANA_API_KEY,
    use_async=True,
)

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Say this is a test",
            }
        ],
        model="gpt-3.5-turbo",
        stream=True,
    )

    for chunk in chat_completion:
        current_content = chunk.choices[0].delta.content
        print(current_content)

asyncio.run(main())
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "test.py", line 51, in <module>
    asyncio.run(main())
  File ".conda/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "test.py", line 37, in main
    chat_completion = await client.chat.completions.create(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".conda/lib/python3.11/site-packages/grafana_openai_monitoring/chat_v2.py", line 65, in async_wrapper
    response.usage.prompt_tokens,
    ^^^^^^^^^^^^^^
AttributeError: 'Stream' object has no attribute 'usage'

I guess the problem is as reported in How_to_stream_completions.ipynb:

Another small drawback of streaming responses is that the response no longer includes the usage field to tell you how many tokens were consumed. After receiving and combining all of the responses, you can calculate this yourself using tiktoken.

Any good solutions to this? Thanks again.
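Until streaming is supported by the library, the notebook's workaround can be sketched as follows (the chunk dicts below are simulated stand-ins for ChatCompletionChunk objects, and the tiktoken step is left as a comment since it needs the package installed):

```python
# Chunks simulate the shape of streamed ChatCompletionChunk deltas.
chunks = [
    {"choices": [{"delta": {"content": "This "}}]},
    {"choices": [{"delta": {"content": "is "}}]},
    {"choices": [{"delta": {"content": "a test"}}]},
    {"choices": [{"delta": {}}]},  # the final chunk carries no content
]

# Combine all deltas into the full reply text.
full_reply = "".join(
    chunk["choices"][0]["delta"].get("content") or ""
    for chunk in chunks
)
print(full_reply)  # This is a test

# With tiktoken installed, the completion tokens could then be estimated:
#   import tiktoken
#   enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
#   completion_tokens = len(enc.encode(full_reply))
```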

ishanjainn commented 8 months ago

Hey @bcordo, for the streaming responses: beyond token usage, the library doesn't yet support streaming, since it needs somewhat different processing. We have it tracked as an open issue, #13. I'll see what can be done for it.

For token calculation when streaming, yeah, tiktoken is the way to go (although I don't think it's 100% accurate either, it gives a pretty good estimate).

ishanjainn commented 8 months ago

Also reopened this issue while we get streaming implemented here. Thanks!

blooser commented 2 months ago

Guys, are you sure this async support is working?

I've got this error:

Traceback (most recent call last):
  File "/app/speakaura/ai/analyze.py", line 48, in analyze_feedback
    response = await ai.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/grafana_openai_monitoring/chat_v2.py", line 65, in async_wrapper
    response.usage.prompt_tokens,
    ^^^^^^^^^^^^^^
AttributeError: 'coroutine' object has no attribute 'usage'

Shouldn't this async_wrapper use await?


I think it should be:

async def async_wrapper(*args, **kwargs):
    start_time = time.time()
    response = await func(*args, **kwargs)
    end_time = time.time()
    duration = end_time - start_time
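For what it's worth, that suggestion behaves as expected in a self-contained sketch (here `fake_create` is a hypothetical stand-in for AsyncCompletions.create, and `monitor` is a simplified version of the wrapper, not the library's actual code):

```python
import asyncio
import time

def monitor(func):
    """Sketch of an await-aware wrapper, per the suggestion above."""
    async def async_wrapper(*args, **kwargs):
        start_time = time.time()
        response = await func(*args, **kwargs)  # awaited, so we get the real response
        duration = time.time() - start_time
        return response, duration
    return async_wrapper

async def fake_create(**kwargs):
    """Stand-in for AsyncCompletions.create."""
    await asyncio.sleep(0)
    return {"usage": {"prompt_tokens": 3}}

response, duration = asyncio.run(monitor(fake_create)(model="gpt-3.5-turbo"))
print(response["usage"]["prompt_tokens"])  # 3
```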
olivernaaris commented 1 month ago

Traceback (most recent call last):
  File "/app/speakaura/ai/analyze.py", line 48, in analyze_feedback
    response = await ai.chat.completions.create(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/grafana_openai_monitoring/chat_v2.py", line 65, in async_wrapper
    response.usage.prompt_tokens,
    ^^^^^^^^^^^^^^
AttributeError: 'coroutine' object has no attribute 'usage'

I'm having the same issue