different-ai / embedbase

A dead-simple API to build LLM-powered apps
https://docs.embedbase.xyz
MIT License

[Core]: tiktoken stackoverflow #110

Open louis030195 opened 1 year ago

louis030195 commented 1 year ago

System Info

.

Reproduction

Send an extremely long string (e.g. length 1000000000000000000000000) to the `add` endpoint. The tokenizer panics:

pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: StackOverflow

at .encode ( [/usr/local/lib/python3.10/site-packages/tiktoken/core.py:120](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Ftiktoken%2Fcore.py&line=120&project=embedbase) )
at .is_too_big ( [/usr/local/lib/python3.10/site-packages/embedbase/embedding/openai.py:72](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fembedbase%2Fembedding%2Fopenai.py&line=72&project=embedbase) )
at .add ( [/usr/local/lib/python3.10/site-packages/embedbase/app.py:140](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fembedbase%2Fapp.py&line=140&project=embedbase) )
at .run_endpoint_function ( [/usr/local/lib/python3.10/site-packages/fastapi/routing.py:163](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Ffastapi%2Frouting.py&line=163&project=embedbase) )
at .app ( [/usr/local/lib/python3.10/site-packages/fastapi/routing.py:237](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Ffastapi%2Frouting.py&line=237&project=embedbase) )
at ._sentry_app ( [/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/fastapi.py:130](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fsentry_sdk%2Fintegrations%2Ffastapi.py&line=130&project=embedbase) )
at .app ( [/usr/local/lib/python3.10/site-packages/starlette/routing.py:66](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fstarlette%2Frouting.py&line=66&project=embedbase) )
at .handle ( [/usr/local/lib/python3.10/site-packages/starlette/routing.py:276](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fstarlette%2Frouting.py&line=276&project=embedbase) )
at .__call__ ( [/usr/local/lib/python3.10/site-packages/starlette/routing.py:718](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fstarlette%2Frouting.py&line=718&project=embedbase) )
at .__call__ ( [/usr/local/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py:18](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Ffastapi%2Fmiddleware%2Fasyncexitstack.py&line=18&project=embedbase) )
at ._create_span_call ( [/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/starlette.py:130](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fsentry_sdk%2Fintegrations%2Fstarlette.py&line=130&project=embedbase) )
at .__call__ ( [/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py:68](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fstarlette%2Fmiddleware%2Fexceptions.py&line=68&project=embedbase) )
at ._create_span_call ( [/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/starlette.py:130](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fsentry_sdk%2Fintegrations%2Fstarlette.py&line=130&project=embedbase) )
at ._sentry_exceptionmiddleware_call ( [/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/starlette.py:229](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fsentry_sdk%2Fintegrations%2Fstarlette.py&line=229&project=embedbase) )
at .coro ( [/usr/local/lib/python3.10/site-packages/starlette/middleware/base.py:70](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fstarlette%2Fmiddleware%2Fbase.py&line=70&project=embedbase) )
at .__aexit__ ( [/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py:662](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fanyio%2F_backends%2F_asyncio.py&line=662&project=embedbase) )
at .__call__ ( [/usr/local/lib/python3.10/site-packages/starlette/middleware/base.py:106](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fstarlette%2Fmiddleware%2Fbase.py&line=106&project=embedbase) )
at ._create_span_call ( [/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/starlette.py:130](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fsentry_sdk%2Fintegrations%2Fstarlette.py&line=130&project=embedbase) )
at .__call__ ( [/usr/local/lib/python3.10/site-packages/starlette/middleware/cors.py:83](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fstarlette%2Fmiddleware%2Fcors.py&line=83&project=embedbase) )
at ._create_span_call ( [/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/starlette.py:130](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fsentry_sdk%2Fintegrations%2Fstarlette.py&line=130&project=embedbase) )
at .__call__ ( [/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py:162](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fstarlette%2Fmiddleware%2Ferrors.py&line=162&project=embedbase) )
at ._create_span_call ( [/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/starlette.py:130](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fsentry_sdk%2Fintegrations%2Fstarlette.py&line=130&project=embedbase) )
at .__call__ ( [/usr/local/lib/python3.10/site-packages/starlette/applications.py:122](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fstarlette%2Fapplications.py&line=122&project=embedbase) )
at ._run_app ( [/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/asgi.py:183](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fsentry_sdk%2Fintegrations%2Fasgi.py&line=183&project=embedbase) )
at ._run_asgi3 ( [/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/asgi.py:139](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fsentry_sdk%2Fintegrations%2Fasgi.py&line=139&project=embedbase) )
at ._sentry_patched_asgi_app ( [/usr/local/lib/python3.10/site-packages/sentry_sdk/integrations/starlette.py:335](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fsentry_sdk%2Fintegrations%2Fstarlette.py&line=335&project=embedbase) )
at .__call__ ( [/usr/local/lib/python3.10/site-packages/fastapi/applications.py:276](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Ffastapi%2Fapplications.py&line=276&project=embedbase) )
at .__call__ ( [/usr/local/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py:78](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fuvicorn%2Fmiddleware%2Fproxy_headers.py&line=78&project=embedbase) )
at .run_asgi ( [/usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py:435](https://console.cloud.google.com/debug?referrer=fromlog&file=%2Fusr%2Flocal%2Flib%2Fpython3.10%2Fsite-packages%2Fuvicorn%2Fprotocols%2Fhttp%2Fhttptools_impl.py&line=435&project=embedbase) )

Expected behavior

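Simple fix: check `len(s)` before counting tokens. If `len(s) > 10000`, treat the string as too big right away; only count tokens for shorter strings (no need to count tokens if the string is incredibly long already).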
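
A minimal sketch of that short-circuit, not the actual code in `embedbase/embedding/openai.py`. The function signature, the injected `encode` callable, and the `MAX_TOKENS` value are assumptions made for illustration; only the 10000-character figure comes from this issue.

```python
from typing import Callable, Sequence

# Illustrative limits. 10000 is the figure suggested in this issue;
# MAX_TOKENS is an assumed embedding-model limit, adjust to the model in use.
MAX_CHARS = 10_000
MAX_TOKENS = 8_191


def is_too_big(
    text: str,
    encode: Callable[[str], Sequence[int]],
    max_chars: int = MAX_CHARS,
    max_tokens: int = MAX_TOKENS,
) -> bool:
    """Return True if `text` should be rejected before embedding."""
    # Cheap length check first: a very long string is rejected without
    # ever being handed to the tokenizer, so the tokenizer cannot panic on it.
    if len(text) > max_chars:
        return True
    # Only reasonably sized strings reach the (more expensive) token count.
    return len(encode(text)) > max_tokens
```

With tiktoken, `encode` could be something like `tiktoken.get_encoding("cl100k_base").encode`. Note that a character cutoff this low also rejects some strings that would fit under the token limit, so the threshold probably needs tuning against the real limit.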