openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License

Fatal Python error: Segmentation fault from `tiktoken/core.py` #161

Closed: yuvalshi0 closed this issue 1 year ago

yuvalshi0 commented 1 year ago

Hello, we have started intermittently getting `Fatal Python error: Segmentation fault`. Here is the full traceback:

Thread 0x00007fdff95fd640 (most recent call first):
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 81 in _worker
  File "/usr/lib/python3.10/threading.py", line 953 in run
  File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x00007fdffc3ff640 (most recent call first):
  File "/usr/lib/python3.10/threading.py", line 320 in wait
  File "/usr/lib/python3.10/queue.py", line 171 in get
  File "/runner/_work/******/******/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 797 in run
  File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap

Current thread 0x00007fe00332e1c0 (most recent call first):
  Garbage-collecting
  File "/runner/_work/******/******/.venv/lib/python3.10/site-packages/tiktoken/core.py", line 50 in __init__
  File "/runner/_work/******/******/.venv/lib/python3.10/site-packages/tiktoken/registry.py", line 63 in get_encoding
  File "/runner/_work/******/******/.venv/lib/python3.10/site-packages/tiktoken/model.py", line 75 in encoding_for_model
  File "/runner/_work/******/******/******/services/******/******/clients/openai.py", line 22 in num_token_for_messages
  File "/runner/_work/******/******/******/services/******/******/clients/openai.py", line 54 in ask_gpt

What could be the issue?

Python: 3.10.6, OS: Ubuntu 22.04 LTS

hauntsaninja commented 1 year ago

Hmm, do you have a way to reproduce this? The Python stacktrace isn't too helpful here. Maybe you could set RUST_BACKTRACE=1 or run under gdb and see if you get a native stacktrace that is more helpful?
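For an intermittent crash like this, one possible starting point for a repro is a small stress harness that repeats the suspect call from several threads and forces a garbage-collection pass after each call, since the traceback shows the crash happening while garbage-collecting. The sketch below is only an illustration: `stress`, its parameters, and the model name in the comment are hypothetical and not part of tiktoken. It could be run with RUST_BACKTRACE=1 set, or under gdb, so that a native stacktrace is captured if it does crash.

```python
import gc
from concurrent.futures import ThreadPoolExecutor


def stress(fn, workers=8, iterations=200):
    """Call fn() `iterations` times across `workers` threads.

    A full garbage collection is forced after every call, because the
    reported crash occurred during a GC pass. Returns the number of
    calls that completed without raising.
    """

    def one_call(_):
        fn()
        gc.collect()
        return 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map re-raises the first exception from any worker, so a
        # normal return means every call finished.
        return sum(pool.map(one_call, range(iterations)))


# Example use against the call chain in the traceback (assumes tiktoken is
# installed; the model name is a placeholder for whatever the service uses):
#
#   import tiktoken
#   stress(lambda: tiktoken.encoding_for_model("gpt-3.5-turbo"))
```

Note that tiktoken caches encodings after the first lookup, so repeated `encoding_for_model` calls may not re-run `Encoding.__init__`. A harness that constructs encodings directly might be needed to exercise that code path repeatedly.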

yuvalshi0 commented 1 year ago

It does not reproduce very often, but I'll try.

hauntsaninja commented 1 year ago

Feel free to re-open if you can get a native stacktrace or a repro.