openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License

ImportError: PyO3 modules may only be initialized once per interpreter process #141

Closed: jackyhzzj closed this issue 9 months ago

jackyhzzj commented 1 year ago

I run an Ubuntu server with Nginx, uWSGI, and Flask. When I import tiktoken, it raises 'ImportError: PyO3 modules may only be initialized once per interpreter process'. This is with Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux.

hauntsaninja commented 1 year ago

Duplicate of #126. Do you have a way for me to reproduce the problem?

jackyhzzj commented 1 year ago

1. Ubuntu 22.04 x64, nginx 1.18.0, Python 3.10.6, uwsgi 2.0.21, Flask 2.3.1, tiktoken 0.4

2. I start a Flask app with uwsgi, and the uwsgi command is configured as a system service. The error occurs when this system service starts. In the Flask app code, the only line added besides the framework code is "import tiktoken".

jackyhzzj commented 1 year ago

The exception only occurs when the code runs as a Flask app. When the same code runs as a standalone script, there is no issue and everything works fine.

Miuler commented 1 year ago

Can you post a very basic code example so that I can replicate it?

Miuler commented 1 year ago

> The exception only occurs when the code runs as a Flask app. When the same code runs as a standalone script, there is no issue and everything works fine.

Does this error occur with pure Flask, independent of uWSGI?

AA-Turner commented 1 year ago

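A simple reproducer is:

import importlib, sys

# Snapshot the modules loaded before tiktoken is imported.
orig_modules = frozenset(sys.modules)
module = importlib.import_module('tiktoken')
# Drop every module the first import added, so the next import starts fresh.
for m in [m for m in sys.modules if m not in orig_modules]:
    sys.modules.pop(m)
# The second import tries to initialize the PyO3 extension module again.
module = importlib.import_module('tiktoken')
# ImportError: PyO3 modules may only be initialized once per interpreter process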

I'm running into this in https://github.com/sphinx-doc/sphinx/issues/11662 (though the abstract problem is with PyO3 / extension modules; this issue just came up in searches).

A

jackyhzzj commented 1 year ago

> > The exception only occurs when the code runs as a Flask app. When the same code runs as a standalone script, there is no issue and everything works fine.

> Does this error occur with pure Flask, independent of uWSGI?

Flask with uWSGI.

mangiucugna commented 1 year ago

@Miuler if you want a way to reproduce the issue, it is now triggered by the gradio fast reload feature, starting with v3.42.0.

hauntsaninja commented 9 months ago

I think this got fixed upstream in PyO3, and tiktoken 0.5.2 includes the updated version of PyO3. Try upgrading, and please re-open if this is still an issue with 0.5.2.

mangiucugna commented 9 months ago

Hi, I installed tiktoken==0.5.2 and tried again to see if the issue https://github.com/gradio-app/gradio/issues/5402 was fixed.

But I get the same error related to PyO3: ImportError: PyO3 modules may only be initialized once per interpreter process

hauntsaninja commented 9 months ago

Hm, so the upstream change I pulled in was https://github.com/PyO3/pyo3/pull/3446. Note that this does fix at least one manifestation of this issue:

import sys, tiktoken
del sys.modules["tiktoken._tiktoken"]
import tiktoken._tiktoken

This errors on tiktoken 0.5.1 but not 0.5.2. Could you confirm that that is the case for you as well?
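To repeat that check in a fresh interpreter each time, something like the following could work. It is only a sketch: `reimport_status` and its return strings are hypothetical names, not part of tiktoken or PyO3, and it assumes the extension module is named `tiktoken._tiktoken` as in the snippet above.

```python
import subprocess
import sys

# Code run in a child interpreter: import a module, drop it from
# sys.modules, then import it again (the same steps as the check above).
SNIPPET = """
import importlib, sys
name = sys.argv[1]
try:
    importlib.import_module(name)
except ImportError:
    sys.exit(2)  # the module could not be imported at all
del sys.modules[name]
importlib.import_module(name)  # an uncaught ImportError here exits with 1
"""


def reimport_status(module_name):
    """Hypothetical helper: report whether module_name survives a re-import.

    Returns "ok", "missing" (first import failed), or "reimport-failed".
    """
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET, module_name],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return "ok"
    if result.returncode == 2:
        return "missing"
    return "reimport-failed"


if __name__ == "__main__":
    # Assumed module name, taken from the snippet in this thread.
    print(reimport_status("tiktoken._tiktoken"))
```

Running the check in a subprocess keeps the half-removed module out of the calling process, so the script can be rerun after each upgrade or downgrade.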

hauntsaninja commented 9 months ago

Oh, also note that it looks like PyO3 might not have fixed this for Python 3.8 and older, although the error message in that case does specifically mention Python 3.8. The error string you posted in your last comment literally no longer appears anywhere in PyO3 v0.20.0.
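A quick way to check both conditions mentioned so far (tiktoken at 0.5.2 or newer, and Python newer than 3.8) might look like the sketch below. The helper names `version_tuple` and `environment_report` are made up for illustration, and the version parsing is deliberately naive: it only reads the leading numeric components.

```python
import re
import sys
from importlib import metadata


def version_tuple(version):
    """Turn '0.5.2' into (0, 5, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def environment_report(package="tiktoken", minimum=(0, 5, 2)):
    """Hypothetical helper: summarize whether the fix could apply here."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None
    return {
        "installed": installed,
        "package_new_enough": installed is not None
        and version_tuple(installed) >= minimum,
        # Per the comment above, the fix may not cover Python 3.8 and older.
        "python_new_enough": sys.version_info >= (3, 9),
    }


if __name__ == "__main__":
    print(environment_report())
```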

mangiucugna commented 9 months ago

You are right that testing tiktoken alone does not yield any error, so it's probably some other imported module (I suspect orjson) that has not upgraded to the latest PyO3.
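To narrow down which import actually raises, one option is to catch the ImportError and look at the innermost Python frame in its traceback. This is a rough sketch: `find_failing_import` is a hypothetical helper, and `my_app` stands in for whatever top-level module the application imports. When a native extension fails during initialization, the innermost Python frame should be the file containing the `import` statement that pulled it in.

```python
import importlib
import traceback


def find_failing_import(module_name):
    """Hypothetical helper: locate where an ImportError was raised.

    Returns None if the import succeeds, otherwise a tuple of
    (filename, line number, source line, error message).
    """
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        frames = traceback.extract_tb(exc.__traceback__)
        # Skip importlib's own machinery so only user and library files remain.
        frames = [f for f in frames if "importlib" not in f.filename]
        innermost = frames[-1]
        return (innermost.filename, innermost.lineno, innermost.line, str(exc))
    return None


if __name__ == "__main__":
    # "my_app" is a placeholder for the application's top-level module.
    print(find_failing_import("my_app"))
```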