openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License

Running import tiktoken results in ImportError #194

Closed carlsmedstad closed 11 months ago

carlsmedstad commented 11 months ago

Hi :wave:

I maintain the Arch User Repository package for tiktoken. I'm trying to incorporate the tests in the package build, but I cannot get them to pass:

$ PYTHONPATH=/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/build/lib.linux-x86_64-cpython-311 python -m pytest
========================================= test session starts =========================================
platform linux -- Python 3.11.5, pytest-7.4.2, pluggy-1.3.0
rootdir: /tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1
plugins: mock-3.11.1, hypothesis-6.84.3, mypy-0.10.3
collected 0 items / 5 errors

=============================================== ERRORS ================================================
_______________________________ ERROR collecting tests/test_encoding.py _______________________________
ImportError while importing test module '/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tests/test_encoding.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_encoding.py:9: in <module>
    import tiktoken
tiktoken/__init__.py:2: in <module>
    from .core import Encoding as Encoding
tiktoken/core.py:9: in <module>
    from tiktoken import _tiktoken
E   ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)
_______________________________ ERROR collecting tests/test_helpers.py ________________________________
ImportError while importing test module '/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tests/test_helpers.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_helpers.py:7: in <module>
    import tiktoken
tiktoken/__init__.py:2: in <module>
    from .core import Encoding as Encoding
tiktoken/core.py:9: in <module>
    from tiktoken import _tiktoken
E   ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)
_________________________________ ERROR collecting tests/test_misc.py _________________________________
ImportError while importing test module '/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tests/test_misc.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_misc.py:4: in <module>
    import tiktoken
tiktoken/__init__.py:2: in <module>
    from .core import Encoding as Encoding
tiktoken/core.py:9: in <module>
    from tiktoken import _tiktoken
E   ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)
_______________________________ ERROR collecting tests/test_offsets.py ________________________________
ImportError while importing test module '/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tests/test_offsets.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_offsets.py:7: in <module>
    import tiktoken
tiktoken/__init__.py:2: in <module>
    from .core import Encoding as Encoding
tiktoken/core.py:9: in <module>
    from tiktoken import _tiktoken
E   ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)
____________________________ ERROR collecting tests/test_simple_public.py _____________________________
ImportError while importing test module '/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tests/test_simple_public.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_simple_public.py:4: in <module>
    import tiktoken
tiktoken/__init__.py:2: in <module>
    from .core import Encoding as Encoding
tiktoken/core.py:9: in <module>
    from tiktoken import _tiktoken
E   ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)
======================================= short test summary info =======================================
ERROR tests/test_encoding.py
ERROR tests/test_helpers.py
ERROR tests/test_misc.py
ERROR tests/test_offsets.py
ERROR tests/test_simple_public.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 5 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
========================================== 5 errors in 0.52s ==========================================
$ PYTHONPATH=/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/build/lib.linux-x86_64-cpython-311 pytest
========================================= test session starts =========================================
platform linux -- Python 3.11.5, pytest-7.4.2, pluggy-1.3.0
rootdir: /tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1
plugins: mock-3.11.1, hypothesis-6.84.3, mypy-0.10.3
collected 0 items / 5 errors

=============================================== ERRORS ================================================
_______________________________ ERROR collecting tests/test_encoding.py _______________________________
ImportError while importing test module '/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tests/test_encoding.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_encoding.py:9: in <module>
    import tiktoken
tiktoken/__init__.py:2: in <module>
    from .core import Encoding as Encoding
tiktoken/core.py:9: in <module>
    from tiktoken import _tiktoken
E   ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)
_______________________________ ERROR collecting tests/test_helpers.py ________________________________
ImportError while importing test module '/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tests/test_helpers.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_helpers.py:7: in <module>
    import tiktoken
tiktoken/__init__.py:2: in <module>
    from .core import Encoding as Encoding
tiktoken/core.py:9: in <module>
    from tiktoken import _tiktoken
E   ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)
_________________________________ ERROR collecting tests/test_misc.py _________________________________
ImportError while importing test module '/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tests/test_misc.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_misc.py:4: in <module>
    import tiktoken
tiktoken/__init__.py:2: in <module>
    from .core import Encoding as Encoding
tiktoken/core.py:9: in <module>
    from tiktoken import _tiktoken
E   ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)
_______________________________ ERROR collecting tests/test_offsets.py ________________________________
ImportError while importing test module '/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tests/test_offsets.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_offsets.py:7: in <module>
    import tiktoken
tiktoken/__init__.py:2: in <module>
    from .core import Encoding as Encoding
tiktoken/core.py:9: in <module>
    from tiktoken import _tiktoken
E   ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)
____________________________ ERROR collecting tests/test_simple_public.py _____________________________
ImportError while importing test module '/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tests/test_simple_public.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_simple_public.py:4: in <module>
    import tiktoken
tiktoken/__init__.py:2: in <module>
    from .core import Encoding as Encoding
tiktoken/core.py:9: in <module>
    from tiktoken import _tiktoken
E   ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)
======================================= short test summary info =======================================
ERROR tests/test_encoding.py
ERROR tests/test_helpers.py
ERROR tests/test_misc.py
ERROR tests/test_offsets.py
ERROR tests/test_simple_public.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 5 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
========================================== 5 errors in 0.51s ==========================================

Even just running import tiktoken in the interpreter produces the same error:

$ PYTHONPATH=/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/build/lib.linux-x86_64-cpython-311 python
Python 3.11.5 (main, Sep  2 2023, 14:16:33) [GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tiktoken
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py", line 2, in <module>
    from .core import Encoding as Encoding
  File "/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/core.py", line 9, in <module>
    from tiktoken import _tiktoken
ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken' (most likely due to a circular import) (/tmp/makepkg/python-tiktoken/src/tiktoken-0.5.1/tiktoken/__init__.py)

Help very much appreciated, thanks!
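
The tracebacks above show tiktoken/__init__.py being loaded from the source tree rather than from the build directory on PYTHONPATH. A minimal sketch for checking which copy of a package Python would pick up, without importing it, is shown below. The helper name locate_package is hypothetical and not part of tiktoken; it only uses importlib.util.find_spec from the standard library and assumes the compiled extension sits directly inside the package directory.

```python
import importlib.util
from pathlib import Path


def locate_package(name: str, extension: str):
    """Return (package_dir, has_extension) for a top-level package.

    The package itself is not imported, so this also works when importing
    it would fail. `extension` is the bare name of a submodule expected in
    the package directory (for example "_tiktoken").
    """
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        # Not found on sys.path, or not a package.
        return None, False
    pkg_dir = Path(next(iter(spec.submodule_search_locations)))
    # Matches _tiktoken.py as well as _tiktoken.cpython-311-x86_64-linux-gnu.so
    has_ext = any(pkg_dir.glob(f"{extension}.*"))
    return pkg_dir, has_ext
```

Calling locate_package("tiktoken", "_tiktoken") from the source directory would be expected to report the in-tree tiktoken directory with no extension present, which matches the ImportError above.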

carlsmedstad commented 11 months ago

Solved the issue by removing the Python source directories from the working tree and pointing PYTHONPATH at the built package:

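check() {
  cd "$_archive"

  # Remove the in-tree sources so Python cannot import them from the current directory
  rm -r tiktoken tiktoken.egg-info tiktoken_ext

  # Version tag used in the build directory name, e.g. "311" for Python 3.11
  local python_version
  python_version=$(python -c 'import sys; print("".join(map(str, sys.version_info[:2])))')

  # Run the tests against the built package, which includes the compiled extension
  export PYTHONPATH="build/lib.linux-$CARCH-cpython-$python_version"
  echo "$PYTHONPATH"
  python -m pytest
}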

Let me know if there is a better solution. Thanks!
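
The build directory name in check() is assembled by hand from CARCH and the Python version. As a rough alternative, the same name can be derived inside Python. This is a sketch under the assumption that the installed setuptools names the directory lib.<platform>-<cache_tag>, as the path in this thread suggests; older setuptools releases used a different scheme, so the result should be checked against the actual build output. The function name build_lib_dir is hypothetical.

```python
import sys
import sysconfig
from pathlib import Path


def build_lib_dir(build_base: str = "build") -> Path:
    """Guess the setuptools build directory for the running interpreter.

    Assumption: the directory is named "lib.<platform>-<cache_tag>",
    e.g. "lib.linux-x86_64-cpython-311" on x86_64 Linux with CPython 3.11.
    """
    platform_tag = sysconfig.get_platform()      # e.g. "linux-x86_64"
    cache_tag = sys.implementation.cache_tag     # e.g. "cpython-311"
    return Path(build_base) / f"lib.{platform_tag}-{cache_tag}"


if __name__ == "__main__":
    print(build_lib_dir())
```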

hauntsaninja commented 11 months ago

Yeah, you want to make sure you're running tests against the installed version, not the source files in the current directory. The tiktoken directory in the source tree does not contain the compiled _tiktoken extension, so importing it from there fails. Your solution works. Another option is to change the default pytest behaviour using the --import-mode flag, like so: https://github.com/openai/tiktoken/blob/39f29cecdb6fc38d9a3434e5dd15e4de58cf3c80/pyproject.toml#L40
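
To illustrate the shadowing described above, here is a small self-contained sketch. It does not use tiktoken at all: demo_pkg and _ext are made-up stand-ins for the package and its compiled extension. Two copies of the package are created, one without the extension module (like the source tree) and one with it (like the build directory), and the import is attempted with each copy first on sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_pkg(root: Path, with_ext: bool) -> None:
    """Create root/demo_pkg; only the 'built' copy gets the _ext module."""
    pkg = root / "demo_pkg"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text("from demo_pkg import _ext\n")
    if with_ext:
        (pkg / "_ext.py").write_text("VALUE = 42\n")


def try_import(first: Path, second: Path) -> str:
    """Import demo_pkg with `first` ahead of `second` on sys.path."""
    saved_path = list(sys.path)
    sys.path[:0] = [str(first), str(second)]
    try:
        importlib.invalidate_caches()
        importlib.import_module("demo_pkg")
        return "ok"
    except ImportError as exc:
        return f"ImportError: {exc.msg}"
    finally:
        # Restore global state so each attempt starts clean.
        sys.path[:] = saved_path
        for mod in [m for m in sys.modules
                    if m == "demo_pkg" or m.startswith("demo_pkg.")]:
            del sys.modules[mod]


def demo() -> tuple[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        source, built = Path(tmp) / "source", Path(tmp) / "built"
        make_pkg(source, with_ext=False)
        make_pkg(built, with_ext=True)
        # Source tree first fails; build directory first succeeds.
        return try_import(source, built), try_import(built, source)


if __name__ == "__main__":
    for outcome in demo():
        print(outcome)
```

The first directory on sys.path that contains the package wins, even if a later one holds a complete copy. That is why removing the in-tree sources, or keeping the source directory from being placed ahead of the built package on sys.path, both avoid the error.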