openai / tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License

Can't install tiktoken in Python 3.12 #205

Closed pamelafox closed 11 months ago

pamelafox commented 1 year ago

We are trying to install tiktoken in Python 3.12, but get an error:

Collecting tiktoken
  Using cached tiktoken-0.5.1.tar.gz (32 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting regex>=2022.1.18 (from tiktoken)
  Obtaining dependency information for regex>=2022.1.18 from https://files.pythonhosted.org/packages/38/a4/645e381727142609772a37c50d2f4b0316bbfa40a6e5b1ad27f8493767f4/regex-2023.10.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata
  Downloading regex-2023.10.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (40 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.9/40.9 kB 3.2 MB/s eta 0:00:00
Collecting requests>=2.26.0 (from tiktoken)
  Obtaining dependency information for requests>=2.26.0 from https://files.pythonhosted.org/packages/70/8e/0e2d847013cb52cd35b38c009bb167a1a26b2ce6cd6965bf26b47bc0bf44/requests-2.31.0-py3-none-any.whl.metadata
  Using cached requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting charset-normalizer<4,>=2 (from requests>=2.26.0->tiktoken)
  Obtaining dependency information for charset-normalizer<4,>=2 from https://files.pythonhosted.org/packages/a8/97/3c26f65a6bfb16cc3d66c973e966516f54fa5f6e512e20e2da1a99b7c480/charset_normalizer-3.3.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata
  Using cached charset_normalizer-3.3.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (32 kB)
Collecting idna<4,>=2.5 (from requests>=2.26.0->tiktoken)
  Using cached idna-3.4-py3-none-any.whl (61 kB)
Collecting urllib3<3,>=1.21.1 (from requests>=2.26.0->tiktoken)
  Obtaining dependency information for urllib3<3,>=1.21.1 from https://files.pythonhosted.org/packages/26/40/9957270221b6d3e9a3b92fdfba80dd5c9661ff45a664b47edd5d00f707f5/urllib3-2.0.6-py3-none-any.whl.metadata
  Downloading urllib3-2.0.6-py3-none-any.whl.metadata (6.6 kB)
Collecting certifi>=2017.4.17 (from requests>=2.26.0->tiktoken)
  Obtaining dependency information for certifi>=2017.4.17 from https://files.pythonhosted.org/packages/4c/dd/2234eab22353ffc7d94e8d13177aaa050113286e93e7b40eae01fbf7c3d9/certifi-2023.7.22-py3-none-any.whl.metadata
  Using cached certifi-2023.7.22-py3-none-any.whl.metadata (2.2 kB)
Downloading regex-2023.10.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (786 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 786.0/786.0 kB 7.5 MB/s eta 0:00:00
Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Using cached certifi-2023.7.22-py3-none-any.whl (158 kB)
Using cached charset_normalizer-3.3.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (134 kB)
Downloading urllib3-2.0.6-py3-none-any.whl (123 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 123.8/123.8 kB 39.2 MB/s eta 0:00:00
Building wheels for collected packages: tiktoken
  Building wheel for tiktoken (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for tiktoken (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [38 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-aarch64-cpython-312
      creating build/lib.linux-aarch64-cpython-312/tiktoken
      copying tiktoken/core.py -> build/lib.linux-aarch64-cpython-312/tiktoken
      copying tiktoken/__init__.py -> build/lib.linux-aarch64-cpython-312/tiktoken
      copying tiktoken/registry.py -> build/lib.linux-aarch64-cpython-312/tiktoken
      copying tiktoken/model.py -> build/lib.linux-aarch64-cpython-312/tiktoken
      copying tiktoken/load.py -> build/lib.linux-aarch64-cpython-312/tiktoken
      copying tiktoken/_educational.py -> build/lib.linux-aarch64-cpython-312/tiktoken
      creating build/lib.linux-aarch64-cpython-312/tiktoken_ext
      copying tiktoken_ext/openai_public.py -> build/lib.linux-aarch64-cpython-312/tiktoken_ext
      running egg_info
      writing tiktoken.egg-info/PKG-INFO
      writing dependency_links to tiktoken.egg-info/dependency_links.txt
      writing requirements to tiktoken.egg-info/requires.txt
      writing top-level names to tiktoken.egg-info/top_level.txt
      reading manifest file 'tiktoken.egg-info/SOURCES.txt'
      reading manifest template 'MANIFEST.in'
      warning: no files found matching 'Makefile'
      adding license file 'LICENSE'
      writing manifest file 'tiktoken.egg-info/SOURCES.txt'
      copying tiktoken/py.typed -> build/lib.linux-aarch64-cpython-312/tiktoken
      running build_ext
      running build_rust
      error: can't find Rust compiler

Can 3.12 wheels be released?

JoshJarabek7 commented 1 year ago

@pamelafox It works fine on Python 3.12 if you have a Rust compiler.

If you're building it in Docker, make sure you add it to your path.

RUN apt-get update && apt-get -y install curl build-essential gcc make && curl https://sh.rustup.rs -sSf | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"

junoriosity commented 1 year ago

@JoshJarabek7 Many thanks for your input. However, if I use Python 3.11 it works like a charm.

Why is it suddenly causing trouble?

JoshJarabek7 commented 1 year ago

@junoriosity Not sure; it could be a few things. Most likely it's because Python 3.12 no longer ships setuptools by default (and removed distutils from the standard library), and setuptools is what helps build Python packages. You can still install setuptools via pip before installing tiktoken if you want to try that. You can also try making pip install from a prebuilt binary wheel instead of building from source.

Another reason could be that tiktoken's core is written in Rust (not sure whether it always was), so when no prebuilt wheel matches your Python version, pip has to compile it and needs a Rust compiler.
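
To make the "install via binary" idea concrete, here is a minimal sketch that builds a pip command restricted to prebuilt wheels. The helper name `binary_only_install_cmd` is hypothetical; the `--only-binary=:all:` option is a standard pip flag. With it, pip fails fast when no matching wheel exists instead of falling back to a source build that needs Rust.

```python
import sys


def binary_only_install_cmd(package: str) -> list[str]:
    """Build a pip command that refuses to build anything from source.

    Hypothetical helper for illustration, not part of pip or tiktoken.
    """
    # --only-binary=:all: makes pip accept wheels only, for every package.
    return [sys.executable, "-m", "pip", "install", "--only-binary=:all:", package]


# Print the command so it can be copied into a shell or a Dockerfile RUN line.
print(" ".join(binary_only_install_cmd("tiktoken")))
```

If this command fails with a "no matching distribution" style error, that suggests no wheel has been published for your Python version and platform.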

junoriosity commented 1 year ago

@JoshJarabek7 Essentially, my Dockerfile looks like this:

FROM python:3.12-alpine

RUN pip install --upgrade pip && pip install setuptools==69.0.1
RUN pip install tiktoken==0.5.1

and I get the error:

4.957 Building wheels for collected packages: tiktoken
4.958   Building wheel for tiktoken (pyproject.toml): started
5.269   Building wheel for tiktoken (pyproject.toml): finished with status 'error'
5.277   error: subprocess-exited-with-error
5.277
5.277   × Building wheel for tiktoken (pyproject.toml) did not run successfully.
5.277   │ exit code: 1
5.277   ╰─> [38 lines of output]
5.277       running bdist_wheel
5.277       running build
5.277       running build_py
5.277       creating build
5.277       creating build/lib.linux-x86_64-cpython-312
5.277       creating build/lib.linux-x86_64-cpython-312/tiktoken
5.277       copying tiktoken/__init__.py -> build/lib.linux-x86_64-cpython-312/tiktoken
5.277       copying tiktoken/model.py -> build/lib.linux-x86_64-cpython-312/tiktoken
5.277       copying tiktoken/registry.py -> build/lib.linux-x86_64-cpython-312/tiktoken
5.277       copying tiktoken/load.py -> build/lib.linux-x86_64-cpython-312/tiktoken
5.277       copying tiktoken/_educational.py -> build/lib.linux-x86_64-cpython-312/tiktoken
5.277       copying tiktoken/core.py -> build/lib.linux-x86_64-cpython-312/tiktoken
5.277       creating build/lib.linux-x86_64-cpython-312/tiktoken_ext
5.277       copying tiktoken_ext/openai_public.py -> build/lib.linux-x86_64-cpython-312/tiktoken_ext
5.277       running egg_info
5.277       writing tiktoken.egg-info/PKG-INFO
5.277       writing dependency_links to tiktoken.egg-info/dependency_links.txt
5.277       writing requirements to tiktoken.egg-info/requires.txt
5.277       writing top-level names to tiktoken.egg-info/top_level.txt
5.277       reading manifest file 'tiktoken.egg-info/SOURCES.txt'
5.277       reading manifest template 'MANIFEST.in'
5.277       warning: no files found matching 'Makefile'
5.277       adding license file 'LICENSE'
5.277       writing manifest file 'tiktoken.egg-info/SOURCES.txt'
5.277       copying tiktoken/py.typed -> build/lib.linux-x86_64-cpython-312/tiktoken
5.277       running build_ext
5.277       running build_rust
5.277       error: can't find Rust compiler
5.277
5.277       If you are using an outdated pip version, it is possible a prebuilt wheel is available for this package but pip is not able to install from it. Installing from the wheel would avoid the need for a Rust compiler.
5.277
5.277       To update pip, run:
5.277
5.277           pip install --upgrade pip
5.277
5.277       and then retry package installation.
5.277
5.277       If you did intend to build this package from source, try installing a Rust compiler from your system package manager and ensure it is on the PATH during installation. Alternatively, rustup (available at https://rustup.rs) is the recommended way to download and update the Rust compiler toolchain.
5.277       [end of output]
5.277
5.277   note: This error originates from a subprocess, and is likely not a problem with pip.
5.278   ERROR: Failed building wheel for tiktoken
5.278 Failed to build tiktoken
5.278 ERROR: Could not build wheels for tiktoken, which is required to install pyproject.toml-based projects
------
failed to solve: process "/bin/sh -c pip install tiktoken==0.5.1" did not complete successfully: exit code: 1

Perhaps you have an idea how to overcome this. 😊
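
The pip output above asks for a Rust compiler on the PATH when building from source. A quick pre-flight check could look like the sketch below. The function names are hypothetical, and checking for `rustc` and `cargo` is an assumption about which tools the build looks for.

```python
import shutil
from typing import Callable, Dict, Optional


def find_rust_tools(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Return the resolved path of rustc and cargo, or None for a missing tool."""
    # Assumption: these are the two executables a Rust extension build needs.
    return {tool: which(tool) for tool in ("rustc", "cargo")}


def rust_available(which: Callable[[str], Optional[str]] = shutil.which) -> bool:
    """True only when both tools can be found on the PATH."""
    return all(find_rust_tools(which).values())


# Report what is visible on the PATH of the current environment.
print(find_rust_tools())
```

Running this inside the image before `pip install` shows whether the toolchain is visible to the build step.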

pamelafox commented 1 year ago

@JoshJarabek7 Thanks for the suggestion! We are using it in an open source sample for developers on many systems (Mac/Linux/Windows), so we can't assume they'll have a Rust compiler. We eagerly await formal support via a built wheel.

hauntsaninja commented 11 months ago

tiktoken 0.5.2 ships Python 3.12 wheels :-)
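
For projects that pin tiktoken, a small check like the following sketch can flag pins that predate the 3.12 wheels. The helper names are hypothetical, the version parsing only handles plain dotted releases such as "0.5.2", and it assumes every release from 0.5.2 onward keeps shipping Python 3.12 wheels.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '0.5.2' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def ships_py312_wheel(tiktoken_version: str) -> bool:
    """True when the pinned release is 0.5.2 or newer.

    Assumption: releases after 0.5.2 continue to publish Python 3.12 wheels.
    """
    return parse_release(tiktoken_version) >= (0, 5, 2)


# The pin from the Dockerfile earlier in this thread versus the fixed release.
for pin in ("0.5.1", "0.5.2"):
    print(pin, ships_py312_wheel(pin))
```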

bigf625oot commented 4 months ago

Python 3.12.4 with tiktoken 7.0: error