huggingface / tokenizers

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
https://huggingface.co/docs/tokenizers
Apache License 2.0

Can not get package to build with Python 3.11 on a minimal linux environment #1092

Closed ZetiMente closed 7 months ago

ZetiMente commented 1 year ago

Command ['/usr/local/bin/python3.11', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--prefix', '/usr/local', '--no-deps', '/root/.cache/pypoetry/artifacts/fa/e3/33/3b69b842d7e79e49fdc7b2401a33069e42457437ececfa7b68b3eea989/tokenizers-0.13.1.tar.gz'] errored with the following return code 1, and output:
Processing /root/.cache/pypoetry/artifacts/fa/e3/33/3b69b842d7e79e49fdc7b2401a33069e42457437ececfa7b68b3eea989/tokenizers-0.13.1.tar.gz
Installing build dependencies: started
Installing build dependencies: finished with status 'done'
Getting requirements to build wheel: started
Getting requirements to build wheel: finished with status 'done'
Preparing metadata (pyproject.toml): started
Preparing metadata (pyproject.toml): finished with status 'done'
Building wheels for collected packages: tokenizers
Building wheel for tokenizers (pyproject.toml): started
Building wheel for tokenizers (pyproject.toml): finished with status 'error'
error: subprocess-exited-with-error

× Building wheel for tokenizers (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [393 lines of output]
    running bdist_wheel
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-cpython-311
    creating build/lib.linux-x86_64-cpython-311/tokenizers
    copying py_src/tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers
    creating build/lib.linux-x86_64-cpython-311/tokenizers/models
    copying py_src/tokenizers/models/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/models
    creating build/lib.linux-x86_64-cpython-311/tokenizers/decoders
    copying py_src/tokenizers/decoders/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/decoders
    creating build/lib.linux-x86_64-cpython-311/tokenizers/normalizers
    copying py_src/tokenizers/normalizers/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/normalizers
    creating build/lib.linux-x86_64-cpython-311/tokenizers/pre_tokenizers
    copying py_src/tokenizers/pre_tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/pre_tokenizers
    creating build/lib.linux-x86_64-cpython-311/tokenizers/processors
    copying py_src/tokenizers/processors/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/processors
    creating build/lib.linux-x86_64-cpython-311/tokenizers/trainers
    copying py_src/tokenizers/trainers/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/trainers
    creating build/lib.linux-x86_64-cpython-311/tokenizers/implementations
    copying py_src/tokenizers/implementations/sentencepiece_bpe.py -> build/lib.linux-x86_64-cpython-311/tokenizers/implementations
    copying py_src/tokenizers/implementations/byte_level_bpe.py -> build/lib.linux-x86_64-cpython-311/tokenizers/implementations
    copying py_src/tokenizers/implementations/base_tokenizer.py -> build/lib.linux-x86_64-cpython-311/tokenizers/implementations
    copying py_src/tokenizers/implementations/bert_wordpiece.py -> build/lib.linux-x86_64-cpython-311/tokenizers/implementations
    copying py_src/tokenizers/implementations/char_level_bpe.py -> build/lib.linux-x86_64-cpython-311/tokenizers/implementations
    copying py_src/tokenizers/implementations/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/implementations
    copying py_src/tokenizers/implementations/sentencepiece_unigram.py -> build/lib.linux-x86_64-cpython-311/tokenizers/implementations
    creating build/lib.linux-x86_64-cpython-311/tokenizers/tools
    copying py_src/tokenizers/tools/visualizer.py -> build/lib.linux-x86_64-cpython-311/tokenizers/tools
    copying py_src/tokenizers/tools/__init__.py -> build/lib.linux-x86_64-cpython-311/tokenizers/tools
    copying py_src/tokenizers/__init__.pyi -> build/lib.linux-x86_64-cpython-311/tokenizers
    copying py_src/tokenizers/models/__init__.pyi -> build/lib.linux-x86_64-cpython-311/tokenizers/models
    copying py_src/tokenizers/decoders/__init__.pyi -> build/lib.linux-x86_64-cpython-311/tokenizers/decoders
    copying py_src/tokenizers/normalizers/__init__.pyi -> build/lib.linux-x86_64-cpython-311/tokenizers/normalizers
    copying py_src/tokenizers/pre_tokenizers/__init__.pyi -> build/lib.linux-x86_64-cpython-311/tokenizers/pre_tokenizers
    copying py_src/tokenizers/processors/__init__.pyi -> build/lib.linux-x86_64-cpython-311/tokenizers/processors
    copying py_src/tokenizers/trainers/__init__.pyi -> build/lib.linux-x86_64-cpython-311/tokenizers/trainers
    copying py_src/tokenizers/tools/visualizer-styles.css -> build/lib.linux-x86_64-cpython-311/tokenizers/tools
    warning: build_py: byte-compiling is disabled, skipping.

    running build_ext
    running build_rust
    cargo rustc --lib --message-format=json-render-diagnostics --manifest-path Cargo.toml --release -v --features pyo3/extension-module -- --crate-type cdylib

      ------

          Running `rustc --crate-name encoding_rs --edition=2018 /root/.cargo/registry/src/github.com-1ecc6299db9ec823/encoding_rs-0.8.31/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no --cfg 'feature="alloc"' --cfg 'feature="default"' -C metadata=6ca73ca347d55663 -C extra-filename=-6ca73ca347d55663 --out-dir /tmp/pip-req-build-6vmh7pr5/target/release/deps -L dependency=/tmp/pip-req-build-6vmh7pr5/target/release/deps --extern cfg_if=/tmp/pip-req-build-6vmh7pr5/target/release/deps/libcfg_if-9311e72e94017864.rmeta --cap-lints allow`
       Compiling http-body v0.4.5
         Running `rustc --crate-name http_body --edition=2018 /root/.cargo/registry/src/github.com-1ecc6299db9ec823/http-body-0.4.5/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no -C metadata=96df02d169766c04 -C extra-filename=-96df02d169766c04 --out-dir /tmp/pip-req-build-6vmh7pr5/target/release/deps -L dependency=/tmp/pip-req-build-6vmh7pr5/target/release/deps --extern bytes=/tmp/pip-req-build-6vmh7pr5/target/release/deps/libbytes-041203b2d8753cce.rmeta --extern http=/tmp/pip-req-build-6vmh7pr5/target/release/deps/libhttp-e7dac65fd54fec91.rmeta --extern pin_project_lite=/tmp/pip-req-build-6vmh7pr5/target/release/deps/libpin_project_lite-bf8a1ba2e0bf0954.rmeta --cap-lints allow`
       Compiling unicode-normalization v0.1.22
         Running `rustc --crate-name unicode_normalization --edition=2018 /root/.cargo/registry/src/github.com-1ecc6299db9ec823/unicode-normalization-0.1.22/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no --cfg 'feature="default"' --cfg 'feature="std"' -C metadata=62485f9fd6eed65b -C extra-filename=-62485f9fd6eed65b --out-dir /tmp/pip-req-build-6vmh7pr5/target/release/deps -L dependency=/tmp/pip-req-build-6vmh7pr5/target/release/deps --extern tinyvec=/tmp/pip-req-build-6vmh7pr5/target/release/deps/libtinyvec-4b01e18d265893ee.rmeta --cap-lints allow`
         Running `/tmp/pip-req-build-6vmh7pr5/target/release/build/bzip2-sys-066642be35791478/build-script-build`
         Running `/tmp/pip-req-build-6vmh7pr5/target/release/build/openssl-sys-75d16a23de504308/build-script-main`
         Running `/tmp/pip-req-build-6vmh7pr5/target/release/build/esaxx-rs-e94f912bbcd204c5/build-script-build`
    error: failed to run custom build command for `openssl-sys v0.9.77`

    Caused by:
      process didn't exit successfully: `/tmp/pip-req-build-6vmh7pr5/target/release/build/openssl-sys-75d16a23de504308/build-script-main` (exit status: 101)
      --- stdout
      cargo:rustc-cfg=const_fn
      cargo:rustc-cfg=openssl
      cargo:rerun-if-env-changed=X86_64_UNKNOWN_LINUX_GNU_OPENSSL_LIB_DIR
      X86_64_UNKNOWN_LINUX_GNU_OPENSSL_LIB_DIR unset
      cargo:rerun-if-env-changed=OPENSSL_LIB_DIR
      OPENSSL_LIB_DIR unset
      cargo:rerun-if-env-changed=X86_64_UNKNOWN_LINUX_GNU_OPENSSL_INCLUDE_DIR
      X86_64_UNKNOWN_LINUX_GNU_OPENSSL_INCLUDE_DIR unset
      cargo:rerun-if-env-changed=OPENSSL_INCLUDE_DIR
      OPENSSL_INCLUDE_DIR unset
      cargo:rerun-if-env-changed=X86_64_UNKNOWN_LINUX_GNU_OPENSSL_DIR
      X86_64_UNKNOWN_LINUX_GNU_OPENSSL_DIR unset
      cargo:rerun-if-env-changed=OPENSSL_DIR
      OPENSSL_DIR unset
      cargo:rerun-if-env-changed=OPENSSL_NO_PKG_CONFIG
      cargo:rerun-if-env-changed=PKG_CONFIG_x86_64-unknown-linux-gnu
      cargo:rerun-if-env-changed=PKG_CONFIG_x86_64_unknown_linux_gnu
      cargo:rerun-if-env-changed=HOST_PKG_CONFIG
      cargo:rerun-if-env-changed=PKG_CONFIG
      cargo:rerun-if-env-changed=OPENSSL_STATIC
      cargo:rerun-if-env-changed=OPENSSL_DYNAMIC
      cargo:rerun-if-env-changed=PKG_CONFIG_ALL_STATIC
      cargo:rerun-if-env-changed=PKG_CONFIG_ALL_DYNAMIC
      cargo:rerun-if-env-changed=PKG_CONFIG_PATH_x86_64-unknown-linux-gnu
      cargo:rerun-if-env-changed=PKG_CONFIG_PATH_x86_64_unknown_linux_gnu
      cargo:rerun-if-env-changed=HOST_PKG_CONFIG_PATH
      cargo:rerun-if-env-changed=PKG_CONFIG_PATH
      cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR_x86_64-unknown-linux-gnu
      cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR_x86_64_unknown_linux_gnu
      cargo:rerun-if-env-changed=HOST_PKG_CONFIG_LIBDIR
      cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR
      cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR_x86_64-unknown-linux-gnu
      cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR_x86_64_unknown_linux_gnu
      cargo:rerun-if-env-changed=HOST_PKG_CONFIG_SYSROOT_DIR
      cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR
      run pkg_config fail: "`\"pkg-config\" \"--libs\" \"--cflags\" \"openssl\"` did not exit successfully: exit status: 1\nerror: could not find system library 'openssl' required by the 'openssl-sys' crate\n\n--- stderr\nPackage openssl was not found in the pkg-config search path.\nPerhaps you should add the directory containing `openssl.pc'\nto the PKG_CONFIG_PATH environment variable\nNo package 'openssl' found\n"

      --- stderr
      thread 'main' panicked at '

      Could not find directory of OpenSSL installation, and this `-sys` crate cannot
      proceed without this knowledge. If OpenSSL is installed and this crate had
      trouble finding it,  you can set the `OPENSSL_DIR` environment variable for the
      compilation process.

      Make sure you also have the development packages of openssl installed.
      For example, `libssl-dev` on Ubuntu or `openssl-devel` on Fedora.

      If you're in a situation where you think the directory *should* be found
      automatically, please open a bug at https://github.com/sfackler/rust-openssl
      and include information about your system as well as this message.

      $HOST = x86_64-unknown-linux-gnu
      $TARGET = x86_64-unknown-linux-gnu
      openssl-sys = 0.9.77

      ', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/openssl-sys-0.9.77/build/find_normal.rs:191:5
      note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    warning: build failed, waiting for other jobs to finish...
    error: `cargo rustc --lib --message-format=json-render-diagnostics --manifest-path Cargo.toml --release -v --features pyo3/extension-module -- --crate-type cdylib` failed with code 101
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for tokenizers

Failed to build tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects

ZetiMente commented 1 year ago

Sorry, I'm confused: doesn't maturin build the wheels for Python-Rust packages? Why is it trying to compile Rust? Shouldn't that already be done?
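
For context on the question above: the log's `Processing ... tokenizers-0.13.1.tar.gz` line shows that pip was handed a source archive rather than a prebuilt wheel, so the Rust extension has to be compiled locally. Presumably no wheel matching CPython 3.11 was available for this release at the time, but that is an assumption the thread does not confirm. The sketch below only classifies an artifact by its filename; `describe_artifact` is a hypothetical helper, and the wheel filename in the usage lines is illustrative, not taken from the log.

```python
def describe_artifact(filename: str) -> dict:
    """Classify a Python distribution file by its name alone."""
    if filename.endswith(".whl"):
        # Wheel names follow: name-version[-build]-pythontag-abitag-platformtag.whl
        parts = filename[: -len(".whl")].split("-")
        python_tag, abi_tag, platform_tag = parts[-3:]
        return {
            "kind": "wheel",
            "needs_local_build": False,  # already built, pip just unpacks it
            "python_tag": python_tag,
            "abi_tag": abi_tag,
            "platform_tag": platform_tag,
        }
    if filename.endswith((".tar.gz", ".zip")):
        # A source distribution: pip must build a wheel from it on this machine.
        return {"kind": "sdist", "needs_local_build": True}
    return {"kind": "unknown", "needs_local_build": None}


if __name__ == "__main__":
    # Filename from the log in this issue.
    print(describe_artifact("tokenizers-0.13.1.tar.gz"))
    # Illustrative wheel filename (hypothetical, not from the log).
    print(describe_artifact(
        "tokenizers-0.13.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
    ))
```

If the artifact in the Poetry cache ends in `.tar.gz`, a working Rust toolchain and the system libraries that the crates link against are needed at install time.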

ZetiMente commented 1 year ago

It says I don't have openssl-sys v0.9.77, but:

root@80c5977f7fec:/home/mind/dreambooth# apt install openssl
Reading package lists... Done
Building dependency tree
Reading state information... Done
openssl is already the newest version (1.1.1n-0+deb10u3).
0 upgraded, 0 newly installed, 0 to remove and 4 not upgraded.
root@80c5977f7fec:/home/# apt install openssl-sys
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package openssl-sys
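
Note that `openssl-sys` is a Rust crate that Cargo downloads, not an apt package. According to the log, its build script failed because `pkg-config --libs --cflags openssl` could not find `openssl.pc`, and the panic message points at the OpenSSL development package (`libssl-dev` on Ubuntu). The check below is a minimal sketch that asks pkg-config the same kind of question using its `--exists` flag; `openssl_dev_files_visible` is a hypothetical helper name, and a `False` result can also mean that `pkg-config` itself is not installed.

```python
import shutil
import subprocess


def openssl_dev_files_visible(package: str = "openssl") -> bool:
    """Return True if pkg-config can locate the .pc file for `package`."""
    if shutil.which("pkg-config") is None:
        # pkg-config is missing, so the build script's probe would fail too.
        return False
    result = subprocess.run(
        ["pkg-config", "--exists", package],
        capture_output=True,
    )
    # Exit status 0 means the package metadata was found.
    return result.returncode == 0


if __name__ == "__main__":
    if openssl_dev_files_visible():
        print("pkg-config can find openssl")
    else:
        print("pkg-config cannot find openssl; the development headers "
              "(e.g. libssl-dev) or pkg-config itself may be missing")
```

If this reports that OpenSSL is not visible even though the `openssl` binary is installed, that matches the log: the runtime package alone does not ship the headers and `openssl.pc` that the crate's build script looks for.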

Narsil commented 1 year ago

Hi @ZetiMente,

Can you try installing from source and see if that works? https://huggingface.co/docs/tokenizers/installation#installation-from-sources

github-actions[bot] commented 7 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.