huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell` --> tokenizers-lib/src/models/bpe/trainer.rs:517:47 #29576

Closed: dbl001 closed this issue 5 months ago

dbl001 commented 8 months ago

System Info

2024-03-11 01:14:30.782590: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-11 01:14:30.782649: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-11 01:14:30.784014: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-03-11 01:14:31.954016: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
WARNING:tensorflow:From /usr/local/lib/python3.10/dist-packages/transformers/commands/env.py:100: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2024-03-11 01:14:34.928846: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
CUDA backend failed to initialize: Found cuBLAS version 120103, but JAX was built against version 120205, which is newer. The copy of cuBLAS that is installed must be at least as new as the version against which JAX was built. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

- `transformers` version: 4.38.2
- Platform: Linux-6.1.58+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.20.3
- Safetensors version: 0.4.2
- Accelerate version: not installed
- Accelerate config: not found
- PyTorch version (GPU?): 2.1.0+cu121 (True)
- Tensorflow version (GPU?): 2.15.0 (True)
- Flax version (CPU?/GPU?/TPU?): 0.8.1 (cpu)
- Jax version: 0.4.23
- JaxLib version: 0.4.23
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

Who can help?

No response

Information

Tasks

Reproduction

On Google Colab, trying to install transformers 4.0.6, I had Rust 1.76.0 installed, which produced: error: casting &T to &mut T. I then tried Rust 1.72.0, which was supposed to be less strict about this.

!pip install --upgrade transformers==4.06 --verbose
warning: `#[macro_use]` only has an effect on `extern crate` and modules
    --> tokenizers-lib/src/utils/mod.rs:24:1
     |
  24 | #[macro_use]
     | ^^^^^^^^^^^^
     |
     = note: `#[warn(unused_attributes)]` on by default

  warning: `#[macro_use]` only has an effect on `extern crate` and modules
    --> tokenizers-lib/src/utils/mod.rs:35:1
     |
  35 | #[macro_use]
     | ^^^^^^^^^^^^

  warning: variable does not need to be mutable
     --> tokenizers-lib/src/models/unigram/model.rs:280:21
      |
  280 |                 let mut target_node = &mut best_path_ends_at[key_pos];
      |                     ----^^^^^^^^^^^
      |                     |
      |                     help: remove this `mut`
      |
      = note: `#[warn(unused_mut)]` on by default

  warning: variable does not need to be mutable
     --> tokenizers-lib/src/models/unigram/model.rs:297:21
      |
  297 |                 let mut target_node = &mut best_path_ends_at[starts_at + mblen];
      |                     ----^^^^^^^^^^^
      |                     |
      |                     help: remove this `mut`

  warning: variable does not need to be mutable
     --> tokenizers-lib/src/pre_tokenizers/byte_level.rs:175:59
      |
  175 |     encoding.process_tokens_with_offsets_mut(|(i, (token, mut offsets))| {
      |                                                           ----^^^^^^^
      |                                                           |
      |                                                           help: remove this `mut`

  warning: fields `bos_id` and `eos_id` are never read
    --> tokenizers-lib/src/models/unigram/lattice.rs:59:5
     |
  53 | pub struct Lattice<'a> {
     |            ------- fields in this struct
  ...
  59 |     bos_id: usize,
     |     ^^^^^^
  60 |     eos_id: usize,
     |     ^^^^^^
     |
     = note: `Lattice` has a derived impl for the trait `Debug`, but this is intentionally ignored during dead code analysis
     = note: `#[warn(dead_code)]` on by default

  error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell`
     --> tokenizers-lib/src/models/bpe/trainer.rs:517:47
      |
  513 |                     let w = &words[*i] as *const _ as *mut _;
      |                             -------------------------------- casting happend here
  ...
  517 |                         let word: &mut Word = &mut (*w);
      |                                               ^^^^^^^^^
      |
      = note: for more information, visit <https://doc.rust-lang.org/book/ch15-05-interior-mutability.html>
      = note: `#[deny(invalid_reference_casting)]` on by default

  warning: `tokenizers` (lib) generated 6 warnings
  error: could not compile `tokenizers` (lib) due to 1 previous error; 6 warnings emitted

  Caused by:
    process didn't exit successfully: `/root/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/rustc --crate-name tokenizers --edition=2018 tokenizers-lib/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no --cfg 'feature="default"' --cfg 'feature="indicatif"' --cfg 'feature="progressbar"' -C metadata=b4902f315560f1ee -C extra-filename=-b4902f315560f1ee --out-dir /tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps -L dependency=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps --extern clap=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libclap-2ed1bc4e1f137d6a.rmeta --extern derive_builder=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libderive_builder-927868a0edb8a08b.so --extern esaxx_rs=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libesaxx_rs-00367ded6e9df21a.rmeta --extern indicatif=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libindicatif-893b81a84fee081a.rmeta --extern itertools=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libitertools-051f3c77bf3684bc.rmeta --extern lazy_static=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/liblazy_static-df89fd9b4b197d62.rmeta --extern log=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/liblog-db5663930c6645cc.rmeta --extern onig=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libonig-ab094d5df50c1ae3.rmeta --extern rand=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/librand-57abfece9e5d7a1e.rmeta --extern 
rayon=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/librayon-29cb179ffa5164fd.rmeta --extern rayon_cond=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/librayon_cond-f3b239ca8b442c66.rmeta --extern regex=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libregex-48a23c12665b1ac6.rmeta --extern regex_syntax=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libregex_syntax-ace402a25abfd585.rmeta --extern serde=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libserde-00a50b461a53bfab.rmeta --extern serde_json=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libserde_json-fef87182d967f2a8.rmeta --extern spm_precompiled=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libspm_precompiled-af1cd270a9f7042e.rmeta --extern unicode_normalization_alignments=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libunicode_normalization_alignments-a1711ea2b5cfdc20.rmeta --extern unicode_segmentation=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libunicode_segmentation-0df53fbf44393ad7.rmeta --extern unicode_categories=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/deps/libunicode_categories-7c6fabd07afa2a56.rmeta -L native=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/build/esaxx-rs-17f45370f913980e/out -L native=/tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256/target/release/build/onig_sys-1b013bbbe8847e4a/out` (exit status: 1)
  warning: build failed, waiting for other jobs to finish...
  error: `cargo rustc --lib --message-format=json-render-diagnostics --manifest-path Cargo.toml --release -v --features pyo3/extension-module --crate-type cdylib --` failed with code 101
  error: subprocess-exited-with-error

  × Building wheel for tokenizers (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> See above for output.

  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /usr/bin/python3 /usr/local/lib/python3.10/dist-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmp_wjb9r9d
  cwd: /tmp/pip-install-rotcz5nj/tokenizers_f211137d6c704baa977bfe0569424256
  Building wheel for tokenizers (pyproject.toml) ... error
  ERROR: Failed building wheel for tokenizers
Failed to build tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects

I tried with different versions of Rust. Rust 1.72.0 is supposed to work.

!rustup toolchain install 1.72.0
!rustup default 1.72.0
!rustc --version

Expected behavior

Install transformers v4.06 on Google Colab.
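For background on the error itself: the old tokenizers source casts a shared reference to a mutable one, which is undefined behavior and which the invalid_reference_casting lint (deny-by-default since Rust 1.73) now rejects outright. A minimal sketch of the compiler-suggested alternative, interior mutability via UnsafeCell; the `Word`/`bump` names here are illustrative, not the actual tokenizers types:

```rust
use std::cell::UnsafeCell;

// Rejected since Rust 1.73 (deny-by-default `invalid_reference_casting`),
// and undefined behavior on every Rust version:
//     let w = &words[*i] as *const _ as *mut _;
//     let word: &mut Word = &mut (*w);
//
// The sanctioned way to mutate through a shared reference is interior
// mutability. `Word` here is illustrative, not the tokenizers type.
struct Word(UnsafeCell<u32>);

fn bump(w: &Word) {
    // SAFETY: caller must guarantee no other reference into the cell is
    // live during this write; trivially true in this single-threaded sketch.
    unsafe { *w.0.get() += 1 };
}

fn main() {
    let w = Word(UnsafeCell::new(41));
    bump(&w); // mutation through &Word, no &mut required
    assert_eq!(unsafe { *w.0.get() }, 42);
}
```

Since this is fixed in later tokenizers releases, the practical workaround in this thread is installing a transformers/tokenizers combination recent enough to ship prebuilt wheels, so the Rust build never runs at all.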

amyeroberts commented 8 months ago

Hi @dbl001, thanks for raising!

What's the reason for trying to install v4.6 of transformers? It's almost 3 years old. I'd advise using the most recent version: pip install -U transformers

Note: the version you'd want to install is 4.6.1; we don't put a 0 in front of version components below 10, and 4.6.1 includes a patch release.

dbl001 commented 8 months ago

I'm trying to run: https://www.lesswrong.com/posts/AcKRB8wDpdaN6v6ru/interpreting-gpt-the-logit-lens https://colab.research.google.com/drive/1MjdfK2srcerLrAJDRaJQKO0sUiZ-hQtA?usp=sharing#scrollTo=11X1oaLHxKMQ

Which calls for:

%pip install git+https://github.com/finetuneanon/transformers/@gpt-neo-localattention --verbose
amyeroberts commented 8 months ago

@dbl001 The pip install call there is installing a version of transformers which is a fork. If there's installation issues, you should raise it on that repo https://github.com/finetuneanon/transformers/

dbl001 commented 8 months ago

That's true, but it happens here as well:

!pip install --upgrade transformers==4.06 --verbose
amyeroberts commented 8 months ago

@dbl001 Have you tried with a more recent transformers version? I'd try installing both the most recent transformers and tokenizers.

dbl001 commented 8 months ago

Yes,

!pip install transformers
Collecting transformers
  Downloading transformers-4.38.2-py3-none-any.whl (8.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.5/8.5 MB 66.2 MB/s eta 0:00:00
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers) (3.13.1)
Requirement already satisfied: huggingface-hub<1.0,>=0.19.3 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.20.3)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (1.25.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers) (23.2)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (6.0.1)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (2023.12.25)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers) (2.31.0)
Requirement already satisfied: tokenizers<0.19,>=0.14 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.15.2)
Requirement already satisfied: safetensors>=0.4.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.4.2)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers) (4.66.2)
Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.19.3->transformers) (2023.6.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.19.3->transformers) (4.10.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2024.2.2)
Installing collected packages: transformers
Successfully installed transformers-4.38.2

!pip show transformers
Name: transformers
Version: 4.38.2
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /usr/local/lib/python3.10/dist-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by:
dbl001 commented 8 months ago

This is another reason why I tried to install a prior version:

model = transformers.AutoModelForCausalLM.from_pretrained('gpt2')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-10-d27f03d33268>](https://localhost:8080/#) in <cell line: 1>()
----> 1 model = transformers.AutoModelForCausalLM.from_pretrained('gpt2')

3 frames
[/usr/local/lib/python3.10/dist-packages/transformer_utils/util/tfm_utils.py](https://localhost:8080/#) in get_local_path_from_huggingface_cdn(key, filename)
     25 
     26 def get_local_path_from_huggingface_cdn(key, filename):
---> 27     archive_file = transformers.file_utils.hf_bucket_url(
     28         key,
     29         filename=filename,

AttributeError: module 'transformers.file_utils' has no attribute 'hf_bucket_url'

It seems the project is just too old.

amyeroberts commented 8 months ago

@ther-0319 Not sure what you mean by 'the project', but yes, whatever is trying to import hf_bucket_url from transformers is out of date. hf_bucket_url was removed in https://github.com/huggingface/transformers/pull/18497 ~1.5 years ago.

github-actions[bot] commented 7 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

voxoid0 commented 7 months ago

I'm having the same problem today. Not sure what to do. Tried installing the latest transformers manually also.

amyeroberts commented 7 months ago

@voxoid0 Could you provide a minimal code reproducer and information about the running environment?

brand17 commented 7 months ago

I am getting this error after pip install 'transformers[tf-cpu]' on Windows 10. I installed Rust x64.


      error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell`
         --> tokenizers-lib\src\models\bpe\trainer.rs:526:47
          |
      522 |                     let w = &words[*i] as *const _ as *mut _;
          |                             -------------------------------- casting happend here
      ...
      526 |                         let word: &mut Word = &mut (*w);
          |                                               ^^^^^^^^^
          |
          = note: for more information, visit <https://doc.rust-lang.org/book/ch15-05-interior-mutability.html>
          = note: `#[deny(invalid_reference_casting)]` on by default

      warning: `tokenizers` (lib) generated 3 warnings
      error: could not compile `tokenizers` (lib) due to 1 previous error; 3 warnings emitted     
      Caused by:
        process didn't exit successfully: `rustc --crate-name tokenizers --edition=2018 tokenizers-lib\src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C embed-bitcode=no --cfg "feature=\"cached-path\"" --cfg "feature=\"clap\"" --cfg "feature=\"cli\"" --cfg "feature=\"default\"" --cfg "feature=\"http\"" --cfg "feature=\"indicatif\"" --cfg "feature=\"progressbar\"" --cfg "feature=\"reqwest\"" -C metadata=f78603c61a345010 -C extra-filename=-f78603c61a345010 --out-dir C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps -L dependency=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps --extern aho_corasick=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libaho_corasick-2e6901a7f66c9729.rmeta --extern cached_path=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libcached_path-6c9c0b34d6e14b68.rmeta --extern clap=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libclap-4a51d8c515ac8ae2.rmeta --extern derive_builder=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\derive_builder-299cb140b79546da.dll --extern dirs=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libdirs-ee2f64e9937341ba.rmeta --extern esaxx_rs=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libesaxx_rs-74ffd34952c5221d.rmeta --extern 
indicatif=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libindicatif-63758582f029b551.rmeta --extern itertools=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libitertools-94d15f64998aca8b.rmeta --extern lazy_static=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\liblazy_static-5f9444fdb29d44b7.rmeta --extern log=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\liblog-d84d806982d24cc6.rmeta --extern macro_rules_attribute=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libmacro_rules_attribute-a521ac5efa524edc.rmeta --extern onig=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libonig-d6f17307cd48595c.rmeta --extern paste=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\paste-a29194707c74bce2.dll --extern rand=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\librand-63b3ae7e14d3c967.rmeta --extern rayon=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\librayon-eebc8b783646e934.rmeta --extern rayon_cond=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\librayon_cond-fef04d7b1eb6361d.rmeta --extern regex=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libregex-168971063707a96f.rmeta --extern 
regex_syntax=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libregex_syntax-cdbf23afbbe7ff91.rmeta --extern reqwest=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libreqwest-59aa5484c48418bc.rmeta --extern serde=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libserde-427adfef4d6aa36a.rmeta --extern serde_json=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libserde_json-3a4a381d8ad944f7.rmeta --extern spm_precompiled=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libspm_precompiled-110c8e2fbaf991fb.rmeta --extern thiserror=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libthiserror-ddc194dcf4ff03d2.rmeta --extern unicode_normalization_alignments=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libunicode_normalization_alignments-3f364afba5614e5b.rmeta --extern unicode_segmentation=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libunicode_segmentation-bf2354f63ed11988.rmeta --extern unicode_categories=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\deps\libunicode_categories-93a00f926fbabe0c.rmeta -L native=C:\Users\brand17\.cargo\registry\src\index.crates.io-6f17d22bba15001f\windows_x86_64_msvc-0.52.5\lib -L native=C:\Users\brand17\.cargo\registry\src\index.crates.io-6f17d22bba15001f\windows_x86_64_msvc-0.48.5\lib -L "native=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\atlmfc\lib\x64" 
-L native=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\build\bzip2-sys-cf42efa254fcae54\out\lib -L "native=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\atlmfc\lib\x64" -L native=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\build\zstd-sys-4a283d60bc586915\out -L "native=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\atlmfc\lib\x64" -L native=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\build\esaxx-rs-ea1668614ff983cd\out -L "native=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\atlmfc\lib\x64" -L native=C:\Users\brand17\AppData\Local\Temp\pip-install-y0fjj6wa\tokenizers_2db86e04c1fa4183838a80e6696d7401\target\release\build\onig_sys-eab5e426da51abba\out` (exit code: 1)  
      warning: build failed, waiting for other jobs to finish...
      error: `cargo rustc --lib --message-format=json-render-diagnostics --manifest-path Cargo.toml --release -v --features pyo3/extension-module --crate-type cdylib --` failed with code 101
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.       
  ERROR: Failed building wheel for tokenizers
Failed to build tokenizers
ERROR: Could not build wheels for tokenizers, which is required to install pyproject.toml-based projects
amyeroberts commented 6 months ago

Hi @brand17, thanks for sharing. This looks like a Rust/tokenizers issue, so not something we'd be able to address in transformers. Have you tried installing tokenizers directly: pip install tokenizers?

brand17 commented 6 months ago

Hi, @amyeroberts. If I install tokenizers and then pip install 'transformers[tf-cpu]', I get the same error. I also tried installing tokenizers, transformers, tf-keras, and tensorflow_probability, but ran into another bug.

pip install transformers[torch] works fine.

amyeroberts commented 6 months ago

cc @Rocketknight1

Rocketknight1 commented 6 months ago

transformers[tf-cpu] should add the following packages versus a base transformers install: ["tensorflow-cpu", "onnxconverter-common", "tf2onnx", "tensorflow-text", "keras-nlp"]

I can't figure this one out at all, though - none of those packages should introduce tokenizers dependencies that would break compilation. If I had to guess, I'd point my finger at the ONNX dependencies, but I couldn't reproduce the issue on a Windows x64 machine here!

prmbittencourt commented 6 months ago

Today I tried launching the Stable Diffusion Web UI and got the same error as it updated dependencies on launch. On EndeavourOS, fully updated. SD was working perfectly yesterday. I tried replacing the rustup package from the Arch repos with the rust package, and finally installed rustup directly from rustup.rs, but it didn't help.

EDIT: I solved the problem by using Python 3.10 instead of 3.12.

github-actions[bot] commented 5 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

dbl001 commented 4 months ago

I got this error building transformers from source.

% git clone https://github.com/huggingface/transformers
% git status      
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
    examples/tensorflow/text-classification/data_to_predict.json

nothing added to commit but untracked files present (use "git add" to track)
% pip install -e .
...
     error: casting `&T` to `&mut T` is undefined behavior, even if the reference is unused, consider instead using an `UnsafeCell`
         --> tokenizers-lib/src/models/bpe/trainer.rs:517:47
          |
      513 |                     let w = &words[*i] as *const _ as *mut _;
          |                             -------------------------------- casting happend here
      ...
      517 |                         let word: &mut Word = &mut (*w);
          |                                               ^^^^^^^^^
          |
          = note: for more information, visit <https://doc.rust-lang.org/book/ch15-05-interior-mutability.html>
          = note: `#[deny(invalid_reference_casting)]` on by default
themanyone commented 4 months ago

EDIT: I solved the problem by using Python 3.10 instead of 3.12.

Good idea. stable-diffusion-webui can be forced past that step with Python 3.12 if you bump the transformers requirement to transformers>=4.30.2 in requirements.txt and requirements_versions.txt. But then it runs into other problems parsing the model .yaml files because of the updated tokenizers. stable-diffusion-webui needs more updates to work with Python 3.12.