huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Error creating pre-trained FluxPipeline with diffusers "Cannot instantiate this tokenizer from a slow version" #9063

Open pcgeek86 opened 2 months ago

pcgeek86 commented 2 months ago

Describe the bug

I'm receiving an error when trying to use the diffusers module to run the FLUX model.

ValueError: Cannot instantiate this tokenizer from a slow version. If it's based on sentencepiece, make sure you have sentencepiece installed

https://huggingface.co/black-forest-labs/FLUX.1-schnell

Reproduction

  1. Have a system with an NVIDIA GPU (e.g., a GeForce RTX 2080)
  2. Install Docker Desktop on Windows 11
  3. Run a new PyTorch container: docker run --rm --interactive --tty --gpus=all pytorch/pytorch
  4. Install git package: apt update; apt install git --yes;
  5. Install Python dependencies: pip install transformers accelerate git+https://github.com/huggingface/diffusers.git
  6. Run the code sample from this repository
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # save some VRAM by offloading the model to CPU. Remove this if you have enough GPU power

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    guidance_scale=0.0,
    output_type="pil",
    num_inference_steps=4,
    max_sequence_length=256,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-schnell.png")

Logs

>>> pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)
Loading pipeline components...:  57%|██████████████████████████████████████████████████████████████████████████▊                                                        | 4/7 [00:00<00:00, 16.49it/s]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 876, in from_pretrained
    loaded_sub_model = load_sub_model(
  File "/opt/conda/lib/python3.10/site-packages/diffusers/pipelines/pipeline_loading_utils.py", line 700, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2291, in from_pretrained
    return cls._from_pretrained(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2525, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/t5/tokenization_t5_fast.py", line 119, in __init__
    super().__init__(
  File "/opt/conda/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py", line 106, in __init__
    raise ValueError(
ValueError: Cannot instantiate this tokenizer from a slow version. If it's based on sentencepiece, make sure you have sentencepiece installed.

System Info

Who can help?

No response

wrapss commented 2 months ago

try pip install sentencepiece ?
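
If installing it doesn't help, it's worth confirming the package is importable from the same interpreter that runs the script (a quick check, assuming a standard pip setup):

python -c "import sentencepiece; print(sentencepiece.__version__)"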

pcgeek86 commented 2 months ago

try pip install sentencepiece ?

That was the first thing I did, but it had zero effect.

jbaron34 commented 2 months ago

I have the same issue; I tried both pip install sentencepiece and pip install transformers[sentencepiece], with no change.

yiyixuxu commented 2 months ago

Are you able to run it this way?

import torch
from diffusers import FluxPipeline
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")  # torch_dtype doesn't apply to tokenizers

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", tokenizer=tokenizer, torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # save some VRAM by offloading the model to CPU. Remove this if you have enough GPU power

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    guidance_scale=0.0,
    output_type="pil",
    num_inference_steps=4,
    max_sequence_length=256,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-schnell.png")

jbaron34 commented 2 months ago

That gives me the same error. In fact, the way I have been trying to do it is:

from diffusers import FluxPipeline, AutoencoderKL
from diffusers.image_processor import VaeImageProcessor
from transformers import T5EncoderModel, T5TokenizerFast, CLIPTokenizer, CLIPTextModel
import torch

ckpt_id = "black-forest-labs/FLUX.1-schnell"

denoise_pipeline = FluxPipeline.from_pretrained(
    ckpt_id,
    text_encoder=None,
    text_encoder_2=None,
    tokenizer=None,
    tokenizer_2=None,
    vae=None,
    torch_dtype=torch.bfloat16,
).to("cpu")

prompt_pipeline = FluxPipeline.from_pretrained(
    ckpt_id,
    text_encoder=CLIPTextModel.from_pretrained(ckpt_id, subfolder="text_encoder", torch_dtype=torch.bfloat16),
    text_encoder_2=T5EncoderModel.from_pretrained(ckpt_id, subfolder="text_encoder_2", torch_dtype=torch.bfloat16),
    tokenizer=CLIPTokenizer.from_pretrained(ckpt_id, subfolder="tokenizer"),
    tokenizer_2=T5TokenizerFast.from_pretrained(ckpt_id, subfolder="tokenizer_2"),
    transformer=None,
    vae=None,
).to("cpu")

vae = AutoencoderKL.from_pretrained(ckpt_id, subfolder="vae", torch_dtype=torch.bfloat16).to("cpu")
vae_scale_factor = 2 ** (len(vae.config.block_out_channels))
image_processor = VaeImageProcessor(vae_scale_factor=vae_scale_factor)

and the error is apparently caused by the T5 tokenizer:

You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[6], line 18
      1 ckpt_id = "black-forest-labs/FLUX.1-schnell"
      3 denoise_pipeline = FluxPipeline.from_pretrained(
      4     ckpt_id,
      5     text_encoder=None,
   (...)
     10     torch_dtype=torch.bfloat16,
     11 ).to("cpu")
     13 prompt_pipeline = FluxPipeline.from_pretrained(
     14     ckpt_id,
     15     text_encoder=CLIPTextModel.from_pretrained(ckpt_id, subfolder="text_encoder", torch_dtype=torch.bfloat16),
     16     text_encoder_2=T5EncoderModel.from_pretrained(ckpt_id, subfolder="text_encoder_2", torch_dtype=torch.bfloat16),
     17     tokenizer=CLIPTokenizer.from_pretrained(ckpt_id, subfolder="tokenizer"),
---> 18     tokenizer_2=T5TokenizerFast.from_pretrained(ckpt_id, subfolder="tokenizer_2"),
     19     transformer=None,
     20     vae=None,
     21 ).to("cpu")
     23 vae = AutoencoderKL.from_pretrained(ckpt_id, subfolder="vae", torch_dtype=torch.bfloat16).to("cpu")
     24 vae_scale_factor = 2 ** (len(vae.config.block_out_channels))

File ~/notebooks/.venv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:2291, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, trust_remote_code, *init_inputs, **kwargs)
   2288     else:
   2289         logger.info(f"loading file {file_path} from cache at {resolved_vocab_files[file_id]}")
-> 2291 return cls._from_pretrained(
   2292     resolved_vocab_files,
   2293     pretrained_model_name_or_path,
   2294     init_configuration,
   2295     *init_inputs,
   2296     token=token,
   2297     cache_dir=cache_dir,
   2298     local_files_only=local_files_only,
   2299     _commit_hash=commit_hash,
   2300     _is_local=is_local,
   2301     trust_remote_code=trust_remote_code,
   2302     **kwargs,
   2303 )

File ~/notebooks/.venv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:2525, in PreTrainedTokenizerBase._from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, token, cache_dir, local_files_only, _commit_hash, _is_local, trust_remote_code, *init_inputs, **kwargs)
   2523 # Instantiate the tokenizer.
   2524 try:
-> 2525     tokenizer = cls(*init_inputs, **init_kwargs)
   2526 except OSError:
   2527     raise OSError(
   2528         "Unable to load vocabulary from file. "
   2529         "Please check that the provided vocabulary is accessible and not corrupted."
   2530     )

File ~/notebooks/.venv/lib/python3.10/site-packages/transformers/models/t5/tokenization_t5_fast.py:119, in T5TokenizerFast.__init__(self, vocab_file, tokenizer_file, eos_token, unk_token, pad_token, extra_ids, additional_special_tokens, add_prefix_space, **kwargs)
    114     logger.warning_once(
    115         "You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers"
    116     )
    117     kwargs["from_slow"] = True
--> 119 super().__init__(
    120     vocab_file,
    121     tokenizer_file=tokenizer_file,
    122     eos_token=eos_token,
    123     unk_token=unk_token,
    124     pad_token=pad_token,
    125     extra_ids=extra_ids,
    126     additional_special_tokens=additional_special_tokens,
    127     **kwargs,
    128 )
    130 self.vocab_file = vocab_file
    131 self._extra_ids = extra_ids

File ~/notebooks/.venv/lib/python3.10/site-packages/transformers/tokenization_utils_fast.py:106, in PreTrainedTokenizerFast.__init__(self, *args, **kwargs)
    103 added_tokens_decoder = kwargs.pop("added_tokens_decoder", {})
    105 if from_slow and slow_tokenizer is None and self.slow_tokenizer_class is None:
--> 106     raise ValueError(
    107         "Cannot instantiate this tokenizer from a slow version. If it's based on sentencepiece, make sure you "
    108         "have sentencepiece installed."
    109     )
    111 if tokenizer_object is not None:
    112     fast_tokenizer = copy.deepcopy(tokenizer_object)

ValueError: Cannot instantiate this tokenizer from a slow version. If it's based on sentencepiece, make sure you have sentencepiece installed.
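
The warning at the top points at the cause: tokenizer_2 in this checkpoint apparently sets add_prefix_space, and the traceback shows T5TokenizerFast forcing kwargs["from_slow"] = True in response, i.e. rebuilding the fast tokenizer from the slow SentencePiece-based one, which is the path that requires sentencepiece. The failing step can be reproduced in isolation (a minimal sketch of just that call, outside the pipeline):

from transformers import T5TokenizerFast

# raises the same ValueError when sentencepiece is not importable,
# because the checkpoint's tokenizer config triggers a from-slow conversion
tok2 = T5TokenizerFast.from_pretrained("black-forest-labs/FLUX.1-schnell", subfolder="tokenizer_2")
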
yiyixuxu commented 2 months ago

Do you have sentencepiece in your environment? E.g., does this work?

import sentencepiece

iwaitu commented 2 months ago

pip install protobuf
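
That fits the traceback above: the from-slow conversion path in transformers needs both sentencepiece and protobuf, so installing the pair together is a reasonable first step (assuming pip targets the same interpreter that runs the script):

pip install sentencepiece protobuf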

jbaron34 commented 2 months ago

My issue turned out to be the way I was running Jupyter in a virtualenv. Thanks for your help
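
For anyone hitting the same environment mismatch: a notebook kernel can run a different interpreter than the shell where pip install was executed. A minimal check from inside the notebook (IPython syntax; assumes the packages just need to land in the kernel's environment):

import sys
print(sys.executable)  # the interpreter the kernel actually runs

# install into that same interpreter's environment
!{sys.executable} -m pip install sentencepiece protobuf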

yiyixuxu commented 2 months ago

@pcgeek86 were you also able to resolve the issue?

pcgeek86 commented 2 months ago

@pcgeek86 were you also able to resolve the issue?

I am getting CUDA out-of-memory errors, but I think it's working with your code sample. It still seems like there may be a bug in the original code sample I shared from the link?
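
For the out-of-memory part, diffusers has a more aggressive offload mode than enable_model_cpu_offload that may help on smaller GPUs (a sketch; noticeably slower, but with a much lower peak VRAM footprint):

# trade speed for memory: submodules are moved to the GPU one at a time
pipe.enable_sequential_cpu_offload()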

github-actions[bot] commented 3 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

culda commented 3 weeks ago

My issue turned out to be the way I was running jupyter in a virtualenv. Thanks for your help

@jbaron34 what tokenizer worked in the end? I'm trying to run FLUX.1 in a Hugging Face Space with no luck yet.