kohya-ss / sd-scripts

Apache License 2.0

OSError: Can't load tokenizer for 'google/t5-v1_1-xxl' #1689

Closed — hydra-bu closed this 1 month ago

hydra-bu commented 1 month ago

My command:

accelerate launch \
  --mixed_precision bf16 \
  --num_cpu_threads_per_process 1 \
  sd-scripts/flux_train_network.py \
  --pretrained_model_name_or_path /app/fluxgym/models/unet/flux1-dev.sft \
  --clip_l /app/fluxgym/models/clip/clip_l.safetensors \
  --t5xxl /app/fluxgym/models/clip/t5xxl_fp16.safetensors \
  --ae /app/fluxgym/models/vae/ae.sft \
  --cache_latents_to_disk \
  --save_model_as safetensors \
  --sdpa \
  --persistent_data_loader_workers \
  --max_data_loader_n_workers 2 \
  --seed 42 \
  --gradient_checkpointing \
  --mixed_precision bf16 \
  --save_precision bf16 \
  --network_module networks.lora_flux \
  --network_dim 4 \
  --optimizer_type adafactor \
  --optimizer_args relative_step=False scale_parameter=False warmup_init=False \
  --lr_scheduler constant_with_warmup \
  --max_grad_norm 0.0 \
  --sample_prompts /app/fluxgym/outputs/chinese-girl/sample_prompts.txt \
  --sample_every_n_steps 100 \
  --learning_rate 8e-4 \
  --cache_text_encoder_outputs \
  --cache_text_encoder_outputs_to_disk \
  --fp8_base \
  --highvram \
  --max_train_epochs 16 \
  --save_every_n_epochs 4 \
  --dataset_config /app/fluxgym/outputs/chinese-girl/dataset.toml \
  --output_dir /app/fluxgym/outputs/chinese-girl \
  --output_name chinese-girl \
  --timestep_sampling shift \
  --discrete_flow_shift 3.1582 \
  --model_prediction_type raw \
  --guidance_scale 1 \
  --loss_type l2

Error log:

Traceback (most recent call last):
  File "/app/fluxgym/sd-scripts/flux_train_network.py", line 519, in <module>
    trainer.train(args)
  File "/app/fluxgym/sd-scripts/train_network.py", line 268, in train
    tokenize_strategy = self.get_tokenize_strategy(args)
  File "/app/fluxgym/sd-scripts/flux_train_network.py", line 153, in get_tokenize_strategy
    return strategy_flux.FluxTokenizeStrategy(t5xxl_max_token_length, args.tokenizer_cache_dir)
  File "/app/fluxgym/sd-scripts/library/strategy_flux.py", line 27, in __init__
    self.t5xxl = self._load_tokenizer(T5TokenizerFast, T5_XXL_TOKENIZER_ID, tokenizer_cache_dir=tokenizer_cache_dir)
  File "/app/fluxgym/sd-scripts/library/strategy_base.py", line 65, in _load_tokenizer
    tokenizer = model_class.from_pretrained(model_id, subfolder=subfolder)
  File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 2255, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for 'google/t5-v1_1-xxl'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'google/t5-v1_1-xxl' is the correct path to a directory containing all relevant files for a T5TokenizerFast tokenizer.
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 48, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1106, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 704, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', 'sd-scripts/flux_train_network.py', '--pretrained_model_name_or_path', '/app/fluxgym/models/unet/flux1-dev.sft', '--clip_l', '/app/fluxgym/models/clip/clip_l.safetensors', '--t5xxl', '/app/fluxgym/models/clip/t5xxl_fp16.safetensors', '--ae', '/app/fluxgym/models/vae/ae.sft', '--cache_latents_to_disk', '--save_model_as', 'safetensors', '--sdpa', '--persistent_data_loader_workers', '--max_data_loader_n_workers', '2', '--seed', '42', '--gradient_checkpointing', '--mixed_precision', 'bf16', '--save_precision', 'bf16', '--network_module', 'networks.lora_flux', '--network_dim', '4', '--optimizer_type', 'adafactor', '--optimizer_args', 'relative_step=False', 'scale_parameter=False', 'warmup_init=False', '--lr_scheduler', 'constant_with_warmup', '--max_grad_norm', '0.0', '--sample_prompts', '/app/fluxgym/outputs/chinese-girl/sample_prompts.txt', '--sample_every_n_steps', '100', '--learning_rate', '8e-4', '--cache_text_encoder_outputs', '--cache_text_encoder_outputs_to_disk', '--fp8_base', '--highvram', '--max_train_epochs', '16', '--save_every_n_epochs', '4', '--dataset_config', '/app/fluxgym/outputs/chinese-girl/dataset.toml', '--output_dir', '/app/fluxgym/outputs/chinese-girl', '--output_name', 'chinese-girl', '--timestep_sampling', 'shift', '--discrete_flow_shift', '3.1582', '--model_prediction_type', 'raw', '--guidance_scale', '1', '--loss_type', 'l2']' returned non-zero exit status 1.
kohya-ss commented 1 month ago

An internet connection is required to load the tokenizer.
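
For offline or containerized runs, the script needs the tokenizer files present locally before training starts (the traceback shows `_load_tokenizer` accepts a `tokenizer_cache_dir`, exposed as `--tokenizer_cache_dir`). Below is a minimal pre-flight check, a sketch only: the file names are the artifacts a T5TokenizerFast checkpoint typically ships with, not confirmed against sd-scripts' exact cache layout.

```python
import os

# Files a T5TokenizerFast checkpoint typically includes (assumption;
# verify against the actual google/t5-v1_1-xxl repository contents).
REQUIRED = ["tokenizer_config.json", "special_tokens_map.json", "spiece.model"]

def tokenizer_cached(cache_dir: str) -> bool:
    """Return True if every expected tokenizer file exists in cache_dir."""
    return all(os.path.isfile(os.path.join(cache_dir, name)) for name in REQUIRED)
```

If the check fails, one workaround is to download the tokenizer once on a machine with internet access (e.g. `T5TokenizerFast.from_pretrained("google/t5-v1_1-xxl").save_pretrained(cache_dir)`), copy the directory into the container, and point the trainer at it with `--tokenizer_cache_dir`.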

hydra-bu commented 1 month ago

An internet connection is required to load the tokenizer.

Thank you for your reply. The issue has been resolved; it wasn't due to network connectivity. I was using Docker for deployment, but switching to direct deployment on the host machine fixed the problem.
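
Since the root cause here was the container lacking access to the Hugging Face Hub rather than the script itself, a quick connectivity probe run from inside the container can distinguish the two failure modes. A sketch, assuming the public Hub endpoint on HTTPS:

```python
import socket

def hub_reachable(host: str = "huggingface.co", port: int = 443,
                  timeout: float = 5.0) -> bool:
    """True if a TCP connection to the Hugging Face Hub can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # DNS failure, refused connection, or timeout
        return False
```

If this returns False inside Docker but True on the host, the fix is the container's networking (DNS or proxy configuration), not sd-scripts.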