Closed: 311-code closed this issue 7 months ago
This is unlikely to be a bug. Please post the configuration you're using when you get this error.
Please also add your full error log.
OK, I'll be back in front of my PC later tonight and will do that.
OK, here is the error and the full config. I have a 4090 with 24 GB:
activating venv C:\OneTrainer2\venv
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
C:\OneTrainer2\venv\src\diffusers\src\diffusers\models\lora.py:300: FutureWarning: `LoRACompatibleConv` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleConv` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
deprecate("LoRACompatibleConv", "1.0.0", deprecation_message)
C:\OneTrainer2\venv\src\diffusers\src\diffusers\models\lora.py:384: FutureWarning: `LoRACompatibleLinear` is deprecated and will be removed in version 1.0.0. Use of `LoRACompatibleLinear` is deprecated. Please switch to PEFT backend by installing PEFT: `pip install peft`.
deprecate("LoRACompatibleLinear", "1.0.0", deprecation_message)
Traceback (most recent call last):
File "C:\OneTrainer2\modules\modelLoader\StableDiffusionXLModelLoader.py", line 275, in load
model = self.__load_internal(model_type, weight_dtypes, model_names.base_model, model_names.vae_model)
File "C:\OneTrainer2\modules\modelLoader\StableDiffusionXLModelLoader.py", line 56, in __load_internal
with open(os.path.join(base_model_name, "meta.json"), "r") as meta_file:
FileNotFoundError: [Errno 2] No such file or directory: 'C:/stable-diffusion-webui-master/models/Stable-diffusion/memodel.safetensors\\meta.json'
Traceback (most recent call last):
File "C:\OneTrainer2\modules\modelLoader\StableDiffusionXLModelLoader.py", line 282, in load
model = self.__load_diffusers(model_type, weight_dtypes, model_names.base_model, model_names.vae_model)
File "C:\OneTrainer2\modules\modelLoader\StableDiffusionXLModelLoader.py", line 97, in __load_diffusers
tokenizer_1 = CLIPTokenizer.from_pretrained(
File "C:\OneTrainer2\venv\lib\site-packages\transformers\tokenization_utils_base.py", line 1925, in from_pretrained
raise ValueError(
ValueError: Calling CLIPTokenizer.from_pretrained() with the path to a single file or url is not supported for this tokenizer. Use a model identifier or the path to a directory instead.
Traceback (most recent call last):
File "C:\OneTrainer2\venv\src\diffusers\src\diffusers\configuration_utils.py", line 428, in load_config
config_dict = cls._dict_from_json_file(config_file)
File "C:\OneTrainer2\venv\src\diffusers\src\diffusers\configuration_utils.py", line 550, in _dict_from_json_file
text = reader.read()
File "C:\Users\NewPC\AppData\Local\Programs\Python\Python310\lib\codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 25705: invalid start byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\OneTrainer2\modules\modelLoader\StableDiffusionXLModelLoader.py", line 289, in load
model = self.__load_safetensors(model_type, weight_dtypes, model_names.base_model, model_names.vae_model)
File "C:\OneTrainer2\modules\modelLoader\StableDiffusionXLModelLoader.py", line 237, in __load_safetensors
pipeline.vae = AutoencoderKL.from_pretrained(
File "C:\OneTrainer2\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "C:\OneTrainer2\venv\src\diffusers\src\diffusers\models\modeling_utils.py", line 569, in from_pretrained
config, unused_kwargs, commit_hash = cls.load_config(
File "C:\OneTrainer2\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "C:\OneTrainer2\venv\src\diffusers\src\diffusers\configuration_utils.py", line 432, in load_config
raise EnvironmentError(f"It looks like the config file at '{config_file}' is not a valid JSON file.")
OSError: It looks like the config file at 'D:/stable-diffusion-webui-master/models/VAE/sdxl.vae.safetensors' is not a valid JSON file.
Traceback (most recent call last):
File "C:\OneTrainer2\venv\src\diffusers\src\diffusers\configuration_utils.py", line 428, in load_config
config_dict = cls._dict_from_json_file(config_file)
File "C:\OneTrainer2\venv\src\diffusers\src\diffusers\configuration_utils.py", line 550, in _dict_from_json_file
text = reader.read()
File "C:\Users\NewPC\AppData\Local\Programs\Python\Python310\lib\codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 25705: invalid start byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\OneTrainer2\modules\modelLoader\StableDiffusionXLModelLoader.py", line 296, in load
model = self.__load_ckpt(model_type, weight_dtypes, model_names.base_model, model_names.vae_model)
File "C:\OneTrainer2\modules\modelLoader\StableDiffusionXLModelLoader.py", line 186, in __load_ckpt
pipeline.vae = AutoencoderKL.from_pretrained(
File "C:\OneTrainer2\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "C:\OneTrainer2\venv\src\diffusers\src\diffusers\models\modeling_utils.py", line 569, in from_pretrained
config, unused_kwargs, commit_hash = cls.load_config(
File "C:\OneTrainer2\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "C:\OneTrainer2\venv\src\diffusers\src\diffusers\configuration_utils.py", line 432, in load_config
raise EnvironmentError(f"It looks like the config file at '{config_file}' is not a valid JSON file.")
OSError: It looks like the config file at 'D:/stable-diffusion-webui-master/models/VAE/sdxl.vae.safetensors' is not a valid JSON file.
Traceback (most recent call last):
File "C:\OneTrainer2\modules\ui\TrainUI.py", line 477, in __training_thread_function
trainer.start()
File "C:\OneTrainer2\modules\trainer\GenericTrainer.py", line 118, in start
self.model = self.model_loader.load(
File "C:\OneTrainer2\modules\modelLoader\StableDiffusionXLModelLoader.py", line 304, in load
raise Exception("could not load model: " + model_names.base_model)
Exception: could not load model: C:/stable-diffusion-webui-master/models/Stable-diffusion/memodel.safetensors
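Reading the log above: there is one traceback per attempted format. `load` calls `__load_internal`, `__load_diffusers`, `__load_safetensors`, and `__load_ckpt` in turn, and only raises "could not load model" once all of them have failed. Note that the last two attempts fail on the VAE path (`D:/.../sdxl.vae.safetensors`), while the final message names the base model. The following is a minimal sketch of that fallback pattern, not OneTrainer's actual code; `load_with_fallbacks` and the three stand-in loaders are hypothetical names used only for illustration.

```python
from typing import Callable, Sequence


def load_with_fallbacks(path: str, loaders: Sequence[Callable[[str], object]]) -> object:
    """Try each loader in order and return the first successful result."""
    errors = []
    for loader in loaders:
        try:
            return loader(path)
        except Exception as exc:  # broad on purpose: any failure moves on to the next format
            errors.append(f"{loader.__name__}: {type(exc).__name__}: {exc}")
    # Every format failed, so report all collected reasons in one message.
    raise RuntimeError("could not load model: " + path + "\n" + "\n".join(errors))


# Hypothetical stand-ins for the format-specific loaders.
def load_internal(path: str) -> object:
    raise FileNotFoundError(path + "/meta.json")


def load_diffusers(path: str) -> object:
    raise ValueError("expected a directory, got a single file")


def load_safetensors(path: str) -> object:
    return {"format": "safetensors", "path": path}


model = load_with_fallbacks(
    "memodel.safetensors", [load_internal, load_diffusers, load_safetensors]
)
print(model["format"])
```

With a chain like this, the first tracebacks in a log can be expected noise from formats that do not apply; the one worth reading is the attempt that should have matched the file.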
Config:
{
"__version": 2,
"training_method": "FINE_TUNE",
"model_type": "STABLE_DIFFUSION_XL_10_BASE",
"debug_mode": false,
"debug_dir": "C:/stable-diffusion-webui-master/outputs/memodel-sdxl/debug",
"workspace_dir": "C:/stable-diffusion-webui-master/outputs/memodel-sdxl/workspace",
"cache_dir": "C:/stable-diffusion-webui-master/outputs/memodel-sdxl/cache",
"tensorboard": false,
"tensorboard_expose": false,
"continue_last_backup": true,
"include_train_config": "NONE",
"base_model_name": "C:/stable-diffusion-webui-master/models/Stable-diffusion/memodel.safetensors",
"weight_dtype": "BFLOAT_16",
"output_dtype": "BFLOAT_16",
"output_model_format": "SAFETENSORS",
"output_model_destination": "C:/stable-diffusion-webui-master/outputs/memodel-sdxl/output/memodel_trained.safetensors",
"gradient_checkpointing": true,
"concept_file_name": "training_concepts/concepts.json",
"concepts": null,
"circular_mask_generation": false,
"random_rotate_and_crop": false,
"aspect_ratio_bucketing": false,
"latent_caching": true,
"clear_cache_before_training": false,
"learning_rate_scheduler": "CONSTANT",
"learning_rate": 1e-05,
"learning_rate_warmup_steps": 200,
"learning_rate_cycles": 1,
"epochs": 200,
"batch_size": 3,
"gradient_accumulation_steps": 1,
"ema": "OFF",
"ema_decay": 0.999,
"ema_update_step_interval": 1,
"train_device": "cuda",
"temp_device": "cpu",
"train_dtype": "BFLOAT_16",
"fallback_train_dtype": "FLOAT_32",
"only_cache": false,
"resolution": "1024",
"attention_mechanism": "DEFAULT",
"align_prop": false,
"align_prop_probability": 0.1,
"align_prop_loss": "AESTHETIC",
"align_prop_weight": 0.01,
"align_prop_steps": 20,
"align_prop_truncate_steps": 0.5,
"align_prop_cfg_scale": 7.0,
"mse_strength": 1.0,
"mae_strength": 0.0,
"vb_loss_strength": 1.0,
"min_snr_gamma": 0.0,
"dropout_probability": 0.0,
"loss_scaler": "NONE",
"learning_rate_scaler": "NONE",
"offset_noise_weight": 0.0,
"perturbation_noise_weight": 0.0,
"rescale_noise_scheduler_to_zero_terminal_snr": false,
"force_v_prediction": false,
"force_epsilon_prediction": false,
"min_noising_strength": 0.0,
"max_noising_strength": 1.0,
"noising_weight": 0.0,
"noising_bias": 0.5,
"unet": {
"__version": 0,
"model_name": "",
"train": true,
"stop_training_after": 10000,
"stop_training_after_unit": "EPOCH",
"learning_rate": 1e-05,
"weight_dtype": "BFLOAT_16"
},
"prior": {
"__version": 0,
"model_name": "",
"train": true,
"stop_training_after": 10000,
"stop_training_after_unit": "EPOCH",
"learning_rate": null,
"weight_dtype": "NONE"
},
"text_encoder": {
"__version": 0,
"model_name": "",
"train": true,
"stop_training_after": 200,
"stop_training_after_unit": "EPOCH",
"learning_rate": 3e-06,
"weight_dtype": "BFLOAT_16"
},
"text_encoder_layer_skip": 0,
"text_encoder_2": {
"__version": 0,
"model_name": "",
"train": false,
"stop_training_after": 30,
"stop_training_after_unit": "EPOCH",
"learning_rate": null,
"weight_dtype": "BFLOAT_16"
},
"text_encoder_2_layer_skip": 0,
"vae": {
"__version": 0,
"model_name": "D:/stable-diffusion-webui-master/models/VAE/sdxl.vae.safetensors",
"train": true,
"stop_training_after": null,
"stop_training_after_unit": "NEVER",
"learning_rate": null,
"weight_dtype": "FLOAT_32"
},
"effnet_encoder": {
"__version": 0,
"model_name": "",
"train": true,
"stop_training_after": null,
"stop_training_after_unit": "NEVER",
"learning_rate": null,
"weight_dtype": "NONE"
},
"decoder": {
"__version": 0,
"model_name": "",
"train": true,
"stop_training_after": null,
"stop_training_after_unit": "NEVER",
"learning_rate": null,
"weight_dtype": "NONE"
},
"decoder_text_encoder": {
"__version": 0,
"model_name": "",
"train": true,
"stop_training_after": null,
"stop_training_after_unit": "NEVER",
"learning_rate": null,
"weight_dtype": "NONE"
},
"decoder_vqgan": {
"__version": 0,
"model_name": "",
"train": true,
"stop_training_after": null,
"stop_training_after_unit": "NEVER",
"learning_rate": null,
"weight_dtype": "NONE"
},
"masked_training": false,
"unmasked_probability": 0.1,
"unmasked_weight": 0.1,
"normalize_masked_area_loss": false,
"embeddings": [
{
"__version": 0,
"model_name": "",
"train": true,
"stop_training_after": null,
"stop_training_after_unit": "NEVER",
"token_count": 1,
"initial_embedding_text": "*",
"weight_dtype": "FLOAT_32"
}
],
"embedding_weight_dtype": "FLOAT_32",
"lora_model_name": "",
"lora_rank": 16,
"lora_alpha": 1.0,
"lora_weight_dtype": "FLOAT_32",
"optimizer": {
"__version": 0,
"optimizer": "ADAFACTOR",
"adam_w_mode": false,
"alpha": null,
"amsgrad": false,
"beta1": null,
"beta2": null,
"beta3": null,
"bias_correction": false,
"block_wise": false,
"capturable": false,
"centered": false,
"clip_threshold": 1.0,
"d0": null,
"d_coef": null,
"dampening": null,
"decay_rate": -0.8,
"decouple": false,
"differentiable": false,
"eps": 1e-30,
"eps2": 0.001,
"foreach": false,
"fsdp_in_use": false,
"fused": false,
"growth_rate": null,
"initial_accumulator_value": null,
"is_paged": false,
"log_every": null,
"lr_decay": null,
"max_unorm": null,
"maximize": false,
"min_8bit_size": null,
"momentum": null,
"nesterov": false,
"no_prox": false,
"optim_bits": null,
"percentile_clipping": null,
"relative_step": false,
"safeguard_warmup": false,
"scale_parameter": false,
"stochastic_rounding": true,
"use_bias_correction": false,
"use_triton": false,
"warmup_init": false,
"weight_decay": 0.0
},
"optimizer_defaults": {
"ADAFACTOR": {
"__version": 0,
"optimizer": "ADAFACTOR",
"adam_w_mode": false,
"alpha": null,
"amsgrad": false,
"beta1": null,
"beta2": null,
"beta3": null,
"bias_correction": false,
"block_wise": false,
"capturable": false,
"centered": false,
"clip_threshold": 1.0,
"d0": null,
"d_coef": null,
"dampening": null,
"decay_rate": -0.8,
"decouple": false,
"differentiable": false,
"eps": 1e-30,
"eps2": 0.001,
"foreach": false,
"fsdp_in_use": false,
"fused": false,
"growth_rate": null,
"initial_accumulator_value": null,
"is_paged": false,
"log_every": null,
"lr_decay": null,
"max_unorm": null,
"maximize": false,
"min_8bit_size": null,
"momentum": null,
"nesterov": false,
"no_prox": false,
"optim_bits": null,
"percentile_clipping": null,
"relative_step": false,
"safeguard_warmup": false,
"scale_parameter": false,
"stochastic_rounding": true,
"use_bias_correction": false,
"use_triton": false,
"warmup_init": false,
"weight_decay": 0.0
}
},
"sample_definition_file_name": "training_samples/samples.json",
"samples": null,
"sample_after": 100,
"sample_after_unit": "STEP",
"sample_image_format": "JPG",
"samples_to_tensorboard": false,
"non_ema_sampling": false,
"backup_after": 1,
"backup_after_unit": "EPOCH",
"rolling_backup": false,
"rolling_backup_count": 30,
"backup_before_save": true,
"save_after": 30,
"save_after_unit": "EPOCH",
"save_filename_prefix": ""
}
I think this has something to do with the fp16 VAE fix I selected. I deleted the link to the VAE and it's training now. I also tried changing it to float16 in the GUI, but got the same error. Any ideas with this added info? I would prefer to use this VAE.
I use it because without it, if you train over Juggernaut v9 or merge Juggernaut with my DreamBooth models, it produces white orb artifacts and an orange haze. Here is the link to the fp16-fix VAE: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/tree/main
Okay, I resolved it. I didn't realize the VAE field seems to require the diffusers format (even though it lets you select .ckpt and .safetensors files), because I also tried the original sdxl_vae.safetensors and it still gave that error.
If that's the case, I would suggest adding a note there that the diffusers format is required, and disabling the ability to select .safetensors and .ckpt files in that field.
Anyway, I just pasted in madebyollin/sdxl-vae-fp16-fix
(for anyone reading: this goes in the Models tab, under VAE). It then downloads the VAE and works.
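For anyone hitting the same thing, here is a rough sketch of how to tell which kind of value is sitting in the VAE field before starting a run. It assumes a diffusers-format model is a directory containing `config.json`, and that anything that is neither a local directory nor a single checkpoint file is meant as a Hugging Face repo id. `describe_vae_source` is a hypothetical helper, not part of OneTrainer or diffusers.

```python
import os


def describe_vae_source(value: str) -> str:
    """Roughly classify what a VAE field value points to (hypothetical helper)."""
    if os.path.isdir(value):
        # A diffusers-format model is a directory that contains config.json.
        if os.path.isfile(os.path.join(value, "config.json")):
            return "diffusers directory"
        return "directory without config.json"
    if os.path.isfile(value) or value.lower().endswith((".safetensors", ".ckpt")):
        return "single file"
    # Assumption: anything else is a Hugging Face repo id such as "user/name".
    return "hub repo id"


for candidate in (
    "D:/stable-diffusion-webui-master/models/VAE/sdxl.vae.safetensors",
    "madebyollin/sdxl-vae-fp16-fix",
):
    print(candidate, "->", describe_vae_source(candidate))
```

In this thread the single-file value is the one that failed, and the repo id is the one that worked.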
The only issue I have now is getting it to train for 60 epochs and save after every epoch. I have all the settings configured, but it's still only training for 1299 steps with 74 photos.
So far this has been a much better experience than the kohya ss GUI. So thanks for this!
Edit: Never mind, saving is working. I'm so used to the kohya ss GUI that I didn't realize it just starts over on the next epoch. Also, the "continue from last backup" option is amazing. It's going to save hours of time and it's easy to use. Thanks again.
What happened?
I have a custom local SDXL DreamBooth model, made in the kohya ss GUI, that I wanted to train over. I usually do this in the kohya GUI to improve some things, but wanted to try OneTrainer out.
I have everything configured, but when I click start training it says:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Stable-diffusion-webui-master/models/Stable-Diffusion/memodel.safetensors\\meta.json'
ValueError: Calling CLIPTokenizer.from_pretrained() with the path to a single file or url is not supported for this tokenizer. Use a model identifier or the path to a directory instead.
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbf in position 25705: invalid start byte
During handling of the above exception, another exception occurred:
Then a bunch of other errors as a result.
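A note on the `UnicodeDecodeError` above: it is consistent with a binary `.safetensors` file being read as if it were a JSON config. A safetensors file starts with an 8-byte little-endian header length, followed by a JSON header, followed by raw tensor bytes, so decoding the whole file as UTF-8 fails once the raw bytes begin. The sketch below reads only the header using the standard library; `read_safetensors_header` and `write_tiny_safetensors` are hypothetical helper names, and the tiny file is synthetic, not a real model.

```python
import json
import struct


def read_safetensors_header(path: str) -> dict:
    """Return the JSON header of a .safetensors file."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def write_tiny_safetensors(path: str) -> None:
    """Write a synthetic two-byte tensor file for demonstration purposes."""
    header = json.dumps(
        {"w": {"dtype": "U8", "shape": [2], "data_offsets": [0, 2]}}
    ).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header)))
        f.write(header)
        f.write(b"\xbf\xff")  # raw tensor bytes, not valid UTF-8
```

Calling `read_safetensors_header` on a real checkpoint lists its tensor names, which is a quick way to confirm the file is a single-file checkpoint rather than a config. Dropping a `config.json` next to such a file does not change what a loader sees when it is handed the `.safetensors` path itself.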
I no longer have any configuration .json or meta.json files for these custom DreamBooth models, as I never needed them when training over them in the kohya ss GUI.
Any ideas how I can work around this? I just want to make sure it's possible to train over custom models, like some models on Civitai that may not include configuration files, or whether I can use a basic template file, and where to put it.
I tried placing a random config.json and model_index.json in the model's directory to see if it would do anything, but got the same error.
Thanks!
What did you expect would happen?
That it would train over a custom model.
Relevant log output
Output of pip freeze
Having trouble with this.