Closed. Rika-Mipa closed this issue 1 year ago
It looks like library/train_util.py is out of sync with the upstream copy at https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py (updating it seems to let training get a bit further). I will check whether this is the only out-of-sync dependency.
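One way to check whether a local file has drifted from upstream is to compare content hashes. The following is only a minimal sketch, not something from the repository: the raw URL is an assumption derived from the blob URL above, and the helper names (`normalized_sha256`, `is_in_sync`, `fetch_upstream`) are hypothetical.

```python
import hashlib
import urllib.request
from pathlib import Path

# Assumption: raw-content form of the blob URL quoted in the comment above.
UPSTREAM_RAW_URL = (
    "https://raw.githubusercontent.com/kohya-ss/sd-scripts/main/library/train_util.py"
)


def normalized_sha256(data: bytes) -> str:
    """Hash content with CRLF normalized to LF, so Windows checkouts compare equal."""
    return hashlib.sha256(data.replace(b"\r\n", b"\n")).hexdigest()


def is_in_sync(local_path: Path, upstream_bytes: bytes) -> bool:
    """Return True if the local file matches the upstream bytes, ignoring line endings."""
    return normalized_sha256(local_path.read_bytes()) == normalized_sha256(upstream_bytes)


def fetch_upstream(url: str = UPSTREAM_RAW_URL, timeout: float = 30.0) -> bytes:
    """Download the upstream file (hypothetical helper; needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()
```

Run from the kohya_ss folder, `is_in_sync(Path("library/train_util.py"), fetch_upstream())` would report whether the two copies match. The same check could be repeated for the other files under library/ to look for further out-of-sync dependencies.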
Hello, I am glad to see your reply. My friend said he could not train after the update either. I hope you can solve the problem. XD
My PC: RTX 4090, Windows 10 21H2, Python 3.10.9
Yesterday, the software worked very well. However, after I updated to the latest version today, it can no longer train. I deleted the folder and reinstalled the software from the first step, but it still does not work. Please help me.
Load CSS...
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Loading config...
Folder 7_Aharen: 567 steps
max_train_steps = 11340
stop_text_encoder_training = 0
lr_warmup_steps = 567
accelerate launch --num_cpu_threads_per_process=32 "train_network.py" --enable_bucket --pretrained_model_name_or_path="D:/NovelAI/models/Stable-diffusion/latest.ckpt" --train_data_dir="E:/Train/Aharen" --resolution=512,512 --output_dir="E:/Train/Aharen" --logging_dir="" --network_alpha="8" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=3e-5 --unet_lr=3e-4 --network_dim=32 --output_name="test" --lr_scheduler_num_cycles="20" --learning_rate="1e-5" --lr_scheduler="cosine_with_restarts" --lr_warmup_steps="567" --train_batch_size="1" --max_train_steps="11340" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --seed="31337" --caption_extension=".txt" --cache_latents --clip_skip=2 --keep_tokens="3" --bucket_reso_steps=64 --shuffle_caption --xformers --use_8bit_adam --bucket_no_upscale
prepare tokenizer
Use DreamBooth method.
prepare train images.
found directory 7_Aharen contains 81 image files
567 train images with repeating.
loading image sizes.
100%|███████████████████████████████████████████████████████████████████████████████| 81/81 [00:00<00:00, 10125.13it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (512, 384), count: 567
mean ar error (without repeats): 0.0
prepare accelerator
Using accelerator 0.15.0 or above.
load StableDiffusion checkpoint
loading u-net:
loading vae:
Traceback (most recent call last):
File "D:\LORA\kohya_ss\venv\lib\site-packages\urllib3\connectionpool.py", line 700, in urlopen
self._prepare_proxy(conn)
File "D:\LORA\kohya_ss\venv\lib\site-packages\urllib3\connectionpool.py", line 996, in _prepare_proxy
conn.connect()
File "D:\LORA\kohya_ss\venv\lib\site-packages\urllib3\connection.py", line 414, in connect
self.sock = ssl_wrap_socket(
File "D:\LORA\kohya_ss\venv\lib\site-packages\urllib3\util\ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "D:\LORA\kohya_ss\venv\lib\site-packages\urllib3\util\ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "D:\Python\lib\ssl.py", line 513, in wrap_socket
return self.sslsocket_class._create(
File "D:\Python\lib\ssl.py", line 1071, in _create
self.do_handshake()
File "D:\Python\lib\ssl.py", line 1342, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:997)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\LORA\kohya_ss\venv\lib\site-packages\requests\adapters.py", line 489, in send
resp = conn.urlopen(
File "D:\LORA\kohya_ss\venv\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "D:\LORA\kohya_ss\venv\lib\site-packages\urllib3\util\retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /openai/clip-vit-large-patch14/resolve/main/pytorch_model.bin (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:997)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\LORA\kohya_ss\train_network.py", line 573, in <module>
train(args)
File "D:\LORA\kohya_ss\train_network.py", line 158, in train
text_encoder, vae, unet, _ = train_util.load_target_model(args, weight_dtype)
File "D:\LORA\kohya_ss\library\train_util.py", line 1584, in load_target_model
text_encoder, vae, unet = model_util.load_models_from_stable_diffusion_checkpoint(args.v2, args.pretrained_model_name_or_path)
File "D:\LORA\kohya_ss\library\model_util.py", line 919, in load_models_from_stable_diffusion_checkpoint
text_model = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
File "D:\LORA\kohya_ss\venv\lib\site-packages\transformers\modeling_utils.py", line 2222, in from_pretrained
resolved_archive_file = cached_file(
File "D:\LORA\kohya_ss\venv\lib\site-packages\transformers\utils\hub.py", line 409, in cached_file
resolved_file = hf_hub_download(
File "D:\LORA\kohya_ss\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 124, in _inner_fn
return fn(*args, **kwargs)
File "D:\LORA\kohya_ss\venv\lib\site-packages\huggingface_hub\file_download.py", line 1105, in hf_hub_download
metadata = get_hf_file_metadata(
File "D:\LORA\kohya_ss\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 124, in _inner_fn
return fn(*args, **kwargs)
File "D:\LORA\kohya_ss\venv\lib\site-packages\huggingface_hub\file_download.py", line 1431, in get_hf_file_metadata
r = _request_wrapper(
File "D:\LORA\kohya_ss\venv\lib\site-packages\huggingface_hub\file_download.py", line 405, in _request_wrapper
response = _request_wrapper(
File "D:\LORA\kohya_ss\venv\lib\site-packages\huggingface_hub\file_download.py", line 440, in _request_wrapper
return http_backoff(
File "D:\LORA\kohya_ss\venv\lib\site-packages\huggingface_hub\utils\_http.py", line 129, in http_backoff
response = requests.request(method=method, url=url, **kwargs)
File "D:\LORA\kohya_ss\venv\lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "D:\LORA\kohya_ss\venv\lib\site-packages\requests\sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "D:\LORA\kohya_ss\venv\lib\site-packages\requests\sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "D:\LORA\kohya_ss\venv\lib\site-packages\requests\adapters.py", line 563, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /openai/clip-vit-large-patch14/resolve/main/pytorch_model.bin (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:997)')))
Traceback (most recent call last):
File "D:\Python\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "D:\Python\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "D:\LORA\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "D:\LORA\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "D:\LORA\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
simple_launcher(args)
File "D:\LORA\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\LORA\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=D:/NovelAI/models/Stable-diffusion/latest.ckpt', '--train_data_dir=E:/Train/Aharen', '--resolution=512,512', '--output_dir=E:/Train/Aharen', '--logging_dir=', '--network_alpha=8', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=3e-5', '--unet_lr=3e-4', '--network_dim=32', '--output_name=test', '--lr_scheduler_num_cycles=20', '--learning_rate=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=567', '--train_batch_size=1', '--max_train_steps=11340', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=31337', '--caption_extension=.txt', '--cache_latents', '--clip_skip=2', '--keep_tokens=3', '--bucket_reso_steps=64', '--shuffle_caption', '--xformers', '--use_8bit_adam', '--bucket_no_upscale']' returned non-zero exit status 1.
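Reading the log above, the first failure passes through `_prepare_proxy` in urllib3, so the request to huggingface.co for the CLIP text encoder appears to be going through a proxy when the TLS handshake is cut off. The sketch below is one way to narrow that down using only the Python standard library. It is a diagnostic under stated assumptions, not a fix: the function names `proxy_settings` and `try_tls_handshake` are hypothetical, and the handshake test deliberately connects directly, bypassing any proxy.

```python
import socket
import ssl
import urllib.request


def proxy_settings() -> dict:
    """Return the proxies Python picks up from environment variables or OS settings."""
    return urllib.request.getproxies()


def try_tls_handshake(host: str, port: int = 443, timeout: float = 10.0) -> tuple[bool, str]:
    """Attempt a direct TLS handshake (no proxy) and report success or the error seen."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, f"handshake ok ({tls.version()})"
    except OSError as exc:
        # ssl.SSLError (including SSLEOFError) is a subclass of OSError.
        return False, f"{type(exc).__name__}: {exc}"
```

For example, `print(proxy_settings())` followed by `print(try_tls_handshake("huggingface.co"))` inside the venv. If the direct handshake succeeds while training still fails with the same SSLEOFError, the configured proxy is a likely suspect. Separately, if the CLIP files are already in the local Hugging Face cache, passing `local_files_only=True` to `from_pretrained` may avoid the download altogether. That is untested here.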