Closed newstargo closed 1 year ago
It looks like you are using NCCL, which is only needed for multi-GPU runs. Could you start the webui with only a single GPU specified?
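One way to try the single-GPU suggestion is sketched below. It relies on two things: the `CUDA_VISIBLE_DEVICES` environment variable, and the `--num_processes=1` option that the accelerate warning in the log itself recommends. The helper names `single_gpu_env` and `pin_single_process` are hypothetical and are not part of EasyPhoto or the webui; treat this as a rough sketch rather than the project's fix.

```python
import os


def single_gpu_env(base_env, gpu_index=0):
    """Return a copy of base_env in which CUDA programs see only one GPU."""
    env = dict(base_env)
    # CUDA_VISIBLE_DEVICES restricts which GPUs the process and its children can use.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def pin_single_process(command):
    """Return a copy of an accelerate launch command with --num_processes=1.

    Only the --flag=value spelling is handled; this is a simplification.
    """
    marker = "accelerate.commands.launch"
    cleaned = [arg for arg in command if not arg.startswith("--num_processes")]
    position = cleaned.index(marker)  # raises ValueError if this is not a launch command
    # Launcher options must come before the training script path.
    return cleaned[: position + 1] + ["--num_processes=1"] + cleaned[position + 1:]


if __name__ == "__main__":
    example = ["python", "-m", "accelerate.commands.launch",
               "--mixed_precision=fp16", "train_lora.py", "--seed=42"]
    print(pin_single_process(example))
    print(single_gpu_env(os.environ)["CUDA_VISIBLE_DEVICES"])
```

Setting `CUDA_VISIBLE_DEVICES=0` in the shell before starting the webui should have the same effect as `single_gpu_env`, since child processes inherit the variable.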
image_portrait_enhancement_pipeline
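Hi, it looks like GPU memory ran out during super-resolution here. Could that be because the original images have a fairly high resolution?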
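If large inputs are the cause, shrinking the photos before upload may avoid the out-of-memory error in the super-resolution step. The sketch below only computes a reduced size that keeps the aspect ratio. The function name `fit_within` and the 1024-pixel limit are assumptions chosen for illustration, not values taken from EasyPhoto.

```python
def fit_within(width, height, max_side=1024):
    """Return (new_width, new_height) whose longer side is at most max_side.

    The aspect ratio is preserved; images already small enough are unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Scale both sides by max_side / longest and round to whole pixels.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


if __name__ == "__main__":
    print(fit_within(4000, 3000))
    print(fit_within(800, 600))
```

With Pillow, the result could be passed to `image.resize(fit_within(*image.size))` before the photo is handed to the training tab.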
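@newstargo Could this be handled together with ISSUE30 and ISSUE31, with a prominent note added to the next README? PR20 has already fixed this problem. If you have time, please pull the latest code and test it; we will close this issue within 24 hours.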
To create a public link, set
ui.pages_contents = [pg.create_html(ui.tabname) for pg in ui.stored_extra_pages]
File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks.py", line 162, in create_html
self.items = {x["name"]: x for x in self.list_items()}
File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks.py", line 162, in
self.items = {x["name"]: x for x in self.list_items()}
File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks_checkpoints.py", line 35, in list_items
yield self.create_item(name, index)
File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks_checkpoints.py", line 18, in create_item
path, ext = os.path.splitext(checkpoint.filename)
AttributeError: 'NoneType' object has no attribute 'filename'
Start Downloading weights
Traceback (most recent call last):
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
result = await self.call_function(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
response = f(*args, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\easyphoto_train.py", line 115, in easyphoto_train_forward
original_backup_path = os.path.join(user_id_outpath_samples, user_id, "original_backup")
File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\ntpath.py", line 143, in join
genericpath._check_arg_types('join', path, *paths)
File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\genericpath.py", line 152, in _check_arg_types
raise TypeError(f'{funcname}() argument must be str, bytes, or '
TypeError: join() argument must be str, bytes, or os.PathLike object, not 'NoneType'
Start Downloading weights
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\models\buffalo_l\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\models\buffalo_l\det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\models\buffalo_l\w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)
2023-09-06 16:59:22,098 - modelscope - INFO - Model revision not specified, use the latest revision: v2.0.2
2023-09-06 16:59:24,605 - modelscope - INFO - initiate model from C:\Users\nsg\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface
2023-09-06 16:59:24,606 - modelscope - INFO - initiate model from location C:\Users\nsg\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface.
2023-09-06 16:59:24,617 - modelscope - WARNING - No preprocessor field found in cfg.
2023-09-06 16:59:24,618 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-09-06 16:59:24,619 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\nsg\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface'}. trying to build by task and model information.
2023-09-06 16:59:24,620 - modelscope - WARNING - Find task: face-detection, model type: None. Insufficient information to build preprocessor, skip building preprocessor
2023-09-06 16:59:24,624 - modelscope - INFO - loading model from C:\Users\nsg\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface\pytorch_model.pt
2023-09-06 16:59:25,371 - modelscope - INFO - load model done
2023-09-06 16:59:27,373 - modelscope - INFO - Model revision not specified, use the latest revision: v1.0.0
2023-09-06 16:59:28,111 - modelscope - INFO - initiate model from C:\Users\nsg\.cache\modelscope\hub\damo\cv_u2net_salient-detection
2023-09-06 16:59:28,112 - modelscope - INFO - initiate model from location C:\Users\nsg\.cache\modelscope\hub\damo\cv_u2net_salient-detection.
2023-09-06 16:59:28,119 - modelscope - INFO - initialize model from C:\Users\nsg\.cache\modelscope\hub\damo\cv_u2net_salient-detection
2023-09-06 16:59:28,741 - modelscope - WARNING - No preprocessor field found in cfg.
2023-09-06 16:59:28,741 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-09-06 16:59:28,743 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\nsg\.cache\modelscope\hub\damo\cv_u2net_salient-detection'}. trying to build by task and model information.
2023-09-06 16:59:28,744 - modelscope - WARNING - No preprocessor key ('detection', 'semantic-segmentation') found in PREPROCESSOR_MAP, skip building preprocessor.
2023-09-06 16:59:30,765 - modelscope - INFO - Use user-specified model revision: v1.0.2
2023-09-06 16:59:31,584 - modelscope - WARNING - ('PIPELINES', 'skin-retouching-torch', 'skin-retouching-torch') not found in ast index file
2023-09-06 16:59:31,585 - modelscope - INFO - initiate model from C:\Users\nsg\.cache\modelscope\hub\damo\cv_unet_skin_retouching_torch
2023-09-06 16:59:31,588 - modelscope - INFO - initiate model from location C:\Users\nsg\.cache\modelscope\hub\damo\cv_unet_skin_retouching_torch.
2023-09-06 16:59:31,597 - modelscope - WARNING - No preprocessor field found in cfg.
2023-09-06 16:59:31,597 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-09-06 16:59:31,598 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\nsg\.cache\modelscope\hub\damo\cv_unet_skin_retouching_torch'}. trying to build by task and model information.
2023-09-06 16:59:31,599 - modelscope - WARNING - Find task: skin-retouching-torch, model type: None. Insufficient information to build preprocessor, skip building preprocessor
2023-09-06 16:59:34,006 - modelscope - INFO - Model revision not specified, use the latest revision: v2.0.2
2023-09-06 16:59:35,990 - modelscope - INFO - initiate model from C:\Users\nsg\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface
2023-09-06 16:59:35,991 - modelscope - INFO - initiate model from location C:\Users\nsg\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface.
2023-09-06 16:59:36,005 - modelscope - WARNING - No preprocessor field found in cfg.
2023-09-06 16:59:36,006 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-09-06 16:59:36,007 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\nsg\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface'}. trying to build by task and model information.
2023-09-06 16:59:36,008 - modelscope - WARNING - Find task: face-detection, model type: None. Insufficient information to build preprocessor, skip building preprocessor
2023-09-06 16:59:36,014 - modelscope - INFO - loading model from C:\Users\nsg\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface\pytorch_model.pt
2023-09-06 16:59:36,752 - modelscope - INFO - load model done
2023-09-06 16:59:41,415 - modelscope - INFO - Model revision not specified, use the latest revision: v1.0.0
2023-09-06 16:59:42,207 - modelscope - INFO - initiate model from C:\Users\nsg\.cache\modelscope\hub\damo\cv_gpen_image-portrait-enhancement
2023-09-06 16:59:42,208 - modelscope - INFO - initiate model from location C:\Users\nsg\.cache\modelscope\hub\damo\cv_gpen_image-portrait-enhancement.
2023-09-06 16:59:42,214 - modelscope - INFO - initialize model from C:\Users\nsg\.cache\modelscope\hub\damo\cv_gpen_image-portrait-enhancement
Loading ResNet ArcFace
2023-09-06 16:59:46,402 - modelscope - INFO - load face enhancer model done
2023-09-06 16:59:47,087 - modelscope - INFO - load face detector model done
2023-09-06 16:59:47,939 - modelscope - INFO - load sr model done
2023-09-06 16:59:49,521 - modelscope - INFO - load fqa model done
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:06<00:00, 1.40s/it]
selected paths: C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\original_backup\2.jpg total scores: 0.6295568212008975 face angles 0.9987909946784695
selected paths: C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\original_backup\0.jpg total scores: 0.620656322531957 face angles 0.9955321676581762
selected paths: C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\original_backup\4.jpg total scores: 0.6175696108051335 face angles 0.9642633823436466
selected paths: C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\original_backup\3.jpg total scores: 0.5971493601266059 face angles 0.9678671137462829
selected paths: C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\original_backup\1.jpg total scores: 0.19984603786853658 face angles 0.9990114183647643
jpg: 4.jpg face_id_scores 0.6175696108051335
jpg: 2.jpg face_id_scores 0.6295568212008975
jpg: 0.jpg face_id_scores 0.620656322531957
jpg: 3.jpg face_id_scores 0.5971493601266059
jpg: 1.jpg face_id_scores 0.19984603786853658
0it [00:00, ?it/s]2023-09-06 16:59:57,343 - modelscope - WARNING - task skin-retouching-torch input definition is missing
2023-09-06 17:00:02,053 - modelscope - WARNING - task skin-retouching-torch output keys are missing
3it [00:24, 8.01s/it]
Exception in thread Thread-41 (preprocess_images):
Traceback (most recent call last):
File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
self.run()
File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\preprocess.py", line 125, in preprocess_images
sub_image = Image.fromarray(cv2.cvtColor(portrait_enhancement(sub_image)[OutputKeys.OUTPUT_IMG], cv2.COLOR_BGR2RGB))
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\pipelines\base.py", line 219, in __call__
output = self._process_single(input, *args, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\pipelines\base.py", line 247, in _process_single
out = self.preprocess(input, **preprocess_params)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\pipelines\cv\image_portrait_enhancement_pipeline.py", line 178, in preprocess
img_sr = self.sr_process(img)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\pipelines\cv\image_portrait_enhancement_pipeline.py", line 161, in sr_process
output = self.sr_model(img)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\models\cv\super_resolution\rrdbnet_arch.py", line 123, in forward
body_feat = self.conv_body(self.body(feat))
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
input = module(input)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\models\cv\super_resolution\rrdbnet_arch.py", line 63, in forward
out = self.rdb1(x)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\models\cv\super_resolution\rrdbnet_arch.py", line 39, in forward
x3 = self.lrelu(self.conv3(torch.cat((x, x1, x2), 1)))
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 444, in network_Conv2d_forward
return originals.Conv2d_forward(self, input)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 322.00 MiB (GPU 0; 12.00 GiB total capacity; 10.89 GiB already allocated; 0 bytes free; 11.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya/train_lora.py
The following values were not passed to
main()
File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 803, in main
accelerator = Accelerator(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\accelerator.py", line 358, in __init__
self.state = AcceleratorState(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\state.py", line 720, in __init__
PartialState(cpu, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\state.py", line 192, in __init__
torch.distributed.init_process_group(backend=self.backend, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 907, in init_process_group
Traceback (most recent call last):
File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 1394, in
default_pg = _new_process_group_helper(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 1013, in _new_process_group_helper
main()
raise RuntimeError("Distributed package doesn't have NCCL " "built in")
File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 803, in main
share=True in launch().
Startup time: 133.6s (prepare environment: 102.3s, import torch: 7.7s, import gradio: 0.7s, setup paths: 0.9s, initialize shared: 0.5s, other imports: 0.6s, setup codeformer: 0.2s, load scripts: 11.3s, initialize extra networks: 1.0s, scripts before_ui_callback: 0.3s, create ui: 3.3s, gradio launch: 4.5s, add APIs: 0.1s).
Applying attention optimization: xformers... done.
Model loaded in 14.0s (load weights from disk: 1.5s, create model: 1.8s, apply weights to model: 6.9s, move model to device: 0.2s, load textual inversion embeddings: 1.6s, calculate empty prompt: 1.9s).
Traceback (most recent call last):
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
result = await self.call_function(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
response = f(*args, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks.py", line 392, in pages_html
return refresh()
File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks.py", line 400, in refresh
ui.pages_contents = [pg.create_html(ui.tabname) for pg in ui.stored_extra_pages]
File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks.py", line 400, in
accelerate launch and had defaults used instead:
--num_processes was set to a value of 2
More than one GPU was found, enabling multi-GPU training. If this was unintended please pass in --num_processes=1.
--num_machines was set to a value of 1
--dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
NOTE: Redirects are currently not supported in Windows or MacOs.
[W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - The requested address is not valid in its context.).
[W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - The requested address is not valid in its context.).
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
2023-09-06 17:00:40,304 - modelscope - INFO - PyTorch version 2.0.1+cu118 Found.
2023-09-06 17:00:40,309 - modelscope - INFO - TensorFlow version 2.13.0 Found.
2023-09-06 17:00:40,309 - modelscope - INFO - Loading ast index from C:\Users\nsg\.cache\modelscope\ast_indexer
2023-09-06 17:00:40,338 - modelscope - INFO - PyTorch version 2.0.1+cu118 Found.
2023-09-06 17:00:40,343 - modelscope - INFO - TensorFlow version 2.13.0 Found.
2023-09-06 17:00:40,344 - modelscope - INFO - Loading ast index from C:\Users\nsg\.cache\modelscope\ast_indexer
2023-09-06 17:00:40,457 - modelscope - INFO - Loading done! Current index file version is 1.9.0, with md5 ebb1c3c0522899612853064e3129f6d1 and a total number of 921 components indexed
2023-09-06 17:00:40,464 - modelscope - INFO - Loading done! Current index file version is 1.9.0, with md5 ebb1c3c0522899612853064e3129f6d1 and a total number of 921 components indexed
[W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - The requested address is not valid in its context.).
[W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - The requested address is not valid in its context.).
[W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - The requested address is not valid in its context.).
[W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - The requested address is not valid in its context.).
Traceback (most recent call last):
File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 1394, in
RuntimeError: Distributed package doesn't have NCCL built in
accelerator = Accelerator(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\accelerator.py", line 358, in __init__
self.state = AcceleratorState(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\state.py", line 720, in __init__
PartialState(cpu, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\state.py", line 192, in __init__
torch.distributed.init_process_group(backend=self.backend, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 907, in init_process_group
default_pg = _new_process_group_helper(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 1013, in _new_process_group_helper
raise RuntimeError("Distributed package doesn't have NCCL " "built in")
RuntimeError: Distributed package doesn't have NCCL built in
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 4880) of binary: C:\Users\nsg\stable-diffusion-webui\venv\Scripts\python.exe
Traceback (most recent call last):
File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\commands\launch.py", line 989, in
main()
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\commands\launch.py", line 985, in main
launch_command(args)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\commands\launch.py", line 970, in launch_command
multi_gpu_launcher(args)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\commands\launch.py", line 646, in multi_gpu_launcher
distrib_run.run(args)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\run.py", line 785, in run
elastic_launch(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\launcher\api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\launcher\api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya/train_lora.py FAILED
Failures:
[1]:
  time : 2023-09-06_17:00:46
  host : USER-20230706TY
  rank : 1 (local_rank: 1)
  exitcode : 1 (pid: 9704)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Root Cause (first observed failure):
[0]:
  time : 2023-09-06_17:00:46
  host : USER-20230706TY
  rank : 0 (local_rank: 0)
  exitcode : 1 (pid: 4880)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Error executing the command: Command '['C:\Users\nsg\stable-diffusion-webui\venv\Scripts\python.exe', '-m', 'accelerate.commands.launch', '--mixed_precision=fp16', '--main_process_port=3456', 'C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya/train_lora.py', '--pretrained_model_name_or_path=extensions\sd-webui-EasyPhoto\models\stable-diffusion-v1-5', '--pretrained_model_ckpt=models\Stable-diffusion\Chilloutmix-Ni-pruned-fp16-fix.safetensors', '--train_data_dir=outputs\easyphoto-user-id-infos\jacky-5\processed_images', '--caption_column=text', '--resolution=512', '--random_flip', '--train_batch_size=1', '--gradient_accumulation_steps=4', '--dataloader_num_workers=0', '--max_train_steps=800', '--checkpointing_steps=100', '--learning_rate=0.0001', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--train_text_encoder', '--seed=42', '--rank=128', '--network_alpha=64', '--validation_prompt=easyphoto_face, easyphoto, 1person', '--validation_steps=100', '--output_dir=outputs\easyphoto-user-id-infos\jacky-5\user_weights', '--logging_dir=outputs\easyphoto-user-id-infos\jacky-5\user_weights', '--enable_xformers_memory_efficient_attention', '--mixed_precision=fp16', '--template_dir=extensions\sd-webui-EasyPhoto\models\training_templates', '--template_mask', '--merge_best_lora_based_face_id', '--merge_best_lora_name=jacky-5']' returned non-zero exit status 1. 
Traceback (most recent call last):
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api
result = await self.call_function(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
result = context.run(func, *args)
File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper
response = f(*args, **kwargs)
File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\easyphoto_train.py", line 218, in easyphoto_train_forward
copyfile(best_weight_path, webui_save_path)
File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\shutil.py", line 254, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\user_weights\best_outputs/jacky-5.safetensors'
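The final FileNotFoundError looks like a knock-on effect of the failed training command: no weights were written, so the later `copyfile` call had nothing to copy. A guard along the following lines would surface that more clearly. This is only a sketch; `copy_best_weight` is a hypothetical helper and is not how easyphoto_train.py is actually written.

```python
import os
import shutil


def copy_best_weight(best_weight_path, webui_save_path):
    """Copy trained weights into place; return False if training produced none."""
    if not os.path.isfile(best_weight_path):
        # Training most likely failed earlier, so report that instead of
        # letting shutil.copyfile raise FileNotFoundError.
        print(f"No weights found at {best_weight_path}; check the training log first.")
        return False
    shutil.copyfile(best_weight_path, webui_save_path)
    return True
```

Checking the exit status of the training command before reaching this step would be an equally reasonable place to stop.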