aigc-apps / sd-webui-EasyPhoto

📷 EasyPhoto | Your Smart AI Photo Generator.
Apache License 2.0
4.98k stars 399 forks

First run: what went wrong? #26

Closed newstargo closed 1 year ago

newstargo commented 1 year ago

To create a public link, set share=True in launch(). Startup time: 133.6s (prepare environment: 102.3s, import torch: 7.7s, import gradio: 0.7s, setup paths: 0.9s, initialize shared: 0.5s, other imports: 0.6s, setup codeformer: 0.2s, load scripts: 11.3s, initialize extra networks: 1.0s, scripts before_ui_callback: 0.3s, create ui: 3.3s, gradio launch: 4.5s, add APIs: 0.1s). Applying attention optimization: xformers... done. Model loaded in 14.0s (load weights from disk: 1.5s, create model: 1.8s, apply weights to model: 6.9s, move model to device: 0.2s, load textual inversion embeddings: 1.6s, calculate empty prompt: 1.9s). Traceback (most recent call last): File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict output = await app.get_blocks().process_api( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api result = await self.call_function( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function prediction = await anyio.to_thread.run_sync( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync return await get_asynclib().run_sync_in_worker_thread( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio_backends_asyncio.py", line 877, in run_sync_in_worker_thread return await future File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio_backends_asyncio.py", line 807, in run result = context.run(func, args) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper response = f(args, kwargs) File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks.py", line 392, in pages_html return refresh() File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks.py", line 400, in refresh ui.pages_contents = [pg.create_html(ui.tabname) for pg in ui.stored_extra_pages] 
File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks.py", line 400, in ui.pages_contents = [pg.create_html(ui.tabname) for pg in ui.stored_extra_pages] File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks.py", line 162, in create_html self.items = {x["name"]: x for x in self.list_items()} File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks.py", line 162, in self.items = {x["name"]: x for x in self.list_items()} File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks_checkpoints.py", line 35, in list_items yield self.create_item(name, index) File "C:\Users\nsg\stable-diffusion-webui\modules\ui_extra_networks_checkpoints.py", line 18, in create_item path, ext = os.path.splitext(checkpoint.filename) AttributeError: 'NoneType' object has no attribute 'filename' Start Downloading weights Traceback (most recent call last): File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict output = await app.get_blocks().process_api( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api result = await self.call_function( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function prediction = await anyio.to_thread.run_sync( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync return await get_asynclib().run_sync_in_worker_thread( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio_backends_asyncio.py", line 877, in run_sync_in_worker_thread return await future File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio_backends_asyncio.py", line 807, in run result = context.run(func, args) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper response = f(args, kwargs) File 
"C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\easyphoto_train.py", line 115, in easyphoto_train_forward original_backup_path = os.path.join(user_id_outpath_samples, user_id, "original_backup") File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\ntpath.py", line 143, in join genericpath._check_arg_types('join', path, *paths) File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\genericpath.py", line 152, in _check_arg_types raise TypeError(f'{funcname}() argument must be str, bytes, or ' TypeError: join() argument must be str, bytes, or os.PathLike object, not 'NoneType' Start Downloading weights Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}} Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}} find model: C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\models\buffalo_l\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0 Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}} find model: C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\models\buffalo_l\det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0 Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}} find model: C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\models\buffalo_l\w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5 set det-size: (640, 640) 2023-09-06 16:59:22,098 - modelscope - INFO - Model revision not specified, use the latest revision: v2.0.2 2023-09-06 16:59:24,605 - modelscope - INFO - initiate model from C:\Users\nsg.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface 2023-09-06 16:59:24,606 - modelscope - INFO - initiate model from location C:\Users\nsg.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface. 2023-09-06 16:59:24,617 - modelscope - WARNING - No preprocessor field found in cfg. 
2023-09-06 16:59:24,618 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file. 2023-09-06 16:59:24,619 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\nsg\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface'}. trying to build by task and model information. 2023-09-06 16:59:24,620 - modelscope - WARNING - Find task: face-detection, model type: None. Insufficient information to build preprocessor, skip building preprocessor 2023-09-06 16:59:24,624 - modelscope - INFO - loading model from C:\Users\nsg.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface\pytorch_model.pt 2023-09-06 16:59:25,371 - modelscope - INFO - load model done 2023-09-06 16:59:27,373 - modelscope - INFO - Model revision not specified, use the latest revision: v1.0.0 2023-09-06 16:59:28,111 - modelscope - INFO - initiate model from C:\Users\nsg.cache\modelscope\hub\damo\cv_u2net_salient-detection 2023-09-06 16:59:28,112 - modelscope - INFO - initiate model from location C:\Users\nsg.cache\modelscope\hub\damo\cv_u2net_salient-detection. 2023-09-06 16:59:28,119 - modelscope - INFO - initialize model from C:\Users\nsg.cache\modelscope\hub\damo\cv_u2net_salient-detection 2023-09-06 16:59:28,741 - modelscope - WARNING - No preprocessor field found in cfg. 2023-09-06 16:59:28,741 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file. 2023-09-06 16:59:28,743 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\nsg\.cache\modelscope\hub\damo\cv_u2net_salient-detection'}. trying to build by task and model information. 2023-09-06 16:59:28,744 - modelscope - WARNING - No preprocessor key ('detection', 'semantic-segmentation') found in PREPROCESSOR_MAP, skip building preprocessor. 
2023-09-06 16:59:30,765 - modelscope - INFO - Use user-specified model revision: v1.0.2 2023-09-06 16:59:31,584 - modelscope - WARNING - ('PIPELINES', 'skin-retouching-torch', 'skin-retouching-torch') not found in ast index file 2023-09-06 16:59:31,585 - modelscope - INFO - initiate model from C:\Users\nsg.cache\modelscope\hub\damo\cv_unet_skin_retouching_torch 2023-09-06 16:59:31,588 - modelscope - INFO - initiate model from location C:\Users\nsg.cache\modelscope\hub\damo\cv_unet_skin_retouching_torch. 2023-09-06 16:59:31,597 - modelscope - WARNING - No preprocessor field found in cfg. 2023-09-06 16:59:31,597 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file. 2023-09-06 16:59:31,598 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\nsg\.cache\modelscope\hub\damo\cv_unet_skin_retouching_torch'}. trying to build by task and model information. 2023-09-06 16:59:31,599 - modelscope - WARNING - Find task: skin-retouching-torch, model type: None. Insufficient information to build preprocessor, skip building preprocessor 2023-09-06 16:59:34,006 - modelscope - INFO - Model revision not specified, use the latest revision: v2.0.2 2023-09-06 16:59:35,990 - modelscope - INFO - initiate model from C:\Users\nsg.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface 2023-09-06 16:59:35,991 - modelscope - INFO - initiate model from location C:\Users\nsg.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface. 2023-09-06 16:59:36,005 - modelscope - WARNING - No preprocessor field found in cfg. 2023-09-06 16:59:36,006 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file. 
2023-09-06 16:59:36,007 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\Users\nsg\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface'}. trying to build by task and model information. 2023-09-06 16:59:36,008 - modelscope - WARNING - Find task: face-detection, model type: None. Insufficient information to build preprocessor, skip building preprocessor 2023-09-06 16:59:36,014 - modelscope - INFO - loading model from C:\Users\nsg.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface\pytorch_model.pt 2023-09-06 16:59:36,752 - modelscope - INFO - load model done 2023-09-06 16:59:41,415 - modelscope - INFO - Model revision not specified, use the latest revision: v1.0.0 2023-09-06 16:59:42,207 - modelscope - INFO - initiate model from C:\Users\nsg.cache\modelscope\hub\damo\cv_gpen_image-portrait-enhancement 2023-09-06 16:59:42,208 - modelscope - INFO - initiate model from location C:\Users\nsg.cache\modelscope\hub\damo\cv_gpen_image-portrait-enhancement. 
2023-09-06 16:59:42,214 - modelscope - INFO - initialize model from C:\Users\nsg.cache\modelscope\hub\damo\cv_gpen_image-portrait-enhancement Loading ResNet ArcFace 2023-09-06 16:59:46,402 - modelscope - INFO - load face enhancer model done 2023-09-06 16:59:47,087 - modelscope - INFO - load face detector model done 2023-09-06 16:59:47,939 - modelscope - INFO - load sr model done 2023-09-06 16:59:49,521 - modelscope - INFO - load fqa model done 100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:06<00:00, 1.40s/it] selected paths: C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\original_backup\2.jpg total scores: 0.6295568212008975 face angles 0.9987909946784695 selected paths: C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\original_backup\0.jpg total scores: 0.620656322531957 face angles 0.9955321676581762 selected paths: C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\original_backup\4.jpg total scores: 0.6175696108051335 face angles 0.9642633823436466 selected paths: C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\original_backup\3.jpg total scores: 0.5971493601266059 face angles 0.9678671137462829 selected paths: C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\original_backup\1.jpg total scores: 0.19984603786853658 face angles 0.9990114183647643 jpg: 4.jpg face_id_scores 0.6175696108051335 jpg: 2.jpg face_id_scores 0.6295568212008975 jpg: 0.jpg face_id_scores 0.620656322531957 jpg: 3.jpg face_id_scores 0.5971493601266059 jpg: 1.jpg face_id_scores 0.19984603786853658 0it [00:00, ?it/s]2023-09-06 16:59:57,343 - modelscope - WARNING - task skin-retouching-torch input definition is missing 2023-09-06 17:00:02,053 - modelscope - WARNING - task skin-retouching-torch output keys are missing 3it [00:24, 8.01s/it] Exception in thread Thread-41 (preprocess_images): Traceback (most recent 
call last): File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner self.run() File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run self._target(*self._args, self._kwargs) File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\preprocess.py", line 125, in preprocess_images sub_image = Image.fromarray(cv2.cvtColor(portrait_enhancement(sub_image)[OutputKeys.OUTPUT_IMG], cv2.COLOR_BGR2RGB)) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\pipelines\base.py", line 219, in call output = self._process_single(input, args, kwargs) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\pipelines\base.py", line 247, in _process_single out = self.preprocess(input, preprocess_params) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\pipelines\cv\image_portrait_enhancement_pipeline.py", line 178, in preprocess img_sr = self.sr_process(img) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\pipelines\cv\image_portrait_enhancement_pipeline.py", line 161, in sr_process output = self.sr_model(img) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl return forward_call(args, kwargs) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\models\cv\super_resolution\rrdbnet_arch.py", line 123, in forward body_feat = self.conv_body(self.body(feat)) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl return forward_call(*args, kwargs) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\container.py", line 217, in forward input = module(input) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl return forward_call(*args, 
*kwargs) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\models\cv\super_resolution\rrdbnet_arch.py", line 63, in forward out = self.rdb1(x) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl return forward_call(args, kwargs) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\modelscope\models\cv\super_resolution\rrdbnet_arch.py", line 39, in forward x3 = self.lrelu(self.conv3(torch.cat((x, x1, x2), 1))) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl return forward_call(*args, kwargs) File "C:\Users\nsg\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 444, in network_Conv2d_forward return originals.Conv2d_forward(self, input) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward return self._conv_forward(input, self.weight, self.bias) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward return F.conv2d(input, weight, bias, self.stride, torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 322.00 MiB (GPU 0; 12.00 GiB total capacity; 10.89 GiB already allocated; 0 bytes free; 11.08 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya/train_lora.py The following values were not passed to accelerate launch and had defaults used instead: --num_processes was set to a value of 2 More than one GPU was found, enabling multi-GPU training. If this was unintended please pass in --num_processes=1. 
--num_machines was set to a value of 1 --dynamo_backend was set to a value of 'no' To avoid this warning pass in values for each of the problematic parameters or run accelerate config. NOTE: Redirects are currently not supported in Windows or MacOs. [W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - 在其上下文中,该请求的地址无效。). [W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - 在其上下文中,该请求的地址无效。). A matching Triton is not available, some optimizations will not be enabled. Error caught was: No module named 'triton' A matching Triton is not available, some optimizations will not be enabled. Error caught was: No module named 'triton' 2023-09-06 17:00:40,304 - modelscope - INFO - PyTorch version 2.0.1+cu118 Found. 2023-09-06 17:00:40,309 - modelscope - INFO - TensorFlow version 2.13.0 Found. 2023-09-06 17:00:40,309 - modelscope - INFO - Loading ast index from C:\Users\nsg.cache\modelscope\ast_indexer 2023-09-06 17:00:40,338 - modelscope - INFO - PyTorch version 2.0.1+cu118 Found. 2023-09-06 17:00:40,343 - modelscope - INFO - TensorFlow version 2.13.0 Found. 2023-09-06 17:00:40,344 - modelscope - INFO - Loading ast index from C:\Users\nsg.cache\modelscope\ast_indexer 2023-09-06 17:00:40,457 - modelscope - INFO - Loading done! Current index file version is 1.9.0, with md5 ebb1c3c0522899612853064e3129f6d1 and a total number of 921 components indexed 2023-09-06 17:00:40,464 - modelscope - INFO - Loading done! Current index file version is 1.9.0, with md5 ebb1c3c0522899612853064e3129f6d1 and a total number of 921 components indexed [W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - 在其上下文中,该请求的地址无效。). 
[W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - 在其上下文中,该请求的地址无效。). [W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - 在其上下文中,该请求的地址无效。). [W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [USER-20230706TY]:3456 (system error: 10049 - 在其上下文中,该请求的地址无效。). Traceback (most recent call last): File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 1394, in main() File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 803, in main accelerator = Accelerator( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\accelerator.py", line 358, in init self.state = AcceleratorState( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\state.py", line 720, in init PartialState(cpu, kwargs) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\state.py", line 192, in init torch.distributed.init_process_group(backend=self.backend, **kwargs) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 907, in init_process_group Traceback (most recent call last): File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 1394, in default_pg = _new_process_group_helper( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 1013, in _new_process_group_helper main() raise RuntimeError("Distributed package doesn't have NCCL " "built in") File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya\train_lora.py", line 803, in main

RuntimeError: Distributed package doesn't have NCCL built inaccelerator = Accelerator(

File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\accelerator.py", line 358, in init self.state = AcceleratorState( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\state.py", line 720, in init PartialState(cpu, kwargs) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\state.py", line 192, in init torch.distributed.init_process_group(backend=self.backend, kwargs) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 907, in init_process_group default_pg = _new_process_group_helper( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\distributed_c10d.py", line 1013, in _new_process_group_helper raise RuntimeError("Distributed package doesn't have NCCL " "built in") RuntimeError: Distributed package doesn't have NCCL built in ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 4880) of binary: C:\Users\nsg\stable-diffusion-webui\venv\Scripts\python.exe Traceback (most recent call last): File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code exec(code, run_globals) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\commands\launch.py", line 989, in main() File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\commands\launch.py", line 985, in main launch_command(args) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\commands\launch.py", line 970, in launch_command multi_gpu_launcher(args) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\accelerate\commands\launch.py", line 646, in multi_gpu_launcher distrib_run.run(args) File 
"C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\run.py", line 785, in run elastic_launch( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\launcher\api.py", line 134, in call return launch_agent(self._config, self._entrypoint, list(args)) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\torch\distributed\launcher\api.py", line 250, in launch_agent raise ChildFailedError( torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya/train_lora.py FAILED

Failures: [1]: time : 2023-09-06_17:00:46 host : USER-20230706TY rank : 1 (local_rank: 1) exitcode : 1 (pid: 9704) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure): [0]: time : 2023-09-06_17:00:46 host : USER-20230706TY rank : 0 (local_rank: 0) exitcode : 1 (pid: 4880) error_file: <N/A> traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Error executing the command: Command '['C:\Users\nsg\stable-diffusion-webui\venv\Scripts\python.exe', '-m', 'accelerate.commands.launch', '--mixed_precision=fp16', '--main_process_port=3456', 'C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\train_kohya/train_lora.py', '--pretrained_model_name_or_path=extensions\sd-webui-EasyPhoto\models\stable-diffusion-v1-5', '--pretrained_model_ckpt=models\Stable-diffusion\Chilloutmix-Ni-pruned-fp16-fix.safetensors', '--train_data_dir=outputs\easyphoto-user-id-infos\jacky-5\processed_images', '--caption_column=text', '--resolution=512', '--random_flip', '--train_batch_size=1', '--gradient_accumulation_steps=4', '--dataloader_num_workers=0', '--max_train_steps=800', '--checkpointing_steps=100', '--learning_rate=0.0001', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--train_text_encoder', '--seed=42', '--rank=128', '--network_alpha=64', '--validation_prompt=easyphoto_face, easyphoto, 1person', '--validation_steps=100', '--output_dir=outputs\easyphoto-user-id-infos\jacky-5\user_weights', '--logging_dir=outputs\easyphoto-user-id-infos\jacky-5\user_weights', '--enable_xformers_memory_efficient_attention', '--mixed_precision=fp16', '--template_dir=extensions\sd-webui-EasyPhoto\models\training_templates', '--template_mask', '--merge_best_lora_based_face_id', '--merge_best_lora_name=jacky-5']' returned non-zero exit status 1. 
Traceback (most recent call last): File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict output = await app.get_blocks().process_api( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1431, in process_api result = await self.call_function( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1103, in call_function prediction = await anyio.to_thread.run_sync( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync return await get_asynclib().run_sync_in_worker_thread( File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio_backends_asyncio.py", line 877, in run_sync_in_worker_thread return await future File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\anyio_backends_asyncio.py", line 807, in run result = context.run(func, args) File "C:\Users\nsg\stable-diffusion-webui\venv\lib\site-packages\gradio\utils.py", line 707, in wrapper response = f(args, **kwargs) File "C:\Users\nsg\stable-diffusion-webui\extensions\sd-webui-EasyPhoto\scripts\easyphoto_train.py", line 218, in easyphoto_train_forward copyfile(best_weight_path, webui_save_path) File "C:\Users\nsg\AppData\Local\Programs\Python\Python310\lib\shutil.py", line 254, in copyfile with open(src, 'rb') as fsrc: FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\nsg\stable-diffusion-webui\outputs/easyphoto-user-id-infos\jacky-5\user_weights\best_outputs/jacky-5.safetensors'

bubbliiiing commented 1 year ago

It looks like you are using NCCL, which is only used in multi-GPU setups. Could you restrict the webui to a single GPU when you launch it?
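A minimal sketch of the single-GPU suggestion, not the extension's actual launch code. It assumes the standard CUDA environment variable `CUDA_VISIBLE_DEVICES` is set before the webui (and therefore the training subprocess) starts, and it reuses the `--num_processes=1` flag that the accelerate warning in the log recommends. The helper names `single_gpu_env` and `single_process_launch_args` are hypothetical.

```python
import os


def single_gpu_env(base_env=None, gpu_index=0):
    """Return a copy of the environment that exposes only one GPU to CUDA.

    `gpu_index` is an assumption: pick whichever card you want to train on.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def single_process_launch_args(script, script_args=()):
    """Build arguments for `python -m accelerate.commands.launch` that force
    one process, as the accelerate warning in the log suggests."""
    return [
        "-m",
        "accelerate.commands.launch",
        "--num_processes=1",
        script,
        *script_args,
    ]


if __name__ == "__main__":
    # Only print what would be used; nothing is launched here.
    env = single_gpu_env()
    args = single_process_launch_args("train_lora.py", ["--mixed_precision=fp16"])
    print("CUDA_VISIBLE_DEVICES =", env["CUDA_VISIBLE_DEVICES"])
    print("python", " ".join(args))
```

With only one device visible, accelerate should no longer detect multiple GPUs, so it would not try to initialize the NCCL backend that this Windows build of PyTorch lacks.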

bubbliiiing commented 1 year ago

image_portrait_enhancement_pipeline

Hi, it looks like the GPU ran out of memory during super-resolution here. Could it be that the original images have a high resolution?
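If resolution is the cause, one workaround is to shrink the photos before uploading them. The sketch below only computes the target size; the `max_side=1024` limit is a hypothetical value, not a threshold taken from EasyPhoto, and the actual resizing would be done with whatever image library you already use.

```python
def fit_within(width, height, max_side=1024):
    """Scale (width, height) down so the longer side is at most `max_side`,
    keeping the aspect ratio. Images that already fit are left unchanged.

    `max_side` is an assumed limit; tune it to your GPU memory.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("width, height and max_side must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Multiply first, then divide, to limit floating-point error.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


if __name__ == "__main__":
    print(fit_within(4000, 3000))  # a large photo gets scaled down
    print(fit_within(800, 600))    # a small photo is returned as-is
```

Smaller inputs mean smaller intermediate tensors in the super-resolution model, which is where the `torch.cuda.OutOfMemoryError` in the log was raised.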

wuziheng commented 1 year ago

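@newstargo Could this be handled together with ISSUE30 and ISSUE31, with a prominent note added to the next README? PR20 has already fixed this problem; if you have time, please pull the latest code and test it. We will close this issue within 24 hours.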