Closed longglecc closed 6 months ago
In my case the cause was that TensorRT wasn't installed, but after installing TensorRT manually I still get the same problem.
Which torch and tf versions are you on?
But if it were a TensorRT problem, wouldn't execution stop there? My command line kept printing output afterwards; I couldn't tell where the problem occurred, and it just failed at the end.
I later installed xformers as well, but the error it reports is still the same.
Found the cause.
Is there an existing issue for this?
Is EasyPhoto the latest version?
What happened?
2024-03-15 08:14:15.067244000 [E:onnxruntime:Default, provider_bridge_ort.cc:1534 TryGetProviderInfo_TensorRT] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1209 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_tensorrt.so with error: libnvinfer.so.8: cannot open shared object file: No such file or directory
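For anyone hitting the same loader error: `libnvinfer.so.8` has to live in a directory the dynamic linker searches. As a rough self-check (hypothetical helper names, not part of EasyPhoto or onnxruntime), something like this shows whether the TensorRT libraries are actually reachable via `LD_LIBRARY_PATH`:

```python
import os

def linker_search_dirs(env=None):
    """Split LD_LIBRARY_PATH into the directories the dynamic
    linker searches before the system defaults."""
    env = os.environ if env is None else env
    return [d for d in env.get("LD_LIBRARY_PATH", "").split(":") if d]

def find_shared_lib(name, env=None):
    """Return the first LD_LIBRARY_PATH directory containing `name`,
    or None if the library is not on the path."""
    for d in linker_search_dirs(env):
        if os.path.isfile(os.path.join(d, name)):
            return d
    return None
```

If `find_shared_lib("libnvinfer.so.8")` returns None, export `LD_LIBRARY_PATH` to include the TensorRT `lib` directory (or install the matching `libnvinfer` packages) before launching the webui.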
EP Error EP Error /onnxruntime_src/onnxruntime/python/onnxruntime_pybind_state.cc:456 void onnxruntime::python::RegisterTensorRTPluginsAsCustomOps(onnxruntime::python::PySessionOptions&, const ProviderOptions&) Please install TensorRT libraries as mentioned in the GPU requirements page, make sure they're in the PATH or LD_LIBRARY_PATH, and that your GPU is supported. when using ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'] Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
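The "Falling back" message above is onnxruntime dropping execution providers it cannot load and retrying with the rest. The idea can be sketched like this (a hypothetical helper, not the actual onnxruntime internals; with a real session you would compare against `onnxruntime.get_available_providers()`):

```python
def pick_providers(preferred, available):
    """Keep the preferred execution providers that are actually
    available, preserving preference order; fall back to CPU if
    none of them can be loaded."""
    usable = [p for p in preferred if p in available]
    return usable or ["CPUExecutionProvider"]
```

With TensorRT missing, `pick_providers(["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"], ["CUDAExecutionProvider", "CPUExecutionProvider"])` yields exactly the `['CUDAExecutionProvider', 'CPUExecutionProvider']` list the log retries with.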
2024-03-15 08:14:15.180632383 [W:onnxruntime:, graph.cc:3593 CleanUnusedInitializersAndNodeArgs] Removing initializer 'Sub__1664:0'. It is not used by any node and should be removed from the model.
2024-03-15 08:14:15.180648438 [W:onnxruntime:, graph.cc:3593 CleanUnusedInitializersAndNodeArgs] Removing initializer 'Shape__1662:0'. It is not used by any node and should be removed from the model.
2024-03-15 08:14:15.180657188 [W:onnxruntime:, graph.cc:3593 CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_4__1660'. It is not used by any node and should be removed from the model.
2024-03-15 08:14:15.241504061 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: cond_1/coarse/Conv2d_transpose/BiasAdd to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.241513052 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: cond_1/coarse/Conv2d_transpose/BiasAdd
2024-03-15 08:14:15.241747033 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: cond_1/coarse/Conv2d_transpose_1/BiasAdd to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.241750607 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: cond_1/coarse/Conv2d_transpose_1/BiasAdd
2024-03-15 08:14:15.241983280 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: cond_1/coarse/Conv2d_transpose_2/BiasAdd to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.241986457 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: cond_1/coarse/Conv2d_transpose_2/BiasAdd
2024-03-15 08:14:15.242220288 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: cond_1/coarse/Conv2d_transpose_3/BiasAdd to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.242223572 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: cond_1/coarse/Conv2d_transpose_3/BiasAdd
2024-03-15 08:14:15.242406062 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: cond_1/coarse/Conv2d_transpose_4/BiasAdd to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.242409303 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: cond_1/coarse/Conv2d_transpose_4/BiasAdd
2024-03-15 08:14:15.242623624 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: cond_1/refine/up_conv1/conv2d_transpose to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.242627156 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: cond_1/refine/up_conv1/conv2d_transpose
2024-03-15 08:14:15.242634661 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: cond_1/refine/up_conv2/conv2d_transpose to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.242637447 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: cond_1/refine/up_conv2/conv2d_transpose
2024-03-15 08:14:15.242644868 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: cond_1/refine/up_conv3/conv2d_transpose to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.242647795 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: cond_1/refine/up_conv3/conv2d_transpose
2024-03-15 08:14:15.242655069 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: cond_1/refine/up_conv4/conv2d_transpose to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.242657894 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: cond_1/refine/up_conv4/conv2d_transpose
2024-03-15 08:14:15.242666008 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: cond_1/refine/up_conv5/conv2d_transpose to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.242669226 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: cond_1/refine/up_conv5/conv2d_transpose
2024-03-15 08:14:15.245905512 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: detect/Conv2d_transpose/BiasAdd to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.245909776 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: detect/Conv2d_transpose/BiasAdd
2024-03-15 08:14:15.246154186 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: detect/Conv2d_transpose_1/BiasAdd to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.246157716 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: detect/Conv2d_transpose_1/BiasAdd
2024-03-15 08:14:15.246398343 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: detect/Conv2d_transpose_2/BiasAdd to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.246401448 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: detect/Conv2d_transpose_2/BiasAdd
2024-03-15 08:14:15.246635291 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: detect/Conv2d_transpose_3/BiasAdd to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.246638579 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: detect/Conv2d_transpose_3/BiasAdd
2024-03-15 08:14:15.246817512 [W:onnxruntime:, cuda_execution_provider.cc:2319 ConvTransposeNeedFallbackToCPU] Dropping the ConvTranspose node: detect/Conv2d_transpose_4/BiasAdd to CPU because it requires asymmetric padding which the CUDA EP currently does not support
2024-03-15 08:14:15.246820719 [W:onnxruntime:, cuda_execution_provider.cc:2426 GetCapability] CUDA kernel not supported. Fallback to CPU execution provider for Op type: ConvTranspose node name: detect/Conv2d_transpose_4/BiasAdd
2024-03-15 08:14:15.282360569 [W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 21 Memcpy nodes are added to the graph tf2onnx for CUDAExecutionProvider. It might have negative impact on performance (including unable to run CUDA graph). Set session_options.log_severity_level=1 to see the detail logs before this message.
2024-03-15 08:14:15.288723331 [W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 29 Memcpy nodes are added to the graph tf2onnx__223 for CUDAExecutionProvider. It might have negative impact on performance (including unable to run CUDA graph). Set session_options.log_severity_level=1 to see the detail logs before this message.
2024-03-15 08:14:15.289002931 [W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 5 Memcpy nodes are added to the graph tf2onnx__547 for CUDAExecutionProvider. It might have negative impact on performance (including unable to run CUDA graph). Set session_options.log_severity_level=1 to see the detail logs before this message.
2024-03-15 08:14:15.289262372 [W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 5 Memcpy nodes are added to the graph tf2onnx__485 for CUDAExecutionProvider. It might have negative impact on performance (including unable to run CUDA graph). Set session_options.log_severity_level=1 to see the detail logs before this message.
2024-03-15 08:14:16,137 - modelscope - INFO - Use user-specified model revision: v1.0.0
2024-03-15 08:14:16,380 - modelscope - INFO - initiate model from /home/gaol/.cache/modelscope/hub/damo/cv_gpen_image-portrait-enhancement
2024-03-15 08:14:16,380 - modelscope - INFO - initiate model from location /home/gaol/.cache/modelscope/hub/damo/cv_gpen_image-portrait-enhancement.
2024-03-15 08:14:16,381 - modelscope - INFO - initialize model from /home/gaol/.cache/modelscope/hub/damo/cv_gpen_image-portrait-enhancement
Loading ResNet ArcFace
2024-03-15 08:14:18,036 - modelscope - INFO - load face enhancer model done
2024-03-15 08:14:18,283 - modelscope - INFO - load face detector model done
2024-03-15 08:14:18,526 - modelscope - INFO - load sr model done
2024-03-15 08:14:19,245 - modelscope - INFO - load fqa model done
  0%|          | 0/15 [00:00<?, ?it/s]
2024-03-15 08:14:19,686 - modelscope - WARNING - task skin-retouching-torch input definition is missing
2024-03-15 08:14:23,763 - modelscope - WARNING - task skin-retouching-torch output keys are missing
2024-03-15 08:14:23,767 - modelscope - WARNING - task face_recognition input definition is missing
2024-03-15 08:14:23,911 - modelscope - INFO - model inference done
2024-03-15 08:14:23,911 - modelscope - WARNING - task face_recognition output keys are missing
  7%|▋         | 1/15 [00:04<01:04, 4.64s/it]
2024-03-15 08:14:24,554 - modelscope - INFO - model inference done
 13%|█▎        | 2/15 [00:05<00:29, 2.29s/it]
2024-03-15 08:14:25,148 - modelscope - INFO - model inference done
 20%|██        | 3/15 [00:05<00:18, 1.51s/it]
2024-03-15 08:14:25,792 - modelscope - INFO - model inference done
 27%|██▋       | 4/15 [00:06<00:12, 1.17s/it]
2024-03-15 08:14:26,642 - modelscope - INFO - model inference done
 33%|███▎      | 5/15 [00:07<00:10, 1.06s/it]
2024-03-15 08:14:27,205 - modelscope - INFO - model inference done
 40%|████      | 6/15 [00:07<00:07, 1.13it/s]
2024-03-15 08:14:27,831 - modelscope - INFO - model inference done
 47%|████▋     | 7/15 [00:08<00:06, 1.25it/s]
2024-03-15 08:14:28,921 - modelscope - INFO - model inference done
 53%|█████▎    | 8/15 [00:09<00:06, 1.12it/s]
2024-03-15 08:14:29,875 - modelscope - INFO - model inference done
 60%|██████    | 9/15 [00:10<00:05, 1.10it/s]
2024-03-15 08:14:30,772 - modelscope - INFO - model inference done
 67%|██████▋   | 10/15 [00:11<00:04, 1.10it/s]
2024-03-15 08:14:31,635 - modelscope - INFO - model inference done
 73%|███████▎  | 11/15 [00:12<00:03, 1.12it/s]
2024-03-15 08:14:32,752 - modelscope - INFO - model inference done
 80%|████████  | 12/15 [00:13<00:02, 1.04it/s]
2024-03-15 08:14:33,692 - modelscope - INFO - model inference done
 87%|████████▋ | 13/15 [00:14<00:01, 1.05it/s]
2024-03-15 08:14:34,291 - modelscope - INFO - model inference done
 93%|█████████▎| 14/15 [00:15<00:00, 1.18it/s]
2024-03-15 08:14:34,920 - modelscope - INFO - model inference done
100%|██████████| 15/15 [00:15<00:00, 1.04s/it]
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/0.jpg total scores: 0.5048029519210812 face angles 0.9206686066500163
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/12.jpg total scores: 0.49235572689416074 face angles 0.9208677161732743
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/6.jpg total scores: 0.4757991017879283 face angles 0.9655199974287817
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/2.jpg total scores: 0.47179465965092976 face angles 0.9158875840641704
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/13.jpg total scores: 0.46459175877098685 face angles 0.9542150597199701
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/4.jpg total scores: 0.4373414103114538 face angles 0.8650355070487793
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/7.jpg total scores: 0.4354206394466526 face angles 0.9829593007625231
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/11.jpg total scores: 0.4197791398965481 face angles 0.7875377991892176
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/8.jpg total scores: 0.41548011881379737 face angles 0.9469831125519275
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/5.jpg total scores: 0.39574758521373066 face angles 0.9770553817546423
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/14.jpg total scores: 0.38421667143713195 face angles 0.9993536825291083
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/10.jpg total scores: 0.3491402003211737 face angles 0.739845540721069
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/9.jpg total scores: 0.3327516258991929 face angles 0.8863734503059482
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/1.jpg total scores: 0.31481701863111694 face angles 0.6169483135101612
selected paths: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/original_backup/3.jpg total scores: 0.3134098934345057 face angles 0.8504926798977891
jpg: 0.jpg face_id_scores 0.5048029519210812
jpg: 12.jpg face_id_scores 0.49235572689416074
jpg: 11.jpg face_id_scores 0.4197791398965481
jpg: 2.jpg face_id_scores 0.47179465965092976
jpg: 1.jpg face_id_scores 0.31481701863111694
jpg: 4.jpg face_id_scores 0.4373414103114538
jpg: 6.jpg face_id_scores 0.4757991017879283
jpg: 13.jpg face_id_scores 0.46459175877098685
jpg: 10.jpg face_id_scores 0.3491402003211737
jpg: 7.jpg face_id_scores 0.4354206394466526
jpg: 8.jpg face_id_scores 0.41548011881379737
jpg: 5.jpg face_id_scores 0.39574758521373066
jpg: 14.jpg face_id_scores 0.38421667143713195
jpg: 9.jpg face_id_scores 0.3327516258991929
jpg: 3.jpg face_id_scores 0.3134098934345057
15it [00:12, 1.16it/s]
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/0.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/1.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/2.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/3.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/4.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/5.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/6.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/7.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/8.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/9.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/10.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/11.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/12.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/13.jpg
save processed image to /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-user-id-infos/lifan/processed_images/train/14.jpg
2024-03-15 08:14:48,471 - EasyPhoto - train_file_path : /home/gaol/codes/temp/stable-diffusion-webui/extensions/sd-webui-EasyPhoto/scripts/train_kohya/train_lora.py
2024-03-15 08:14:48,472 - EasyPhoto - cache_log_file_path: /home/gaol/codes/temp/stable-diffusion-webui/outputs/easyphoto-tmp/train_kohya_log.txt
2024-03-15 08:14:53,567 - modelscope - INFO - PyTorch version 2.0.1+cu118 Found.
2024-03-15 08:14:53,569 - modelscope - INFO - TensorFlow version 2.16.1 Found.
2024-03-15 08:14:53,569 - modelscope - INFO - Loading ast index from /home/gaol/.cache/modelscope/ast_indexer
2024-03-15 08:14:53,585 - modelscope - INFO - Loading done! Current index file version is 1.9.3, with md5 985d60ab3829178ada728d5649a2ffda and a total number of 943 components indexed
03/15/2024 08:14:54 - INFO - __main__ - Distributed environment: MULTI_GPU
Backend: nccl
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda:0
Mixed precision type: fp16
{'variance_type', 'prediction_type', 'thresholding', 'clip_sample_range', 'timestep_spacing', 'sample_max_value', 'dynamic_thresholding_ratio'} was not found in config. Values will be initialized to default values.
UNet2DConditionModel: 64, 8, 768, False, False
loading u-net:
loading vae:
loading text encoder:
create LoRA network. base dim (rank): 128, alpha: 64
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder:
create LoRA for Text Encoder: 72 modules.
create LoRA for U-Net: 192 modules.
enable LoRA for text encoder
enable LoRA for U-Net
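For reference, the `base dim (rank): 128, alpha: 64` line above means the LoRA update is scaled by alpha/rank = 0.5 before being added to the frozen weights; a minimal sketch of that relationship:

```python
def lora_scale(alpha, rank):
    """LoRA applies W + (alpha / rank) * (up @ down), so alpha acts
    as a rank-independent knob on the update magnitude."""
    return alpha / rank

# With the values from this log (alpha=64, rank=128) the scale is 0.5.
print(lora_scale(64, 128))  # -> 0.5
```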
/home/gaol/codes/temp/stable-diffusion-webui/extensions/sd-webui-EasyPhoto/scripts/train_kohya/train_lora.py:792: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn("xformers is not available. Make sure it is installed correctly")
03/15/2024 08:15:00 - WARNING - __main__ - xformers is not available. Make sure it is installed correctly
Resolving data files: 100%|โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ| 31/31 [00:00<00:00, 449908.04it/s]
Downloading and preparing dataset imagefolder/default to /home/gaol/.cache/huggingface/datasets/imagefolder/default-f3c5867687810c1c/0.0.0/37fbb85cc714a338bea574ac6c7d0b5be5aff46c1862c1989b20e0771199e93f...
Downloading data files: 100%|โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ| 16/16 [00:00<00:00, 101372.91it/s]
Downloading data files: 100%|โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ| 15/15 [00:00<00:00, 124337.08it/s]
Extracting data files: 100%|โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ| 15/15 [00:00<00:00, 6935.79it/s]
Dataset imagefolder downloaded and prepared to /home/gaol/.cache/huggingface/datasets/imagefolder/default-f3c5867687810c1c/0.0.0/37fbb85cc714a338bea574ac6c7d0b5be5aff46c1862c1989b20e0771199e93f. Subsequent calls will reuse this data.
100%|โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ| 1/1 [00:00<00:00, 1409.85it/s]
/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 16 worker processes in total. Our suggested max number of worker in current system is 12, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
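The DataLoader warning above is harmless but easy to silence: clamp the worker count to the machine's CPU count (12 on this box) instead of the hard-coded 16. A sketch with a hypothetical helper, not EasyPhoto's actual configuration code:

```python
import os

def cap_num_workers(requested):
    """Clamp a requested DataLoader worker count to the number of
    CPUs visible to this process, avoiding oversubscription."""
    cpus = os.cpu_count() or 1
    return min(requested, cpus)

# e.g. pass cap_num_workers(16) as num_workers= when building the DataLoader
```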
03/15/2024 08:15:02 - INFO - __main__ - Running training
03/15/2024 08:15:02 - INFO - __main__ - Num examples = 15
03/15/2024 08:15:02 - INFO - __main__ - Num Epochs = 3000
03/15/2024 08:15:02 - INFO - __main__ - Instantaneous batch size per device = 8
03/15/2024 08:15:02 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 32
03/15/2024 08:15:02 - INFO - __main__ - Gradient Accumulation steps = 4
03/15/2024 08:15:02 - INFO - __main__ - Total optimization steps = 3000
Steps:   0%|          | 0/3000 [00:00<?, ?it/s]
2024-03-15 08:15:03,352 - modelscope - INFO - Model revision not specified, use revision: v2.0.2
2024-03-15 08:15:05,546 - modelscope - INFO - initiate model from /home/gaol/.cache/modelscope/hub/damo/cv_resnet50_face-detection_retinaface
2024-03-15 08:15:05,546 - modelscope - INFO - initiate model from location /home/gaol/.cache/modelscope/hub/damo/cv_resnet50_face-detection_retinaface.
2024-03-15 08:15:05,547 - modelscope - WARNING - No preprocessor field found in cfg.
2024-03-15 08:15:05,547 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2024-03-15 08:15:05,547 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/home/gaol/.cache/modelscope/hub/damo/cv_resnet50_face-detection_retinaface'}. trying to build by task and model information.
2024-03-15 08:15:05,547 - modelscope - WARNING - Find task: face-detection, model type: None. Insufficient information to build preprocessor, skip building preprocessor
2024-03-15 08:15:05,548 - modelscope - INFO - loading model from /home/gaol/.cache/modelscope/hub/damo/cv_resnet50_face-detection_retinaface/pytorch_model.pt
/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or
return tree_unflatten([fn(i) for i in flat_args], spec)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_pytree.py", line 247, in inner
return f(x)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1212, in validate
raise Exception(
Exception: Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten._to_copy.default( (Parameter containing:
tensor([[-0.0138, -0.0359, -0.0290, ..., 0.0133, 0.0205, -0.0051],
[-0.0148, -0.0244, 0.0315, ..., 0.0310, -0.0247, 0.0210],
[-0.0090, -0.0062, -0.0061, ..., 0.0337, 0.0276, 0.0345],
...,
[-0.0216, -0.0222, 0.0058, ..., 0.0171, 0.0139, 0.0286],
[-0.0131, -0.0117, 0.0049, ..., 0.0252, 0.0084, -0.0211],
[-0.0176, -0.0148, 0.0318, ..., 0.0353, -0.0111, -0.0319]],
device='cuda:0', requires_grad=True),), **{'dtype': torch.float16})
None
for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passingweights=None
. warnings.warn(msg) 2024-03-15 08:15:05,796 - modelscope - INFO - load model done /home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py:459: UserWarning: Applied workaround for CuDNN issue, install nvrtc.so (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:80.) return F.conv2d(input, weight, bias, self.stride, /home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 16 worker processes in total. Our suggested max number of worker in current system is 12, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary. warnings.warn(_create_warning_msg( [2024-03-15 08:15:10,905] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing forward [2024-03-15 08:15:11,105] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing call [2024-03-15 08:15:11,107] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing forward [2024-03-15 08:15:12,439] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo done tracing forward (RETURN_VALUE) [2024-03-15 08:15:12,456] torch._dynamo.output_graph: [INFO] Step 2: calling compiler function debug_wrapper Traceback (most recent call last): File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 670, in call_user_compiler compiled_fn = compiler_fn(gm, self.fake_example_inputs()) File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/debug_utils.py", line 1055, in debug_wrapper compiled_gm = compiler_fn(gm, example_inputs) File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/init.py", line 1390, in call 
return compile_fx(model_, inputs_, config_patches=self.config)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 455, in compile_fx
return aot_autograd(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/backends/common.py", line 48, in compiler_fn
cg = aot_module_simplified(gm, example_inputs, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 2822, in aot_module_simplified
compiled_fn = create_aot_dispatcher_function(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 163, in time_wrapper
r = func(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 2515, in create_aot_dispatcher_function
compiled_fn = compiler_fn(flat_fn, fake_flat_args, aot_config)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 1676, in aot_wrapper_dedupe
fw_metadata, _out = run_functionalized_fw_and_collect_metadata(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 607, in inner
flat_f_outs = f(*flat_f_args)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_functorch/aot_autograd.py", line 2793, in functional_call
out = Interpreter(mod).run(*args[params_len:], **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/fx/interpreter.py", line 136, in run
self.env[node] = self.run_node(node)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/fx/interpreter.py", line 177, in run_node
return getattr(self, n.op)(n.target, args, kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/fx/interpreter.py", line 294, in call_module
return submod(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/extensions/sd-webui-EasyPhoto/scripts/train_kohya/utils/lora_utils.py", line 140, in forward
lx = self.lora_down(x)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_inductor/overrides.py", line 38, in __torch_function__
return func(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_stats.py", line 20, in wrapper
return fn(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 987, in __torch_dispatch__
return self.dispatch(func, types, args, kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1066, in dispatch
args, kwargs = self.validate_and_convert_non_fake_tensors(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_subclasses/fake_tensor.py", line 1220, in validate_and_convert_non_fake_tensors
return tree_map_only(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_pytree.py", line 266, in tree_map_only
return tree_map(map_only(ty)(fn), pytree)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_pytree.py", line 196, in tree_map
return tree_unflatten([fn(i) for i in flat_args], spec)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/utils/_pytree.py", line 196, in
While executing %self_text_model_encoder_layers_0_self_attn_q_proj : [#users=1] = call_module[target=self_text_model_encoder_layers_0_self_attn_q_proj](args = (%self_text_model_encoder_layers_0_layer_norm1,), kwargs = {})
Original traceback:
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 272, in forward
query_states = self.q_proj(hidden_states) * self.scale
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 383, in forward
hidden_states, attn_weights = self.self_attn(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 654, in forward
layer_outputs = encoder_layer(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 740, in forward
encoder_outputs = self.encoder(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 822, in forward
return self.text_model(
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/gaol/codes/temp/stable-diffusion-webui/extensions/sd-webui-EasyPhoto/scripts/train_kohya/train_lora.py", line 1397, in <module>
main()
File "/home/gaol/codes/temp/stable-diffusion-webui/extensions/sd-webui-EasyPhoto/scripts/train_kohya/utils/gpu_info.py", line 190, in wrapper
result = func(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/extensions/sd-webui-EasyPhoto/scripts/train_kohya/train_lora.py", line 1132, in main
encoder_hidden_states = text_encoder(batch["input_ids"])[0]
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 82, in forward
return self.dynamo_ctx(self._orig_mod.forward)(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 209, in _fn
return fn(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/accelerate/utils/operations.py", line 581, in forward
return model_forward(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/accelerate/utils/operations.py", line 569, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
return func(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 337, in catch_errors
return callback(frame, cache_size, hooks)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 404, in _convert_frame
result = inner_convert(frame, cache_size, hooks)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 104, in _fn
return fn(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 262, in _convert_frame_assert
return _compile(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 163, in time_wrapper
r = func(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 324, in _compile
out_code = transform_code_object(code, transform)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 445, in transform_code_object
transformations(instructions, code_options)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 311, in transform
tracer.run()
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1726, in run
super().run()
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 576, in run
and self.step()
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 540, in step
getattr(self, inst.opname)(inst)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1792, in RETURN_VALUE
self.output.compile_subgraph(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 541, in compile_subgraph
self.compile_and_call_fx_graph(tx, pass2.graph_output_vars(), root)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 588, in compile_and_call_fx_graph
compiled_fn = self.call_user_compiler(gm)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 163, in time_wrapper
r = func(*args, **kwargs)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 675, in call_user_compiler
raise BackendCompilerFailed(self.compiler_fn, e) from e
torch._dynamo.exc.BackendCompilerFailed: debug_wrapper raised Exception: Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten._to_copy.default((Parameter containing:
tensor([[-0.0138, -0.0359, -0.0290, ..., 0.0133, 0.0205, -0.0051],
[-0.0148, -0.0244, 0.0315, ..., 0.0310, -0.0247, 0.0210],
[-0.0090, -0.0062, -0.0061, ..., 0.0337, 0.0276, 0.0345],
...,
[-0.0216, -0.0222, 0.0058, ..., 0.0171, 0.0139, 0.0286],
[-0.0131, -0.0117, 0.0049, ..., 0.0252, 0.0084, -0.0211],
[-0.0176, -0.0148, 0.0318, ..., 0.0353, -0.0111, -0.0319]],
device='cuda:0', requires_grad=True),), **{'dtype': torch.float16})
While executing %self_text_model_encoder_layers_0_self_attn_q_proj : [#users=1] = call_module[target=self_text_model_encoder_layers_0_self_attn_q_proj](args = (%self_text_model_encoder_layers_0_layer_norm1,), kwargs = {})
Original traceback:
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 272, in forward
query_states = self.q_proj(hidden_states) * self.scale
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 383, in forward
hidden_states, attn_weights = self.self_attn(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 654, in forward
layer_outputs = encoder_layer(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 740, in forward
encoder_outputs = self.encoder(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 822, in forward
return self.text_model(
Set torch._dynamo.config.verbose=True for more information
You can suppress this exception and fall back to eager by setting: torch._dynamo.config.suppress_errors = True
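For reference, a minimal sketch of the eager fallback the error message itself suggests (assuming torch >= 2.0; the config attribute is torch's, but `text_encoder` below is a placeholder name, and whether this is the right fix for EasyPhoto's training script is an assumption):

```python
# Sketch only: fall back to eager execution when the inductor backend fails,
# as the BackendCompilerFailed message above suggests. Assumes torch >= 2.0.
import torch._dynamo

# Dynamo logs the backend error and runs the model uncompiled instead of
# raising BackendCompilerFailed.
torch._dynamo.config.suppress_errors = True

# Alternatively, keep the offending module out of compilation entirely, e.g.
# the CLIP text encoder whose fp16 parameter conversion triggers the
# FakeTensor error (`text_encoder` is hypothetical here):
# text_encoder.forward = torch._dynamo.disable(text_encoder.forward)
```

This must run before the `torch.compile` / `torch._dynamo.optimize` call takes effect; it only hides the compile failure rather than fixing the fp16/FakeTensor mismatch.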
Steps: 0%| | 0/3000 [00:11<?, ?it/s]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 5161) of binary: /home/gaol/codes/temp/stable-diffusion-webui/venv/bin/python3
Traceback (most recent call last):
File "/home/gaol/miniforge3/envs/stable/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/gaol/miniforge3/envs/stable/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 989, in <module>
main()
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 985, in main
launch_command(args)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 970, in launch_command
multi_gpu_launcher(args)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 646, in multi_gpu_launcher
distrib_run.run(args)
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/gaol/codes/temp/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
/home/gaol/codes/temp/stable-diffusion-webui/extensions/sd-webui-EasyPhoto/scripts/train_kohya/train_lora.py FAILED
Failures: