modelscope / facechain

FaceChain is a deep-learning toolchain for generating your Digital-Twin.
Apache License 2.0

Training Failed - implementation for device cuda:0 not found. #465

Status: Closed (BlazeCodeDev closed this 5 months ago)

BlazeCodeDev commented 11 months ago
** Setting base model to SD1.5 **
--------uuid:  qw
----------work_dir:  C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\extensions\facechain\worker_data\qw\ly261666/cv_portrait_model\person1
2023-12-07 10:01:02,344 - modelscope - INFO - Use user-specified model revision: v1.0.0
2023-12-07 10:01:04,752 - modelscope - INFO - Use user-specified model revision: v1.0.0
2023-12-07 10:01:07,119 - modelscope - INFO - Use user-specified model revision: v1.0.0
2023-12-07 10:01:09,272 - modelscope - INFO - Use user-specified model revision: v1.0.0
2023-12-07 10:01:12,438 - modelscope - INFO - Use user-specified model revision: v1.0.0
2023-12-07 10:01:15,041 - modelscope - INFO - Use user-specified model revision: v1.0.0
2023-12-07 10:01:17,431 - modelscope - INFO - Use user-specified model revision: v1.0.0
2023-12-07 10:01:19,965 - modelscope - INFO - Use user-specified model revision: v1.0.0
2023-12-07 10:01:22,694 - modelscope - INFO - Use user-specified model revision: v1.0.0
2023-12-07 10:01:24,933 - modelscope - INFO - Use user-specified model revision: v1.0.0
bin C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\venv\Lib\site-packages\bitsandbytes\libbitsandbytes_cuda118.dll
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
2023-12-07 10:02:05,052 - modelscope - INFO - PyTorch version 2.0.1+cu118 Found.
2023-12-07 10:02:05,056 - modelscope - INFO - Loading ast index from C:\Users\ralfl\.cache\modelscope\ast_indexer
2023-12-07 10:02:05,218 - modelscope - INFO - Loading done! Current index file version is 1.9.5, with md5 d5bd6a17a97bfa66f514e27ec01b558b and a total number of 945 components indexed
2023-12-07 10:02:12,635 - modelscope - INFO - Use user-specified model revision: v4.0
2023-12-07 10:02:17,100 - modelscope - INFO - Use user-specified model revision: v1.0.1
2023-12-07 10:02:17,757 - modelscope - WARNING - ('PIPELINES', 'skin-retouching-torch', 'skin-retouching-torch') not found in ast index file
2023-12-07 10:02:17,757 - modelscope - INFO - initiate model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_unet_skin_retouching_torch
2023-12-07 10:02:17,758 - modelscope - INFO - initiate model from location C:\Users\ralfl\.cache\modelscope\hub\damo\cv_unet_skin_retouching_torch.
2023-12-07 10:02:17,763 - modelscope - WARNING - No preprocessor field found in cfg.
2023-12-07 10:02:17,763 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-12-07 10:02:17,763 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\\Users\\ralfl\\.cache\\modelscope\\hub\\damo\\cv_unet_skin_retouching_torch'}. trying to build by task and model information.
2023-12-07 10:02:17,764 - modelscope - WARNING - Find task: skin-retouching-torch, model type: None. Insufficient information to build preprocessor, skip building preprocessor
2023-12-07 10:02:22,111 - modelscope - WARNING - Model revision not specified, use revision: v2.0.2
2023-12-07 10:02:24,499 - modelscope - INFO - initiate model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface
2023-12-07 10:02:24,499 - modelscope - INFO - initiate model from location C:\Users\ralfl\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface.
2023-12-07 10:02:24,507 - modelscope - WARNING - No preprocessor field found in cfg.
2023-12-07 10:02:24,507 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-12-07 10:02:24,508 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\\Users\\ralfl\\.cache\\modelscope\\hub\\damo\\cv_resnet50_face-detection_retinaface'}. trying to build by task and model information.
2023-12-07 10:02:24,510 - modelscope - WARNING - Find task: face-detection, model type: None. Insufficient information to build preprocessor, skip building preprocessor
2023-12-07 10:02:24,522 - modelscope - INFO - loading model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_resnet50_face-detection_retinaface\pytorch_model.pt
2023-12-07 10:02:24,850 - modelscope - INFO - load model done
2023-12-07 10:02:26.4367882 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2649'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.4424361 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2644'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.4476586 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2647'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.4539665 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2658'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.4603276 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2648'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.4668787 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2657'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.4728854 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2653'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.4781193 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2652'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.4847790 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2645'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.4901577 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2643'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.4960393 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2641'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.5011255 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2633'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.5079757 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2632'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.5151724 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2624'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.5207215 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2614'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.5265524 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2613'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.5354438 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2606'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.5445020 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2598'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.5495662 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2596'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:26.5561368 [W:onnxruntime:, graph.cc:3543 onnxruntime::Graph::CleanUnusedInitializersAndNodeArgs] Removing initializer 'const_fold_opt__2594'. It is not used by any node and should be removed from the model.
2023-12-07 10:02:29,728 - modelscope - INFO - Use user-specified model revision: v1.1
2023-12-07 10:02:30,327 - modelscope - INFO - initiate model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd
2023-12-07 10:02:30,328 - modelscope - INFO - initiate model from location C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd.
2023-12-07 10:02:30,329 - modelscope - INFO - initialize model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd
2023-12-07 10:02:36,244 - mmcv - INFO - initialize PAFPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-12-07 10:02:36,245 - mmcv - INFO -
lateral_convs.0.conv.weight - torch.Size([16, 64, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:36,246 - mmcv - INFO -
lateral_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:36,246 - mmcv - INFO -
lateral_convs.1.conv.weight - torch.Size([16, 120, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:36,246 - mmcv - INFO -
lateral_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:36,246 - mmcv - INFO -
lateral_convs.2.conv.weight - torch.Size([16, 160, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:36,246 - mmcv - INFO -
lateral_convs.2.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:36,247 - mmcv - INFO -
fpn_convs.0.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:36,247 - mmcv - INFO -
fpn_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:36,247 - mmcv - INFO -
fpn_convs.1.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:36,247 - mmcv - INFO -
fpn_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:36,247 - mmcv - INFO -
fpn_convs.2.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:36,248 - mmcv - INFO -
fpn_convs.2.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:36,248 - mmcv - INFO -
downsample_convs.0.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:36,248 - mmcv - INFO -
downsample_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:36,248 - mmcv - INFO -
downsample_convs.1.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:36,248 - mmcv - INFO -
downsample_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:36,248 - mmcv - INFO -
pafpn_convs.0.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:36,248 - mmcv - INFO -
pafpn_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:36,249 - mmcv - INFO -
pafpn_convs.1.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:36,249 - mmcv - INFO -
pafpn_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:36,249 - modelscope - INFO - loading model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd\pytorch_model.pt
load checkpoint from local path: C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd\pytorch_model.pt
2023-12-07 10:02:36,551 - modelscope - INFO - load model done
2023-12-07 10:02:40,268 - modelscope - INFO - Use user-specified model revision: v1.0.1
2023-12-07 10:02:40,857 - modelscope - INFO - initiate model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_resnet101_image-multiple-human-parsing
2023-12-07 10:02:40,858 - modelscope - INFO - initiate model from location C:\Users\ralfl\.cache\modelscope\hub\damo\cv_resnet101_image-multiple-human-parsing.
2023-12-07 10:02:40,861 - modelscope - INFO - initialize model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_resnet101_image-multiple-human-parsing
2023-12-07 10:02:41,172 - modelscope - INFO - loading model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_resnet101_image-multiple-human-parsing\pytorch_model.pt
2023-12-07 10:02:41,622 - modelscope - INFO - criterion.empty_weight doesn't exist in current model, skip loading.
2023-12-07 10:02:41,656 - modelscope - INFO - load model done
2023-12-07 10:02:41,682 - modelscope - WARNING - No preprocessor field found in cfg.
2023-12-07 10:02:41,682 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-12-07 10:02:41,683 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\\Users\\ralfl\\.cache\\modelscope\\hub\\damo\\cv_resnet101_image-multiple-human-parsing'}. trying to build by task and model information.
2023-12-07 10:02:41,683 - modelscope - WARNING - No preprocessor key ('m2fp', 'image-segmentation') found in PREPROCESSOR_MAP, skip building preprocessor.
2023-12-07 10:02:44,344 - modelscope - INFO - Use user-specified model revision: v2.0.2
2023-12-07 10:02:45,987 - modelscope - INFO - initiate model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_resnet34_face-attribute-recognition_fairface
2023-12-07 10:02:45,987 - modelscope - INFO - initiate model from location C:\Users\ralfl\.cache\modelscope\hub\damo\cv_resnet34_face-attribute-recognition_fairface.
2023-12-07 10:02:45,993 - modelscope - WARNING - No preprocessor field found in cfg.
2023-12-07 10:02:45,993 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-12-07 10:02:45,993 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\\Users\\ralfl\\.cache\\modelscope\\hub\\damo\\cv_resnet34_face-attribute-recognition_fairface'}. trying to build by task and model information.
2023-12-07 10:02:45,993 - modelscope - WARNING - Find task: face-attribute-recognition, model type: None. Insufficient information to build preprocessor, skip building preprocessor
2023-12-07 10:02:50,930 - modelscope - WARNING - Model revision not specified, use revision: v1.1
2023-12-07 10:02:51,426 - modelscope - INFO - initiate model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd
2023-12-07 10:02:51,426 - modelscope - INFO - initiate model from location C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd.
2023-12-07 10:02:51,428 - modelscope - INFO - initialize model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd
2023-12-07 10:02:51,450 - mmcv - INFO - initialize PAFPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-12-07 10:02:51,451 - mmcv - INFO -
lateral_convs.0.conv.weight - torch.Size([16, 64, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:51,451 - mmcv - INFO -
lateral_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:51,451 - mmcv - INFO -
lateral_convs.1.conv.weight - torch.Size([16, 120, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:51,451 - mmcv - INFO -
lateral_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:51,452 - mmcv - INFO -
lateral_convs.2.conv.weight - torch.Size([16, 160, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:51,452 - mmcv - INFO -
lateral_convs.2.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:51,452 - mmcv - INFO -
fpn_convs.0.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:51,453 - mmcv - INFO -
fpn_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:51,453 - mmcv - INFO -
fpn_convs.1.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:51,453 - mmcv - INFO -
fpn_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:51,453 - mmcv - INFO -
fpn_convs.2.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:51,453 - mmcv - INFO -
fpn_convs.2.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:51,453 - mmcv - INFO -
downsample_convs.0.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:51,453 - mmcv - INFO -
downsample_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:51,454 - mmcv - INFO -
downsample_convs.1.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:51,454 - mmcv - INFO -
downsample_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:51,454 - mmcv - INFO -
pafpn_convs.0.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:51,454 - mmcv - INFO -
pafpn_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:51,454 - mmcv - INFO -
pafpn_convs.1.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:02:51,454 - mmcv - INFO -
pafpn_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:02:51,454 - modelscope - INFO - loading model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd\pytorch_model.pt
load checkpoint from local path: C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd\pytorch_model.pt
2023-12-07 10:02:51,480 - modelscope - INFO - load model done
2023-12-07 10:02:51,490 - modelscope - INFO - loading model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_resnet34_face-attribute-recognition_fairface\pytorch_model.pt
2023-12-07 10:02:51,850 - modelscope - INFO - load model done
2023-12-07 10:02:54,369 - modelscope - INFO - Use user-specified model revision: v2.5
2023-12-07 10:02:55,052 - modelscope - INFO - initiate model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_manual_facial-landmark-confidence_flcm
2023-12-07 10:02:55,053 - modelscope - INFO - initiate model from location C:\Users\ralfl\.cache\modelscope\hub\damo\cv_manual_facial-landmark-confidence_flcm.
2023-12-07 10:02:55,058 - modelscope - WARNING - No preprocessor field found in cfg.
2023-12-07 10:02:55,059 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-12-07 10:02:55,059 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'C:\\Users\\ralfl\\.cache\\modelscope\\hub\\damo\\cv_manual_facial-landmark-confidence_flcm'}. trying to build by task and model information.
2023-12-07 10:02:55,059 - modelscope - WARNING - Find task: face-2d-keypoints, model type: None. Insufficient information to build preprocessor, skip building preprocessor
2023-12-07 10:03:00,673 - modelscope - WARNING - Model revision not specified, use revision: v1.1
2023-12-07 10:03:01,258 - modelscope - INFO - initiate model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd
2023-12-07 10:03:01,258 - modelscope - INFO - initiate model from location C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd.
2023-12-07 10:03:01,260 - modelscope - INFO - initialize model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd
2023-12-07 10:03:01,281 - mmcv - INFO - initialize PAFPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-12-07 10:03:01,282 - mmcv - INFO -
lateral_convs.0.conv.weight - torch.Size([16, 64, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:03:01,282 - mmcv - INFO -
lateral_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:03:01,282 - mmcv - INFO -
lateral_convs.1.conv.weight - torch.Size([16, 120, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:03:01,283 - mmcv - INFO -
lateral_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:03:01,283 - mmcv - INFO -
lateral_convs.2.conv.weight - torch.Size([16, 160, 1, 1]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:03:01,283 - mmcv - INFO -
lateral_convs.2.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:03:01,283 - mmcv - INFO -
fpn_convs.0.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:03:01,283 - mmcv - INFO -
fpn_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:03:01,283 - mmcv - INFO -
fpn_convs.1.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:03:01,283 - mmcv - INFO -
fpn_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:03:01,284 - mmcv - INFO -
fpn_convs.2.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:03:01,284 - mmcv - INFO -
fpn_convs.2.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:03:01,284 - mmcv - INFO -
downsample_convs.0.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:03:01,285 - mmcv - INFO -
downsample_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:03:01,285 - mmcv - INFO -
downsample_convs.1.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:03:01,285 - mmcv - INFO -
downsample_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:03:01,285 - mmcv - INFO -
pafpn_convs.0.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:03:01,285 - mmcv - INFO -
pafpn_convs.0.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:03:01,286 - mmcv - INFO -
pafpn_convs.1.conv.weight - torch.Size([16, 16, 3, 3]):
XavierInit: gain=1, distribution=uniform, bias=0

2023-12-07 10:03:01,286 - mmcv - INFO -
pafpn_convs.1.conv.bias - torch.Size([16]):
The value is the same before and after calling `init_weights` of PAFPN

2023-12-07 10:03:01,286 - modelscope - INFO - loading model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd\pytorch_model.pt
load checkpoint from local path: C:\Users\ralfl\.cache\modelscope\hub\damo\cv_ddsar_face-detection_iclr23-damofd\pytorch_model.pt
2023-12-07 10:03:01,377 - modelscope - INFO - load model done
2023-12-07 10:03:01,394 - modelscope - INFO - loading model from C:\Users\ralfl\.cache\modelscope\hub\damo\cv_manual_facial-landmark-confidence_flcm\pytorch_model.pt
2023-12-07 10:03:01,424 - modelscope - INFO - load model done
cathed for image process of 000.jpg
Error: nms_impl: implementation for device cuda:0 not found.

cathed for image process of 001.jpg
Error: nms_impl: implementation for device cuda:0 not found.

cathed for image process of 002.jpg
Error: nms_impl: implementation for device cuda:0 not found.

cathed for image process of 003.jpg
Error: nms_impl: implementation for device cuda:0 not found.

cathed for image process of 004.jpg
Error: nms_impl: implementation for device cuda:0 not found.

cathed for image process of 005.jpg
Error: nms_impl: implementation for device cuda:0 not found.

cathed for image process of 006.jpg
Error: nms_impl: implementation for device cuda:0 not found.

cathed for image process of 007.jpg
Error: nms_impl: implementation for device cuda:0 not found.

cathed for image process of 008.jpg
Error: nms_impl: implementation for device cuda:0 not found.

cathed for image process of 009.jpg
Error: nms_impl: implementation for device cuda:0 not found.

[]
Error: result is empty.
instance_data_dir C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\extensions\facechain\worker_data\qw\training_data\ly261666/cv_portrait_model\person1
Traceback (most recent call last):
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\extensions\facechain/facechain/train_text_to_image_lora.py", line 31, in <module>
    import datasets
ModuleNotFoundError: No module named 'datasets'
Error executing the command: Command '['python', 'C:\\Users\\ralfl\\Downloads\\ai\\stable-diffusion-webui\\extensions\\facechain/facechain/train_text_to_image_lora.py', '--pretrained_model_name_or_path=ly261666/cv_portrait_model', '--revision=v2.0', '--sub_path=film/film', '--output_dataset_name=C:\\Users\\ralfl\\Downloads\\ai\\stable-diffusion-webui\\extensions\\facechain\\worker_data\\qw\\training_data\\ly261666/cv_portrait_model\\person1', '--caption_column=text', '--resolution=512', '--random_flip', '--train_batch_size=1', '--num_train_epochs=200', '--checkpointing_steps=5000', '--learning_rate=1.5e-04', '--lr_scheduler=cosine', '--lr_warmup_steps=0', '--seed=42', '--output_dir=C:\\Users\\ralfl\\Downloads\\ai\\stable-diffusion-webui\\extensions\\facechain\\worker_data\\qw\\ly261666/cv_portrait_model\\person1', '--lora_r=4', '--lora_alpha=32', '--lora_text_encoder_r=32', '--lora_text_encoder_alpha=32', '--resume_from_checkpoint=fromfacecommon']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\extensions\facechain\app.py", line 147, in train_lora_fn
    subprocess.run(command, check=True)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.1776.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['python', 'C:\\Users\\ralfl\\Downloads\\ai\\stable-diffusion-webui\\extensions\\facechain/facechain/train_text_to_image_lora.py', '--pretrained_model_name_or_path=ly261666/cv_portrait_model', '--revision=v2.0', '--sub_path=film/film', '--output_dataset_name=C:\\Users\\ralfl\\Downloads\\ai\\stable-diffusion-webui\\extensions\\facechain\\worker_data\\qw\\training_data\\ly261666/cv_portrait_model\\person1', '--caption_column=text', '--resolution=512', '--random_flip', '--train_batch_size=1', '--num_train_epochs=200', '--checkpointing_steps=5000', '--learning_rate=1.5e-04', '--lr_scheduler=cosine', '--lr_warmup_steps=0', '--seed=42', '--output_dir=C:\\Users\\ralfl\\Downloads\\ai\\stable-diffusion-webui\\extensions\\facechain\\worker_data\\qw\\ly261666/cv_portrait_model\\person1', '--lora_r=4', '--lora_alpha=32', '--lora_text_encoder_r=32', '--lora_text_encoder_alpha=32', '--resume_from_checkpoint=fromfacecommon']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\venv\Lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\venv\Lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\venv\Lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\venv\Lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\venv\Lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\venv\Lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\extensions\facechain\app.py", line 801, in run
    train_lora_fn(base_model_path=base_model_path,
  File "C:\Users\ralfl\Downloads\ai\stable-diffusion-webui\extensions\facechain\app.py", line 150, in train_lora_fn
    raise gr.Error("训练失败 (Training failed)")
gradio.exceptions.Error: '训练失败 (Training failed)'
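
The final traceback above fails on `import datasets` inside a child process that `app.py` starts with the bare command `python`. Two possible explanations, both assumptions rather than confirmed diagnoses, are that the `datasets` package is missing from the webui venv, or that `python` resolves to a different interpreter than the one running the webui. A minimal sketch for checking both follows; the helper name `interpreter_report` is hypothetical, and `shutil.which` only approximates how Windows locates `python` when a process is created.

```python
import importlib.util
import shutil
import sys


def interpreter_report(module_name="datasets"):
    """Compare the running interpreter with the `python` found on PATH."""
    return {
        # Interpreter executing this script (run it from inside the webui venv).
        "current_interpreter": sys.executable,
        # First `python` on PATH; an approximation of what a bare `python` launches.
        "python_on_path": shutil.which("python"),
        # True when the top-level module can be imported by the current interpreter.
        "module_importable_here": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

If the two interpreter paths differ, or `module_importable_here` is False, installing `datasets` into the interpreter that actually runs the training script is the likely next step.
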
wangxingjun778 commented 10 months ago

Check whether CUDA is available: torch.cuda.is_available()
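
A minimal sketch of that check, meant to be run with the same interpreter the webui uses. The function name `cuda_report` is hypothetical; the `torch` calls are standard PyTorch API. The import is wrapped so the script still reports something useful when PyTorch is missing.

```python
def cuda_report():
    """Return basic facts about PyTorch's view of CUDA, or note that torch is missing."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    available = torch.cuda.is_available()
    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": available,
    }
    if available:
        # Name of the first visible GPU (device index 0).
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```
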

vivekvp1 commented 10 months ago

I have the same error, but when I look at the webui I see torch: 2.1.1+cu118, and at startup I also see: [+] torch version 2.1.1+cu118 installed.

I realize you used: Python 3.8 and 3.10; PyTorch 2.0.0 and 2.0.1; CUDA 11.7; cuDNN 8+; OS: Ubuntu 20.04 and CentOS 7.9; GPU: Nvidia A10 24G.

In the txt2img tab I can generate images, and I see my GPU go to 100%, which makes me believe CUDA is working there but not for FaceChain. Is there any way to force FaceChain to use the device by changing something in the script?

Thank you very much for all your hard work and effort, and especially for sharing!
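
A working txt2img tab shows that PyTorch itself can use the GPU, but the `nms_impl` message in the log comes from a compiled op, and the log shows mmcv being loaded for face detection. One possible cause, stated here as an assumption and not confirmed in this thread, is an mmcv build whose compiled ops do not include CUDA kernels. The sketch below reports what mmcv says about its own build; `mmcv_ops_report` is a hypothetical helper name, while `get_compiling_cuda_version` and `get_compiler_version` are functions exported by `mmcv.ops`.

```python
def mmcv_ops_report():
    """Report whether mmcv's compiled ops load and which CUDA they were built with."""
    try:
        import mmcv
    except Exception as exc:  # mmcv missing or failing to import
        return {"mmcv_installed": False, "error": repr(exc)}

    report = {"mmcv_installed": True, "mmcv_version": mmcv.__version__}
    try:
        from mmcv.ops import get_compiler_version, get_compiling_cuda_version

        # CUDA version the ops were compiled against, as reported by mmcv itself.
        report["ops_cuda_version"] = get_compiling_cuda_version()
        report["ops_compiler"] = get_compiler_version()
    except Exception as exc:  # e.g. a build without the compiled extension
        report["ops_error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in mmcv_ops_report().items():
        print(f"{key}: {value}")
```

If `ops_cuda_version` does not name a CUDA version, or `ops_error` is present, reinstalling an mmcv build that matches the installed PyTorch and CUDA versions would be worth trying.
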

sunbaigui commented 5 months ago

Please try out the newest train-free, 10-second-inference version: facechain-fact.