Deploying with Docker: process is killed with no error message

DongZhaoXiong commented 4 months ago

Environment

Docker

Steps to reproduce

docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.1
docker run -it --name facechain -p 7860:7860 --gpus all registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.7.1-py38-torch2.0.1-tf1.15.5-1.8.1 /bin/bash
pip3 install gradio==3.47.1 controlnet_aux==0.0.6 python-slugify
pip3 install controlnet_aux==0.0.6
pip3 install python-slugify
GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/modelscope/facechain.git --depth 1
cd facechain
python3 app.py

Issue

After the model artifacts are downloaded, the following messages are printed:

2024-07-05 22:38:26,664 - modelscope - INFO - Use user-specified model revision: v2.0.2
2024-07-05 22:38:26,920 - modelscope - INFO - initiate model from /mnt/workspace/.cache/modelscope/damo/cv_resnet34_face-attribute-recognition_fairface
2024-07-05 22:38:26,920 - modelscope - INFO - initiate model from location /mnt/workspace/.cache/modelscope/damo/cv_resnet34_face-attribute-recognition_fairface.
2024-07-05 22:38:26,923 - modelscope - WARNING - No preprocessor field found in cfg.
2024-07-05 22:38:26,923 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2024-07-05 22:38:26,923 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/mnt/workspace/.cache/modelscope/damo/cv_resnet34_face-attribute-recognition_fairface'}. trying to build by task and model information.
2024-07-05 22:38:26,923 - modelscope - WARNING - Find task: face-attribute-recognition, model type: None. Insufficient information to build preprocessor, skip building preprocessor
2024-07-05 22:38:27,206 - modelscope - INFO - Model revision not specified, use the latest revision: v1.1
2024-07-05 22:38:27,456 - modelscope - INFO - initiate model from /mnt/workspace/.cache/modelscope/damo/cv_ddsar_face-detection_iclr23-damofd
2024-07-05 22:38:27,456 - modelscope - INFO - initiate model from location /mnt/workspace/.cache/modelscope/damo/cv_ddsar_face-detection_iclr23-damofd.
2024-07-05 22:38:27,457 - modelscope - INFO - initialize model from /mnt/workspace/.cache/modelscope/damo/cv_ddsar_face-detection_iclr23-damofd
2024-07-05 22:38:27,519 - mmcv - INFO - initialize PAFPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2024-07-05 22:38:27,522 - mmcv - INFO - lateral_convs.0.conv.weight - torch.Size([16, 64, 1, 1]): XavierInit: gain=1, distribution=uniform, bias=0

2024-07-05 22:38:27,523 - mmcv - INFO - lateral_convs.0.conv.bias - torch.Size([16]): The value is the same before and after calling init_weights of PAFPN

2024-07-05 22:38:27,523 - mmcv - INFO - lateral_convs.1.conv.weight - torch.Size([16, 120, 1, 1]): XavierInit: gain=1, distribution=uniform, bias=0

2024-07-05 22:38:27,523 - mmcv - INFO - lateral_convs.1.conv.bias - torch.Size([16]): The value is the same before and after calling init_weights of PAFPN

2024-07-05 22:38:27,523 - mmcv - INFO - lateral_convs.2.conv.weight - torch.Size([16, 160, 1, 1]): XavierInit: gain=1, distribution=uniform, bias=0

2024-07-05 22:38:27,523 - mmcv - INFO - lateral_convs.2.conv.bias - torch.Size([16]): The value is the same before and after calling init_weights of PAFPN

2024-07-05 22:38:27,523 - mmcv - INFO - fpn_convs.0.conv.weight - torch.Size([16, 16, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0

2024-07-05 22:38:27,523 - mmcv - INFO - fpn_convs.0.conv.bias - torch.Size([16]): The value is the same before and after calling init_weights of PAFPN

2024-07-05 22:38:27,523 - mmcv - INFO - fpn_convs.1.conv.weight - torch.Size([16, 16, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0

2024-07-05 22:38:27,523 - mmcv - INFO - fpn_convs.1.conv.bias - torch.Size([16]): The value is the same before and after calling init_weights of PAFPN

2024-07-05 22:38:27,523 - mmcv - INFO - fpn_convs.2.conv.weight - torch.Size([16, 16, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0

2024-07-05 22:38:27,523 - mmcv - INFO - fpn_convs.2.conv.bias - torch.Size([16]): The value is the same before and after calling init_weights of PAFPN

2024-07-05 22:38:27,523 - mmcv - INFO - downsample_convs.0.conv.weight - torch.Size([16, 16, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0

2024-07-05 22:38:27,523 - mmcv - INFO - downsample_convs.0.conv.bias - torch.Size([16]): The value is the same before and after calling init_weights of PAFPN

2024-07-05 22:38:27,523 - mmcv - INFO - downsample_convs.1.conv.weight - torch.Size([16, 16, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0

2024-07-05 22:38:27,524 - mmcv - INFO - downsample_convs.1.conv.bias - torch.Size([16]): The value is the same before and after calling init_weights of PAFPN

2024-07-05 22:38:27,524 - mmcv - INFO - pafpn_convs.0.conv.weight - torch.Size([16, 16, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0

2024-07-05 22:38:27,524 - mmcv - INFO - pafpn_convs.0.conv.bias - torch.Size([16]): The value is the same before and after calling init_weights of PAFPN

2024-07-05 22:38:27,524 - mmcv - INFO - pafpn_convs.1.conv.weight - torch.Size([16, 16, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0

2024-07-05 22:38:27,524 - mmcv - INFO - pafpn_convs.1.conv.bias - torch.Size([16]): The value is the same before and after calling init_weights of PAFPN

2024-07-05 22:38:27,524 - modelscope - INFO - loading model from /mnt/workspace/.cache/modelscope/damo/cv_ddsar_face-detection_iclr23-damofd/pytorch_model.pt
load checkpoint from local path: /mnt/workspace/.cache/modelscope/damo/cv_ddsar_face-detection_iclr23-damofd/pytorch_model.pt
2024-07-05 22:38:27,568 - modelscope - INFO - load model done
2024-07-05 22:38:27,575 - modelscope - INFO - loading model from /mnt/workspace/.cache/modelscope/damo/cv_resnet34_face-attribute-recognition_fairface/pytorch_model.pt
2024-07-05 22:38:28,189 - modelscope - INFO - load model done
2024-07-05 22:38:28,370 - modelscope - INFO - Use user-specified model revision: v1.0.0
Killed
root@5c6a779cfac8:/facechain#

You-Cun commented 4 months ago

This may be caused by a CPU out-of-memory (OOM) error. Please check that the CPU memory (host RAM, not GPU memory) is above 20GB.
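One way to run that check on a Linux host is to read `/proc/meminfo`. The snippet below is only a minimal sketch: the helper names (`parse_meminfo`, `has_enough_ram`) are made up for illustration, and the 20GB threshold from the comment above is treated as 20 GiB. Note that inside a Docker container `/proc/meminfo` usually reports the host's memory, so a container-level memory limit, if one is set, would not show up here.

```python
"""Check host RAM (not GPU memory) from /proc/meminfo on Linux."""
from pathlib import Path

REQUIRED_GB = 20  # threshold suggested in the comment above (assumed to mean GiB)


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo text into {field name: integer value (kB for sizes)}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def kb_to_gb(kb: int) -> float:
    """Convert the kB values used by /proc/meminfo to GiB."""
    return kb / (1024 ** 2)


def has_enough_ram(info: dict, required_gb: float = REQUIRED_GB) -> bool:
    """True if total host RAM meets the threshold."""
    return kb_to_gb(info.get("MemTotal", 0)) >= required_gb


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # only present on Linux
        info = parse_meminfo(meminfo.read_text())
        print(f"MemTotal:     {kb_to_gb(info.get('MemTotal', 0)):.1f} GiB")
        print(f"MemAvailable: {kb_to_gb(info.get('MemAvailable', 0)):.1f} GiB")
        print("enough RAM" if has_enough_ram(info) else "below threshold")
```

`MemAvailable` is printed as well because the total can be above the threshold while other processes already hold most of it.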

DongZhaoXiong commented 4 months ago

> This may be caused by a CPU out-of-memory (OOM) error. Please check that the CPU memory (host RAM, not GPU memory) is above 20GB.

CPU memory, not GPU memory? I'm using an AWS g5.xlarge with an NVIDIA A10; the GPU memory is 24GB.
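On the CPU-versus-GPU question: a bare `Killed` with no Python traceback usually means the process received SIGKILL, which is what the Linux kernel's OOM killer sends when host RAM runs out. Running out of GPU memory would normally surface as a CUDA out-of-memory exception from PyTorch instead. For reference, AWS lists the g5.xlarge with 16 GiB of system memory, which is below the 20GB mentioned above, though that is worth confirming on the instance itself. The sketch below scans kernel log text (for example the output of `dmesg` saved to a file on the host) for OOM-killer entries; the function name and the sample log line in the comment are illustrative assumptions, not output from this issue.

```python
import re

# Matches the kernel's OOM-killer entry, e.g. (illustrative line, not from this issue):
#   Out of memory: Killed process 4242 (python3) total-vm:...
# Older kernels print "Killed process <pid> (<name>)" on its own line,
# which the same pattern also matches.
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(kernel_log: str) -> list:
    """Return (pid, process_name) for each OOM-killer entry in the log text."""
    return [(int(pid), name) for pid, name in OOM_KILL.findall(kernel_log)]
```

If `find_oom_kills` returns an entry for the `python3` process that was running `app.py`, host RAM is the likely culprit; an empty list does not rule it out, since the kernel log may have rotated or may not be readable from inside the container.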

ucasiggcas commented 3 months ago

I get the same logs as above, and I'm also on a GPU (V100). I'm also having a problem with gradio.
