openvinotoolkit / model_server

A scalable inference server for models optimized with OpenVINO™
https://docs.openvino.ai/2024/ovms_what_is_openvino_model_server.html
Apache License 2.0

How to fix/handle error: Pipeline execution aborted due to no content from custom node #2625

Status: Closed (jiekechoo closed this issue 2 weeks ago)

jiekechoo commented 4 weeks ago

Describe the bug

Error processing InferRequest: rpc error: code = Aborted desc = Pipeline execution aborted due to no content from custom node

gRPC client https://github.com/openvinotoolkit/model_server/blob/main/client/go/kserve-api/grpc_infer_resnet.go

Logs

model_server  | [2024-08-14 12:08:17.150][143][serving][debug][kfs_grpc_inference_service.cpp:251] Processing gRPC request for model: multiple_face_recognition; version: 1
model_server  | [2024-08-14 12:08:17.150][143][serving][debug][kfs_grpc_inference_service.cpp:290] ModelInfer requested name: multiple_face_recognition, version: 1
model_server  | [2024-08-14 12:08:17.150][143][serving][debug][modelmanager.cpp:1519] Requesting model: multiple_face_recognition; version: 1.
model_server  | [2024-08-14 12:08:17.150][143][serving][debug][kfs_grpc_inference_service.cpp:293] Requested model: multiple_face_recognition does not exist. Searching for pipeline with that name...
model_server  | [2024-08-14 12:08:17.150][143][serving][debug][pipelinedefinition.cpp:201] Successfully waited for pipeline definition: multiple_face_recognition
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:253] Creating pipeline: multiple_face_recognition. Adding nodeName: request, modelName:
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][node.cpp:50] Will create node: request with demultiply: NA, gatherFrom: NA.
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:253] Creating pipeline: multiple_face_recognition. Adding nodeName: face_detection_node, modelName: face_detection_0813
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][node.cpp:50] Will create node: face_detection_node with demultiply: NA, gatherFrom: NA.
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:253] Creating pipeline: multiple_face_recognition. Adding nodeName: extract_node, modelName:
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][node.cpp:50] Will create node: extract_node with demultiply: dynamic, gatherFrom: NA.
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:253] Creating pipeline: multiple_face_recognition. Adding nodeName: face_recognition_node, modelName: arc_face
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][node.cpp:50] Will create node: face_recognition_node with demultiply: NA, gatherFrom: NA.
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:253] Creating pipeline: multiple_face_recognition. Adding nodeName: response, modelName:
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][node.cpp:50] Will create node: response with demultiply: NA, gatherFrom: extract_node.
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:297] Connecting pipeline: multiple_face_recognition, from: face_recognition_node, to: response
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:48] Connecting from: face_recognition_node, to: response
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:63] Links from:face_recognition_node to:response:
model_server  |         response[face_tokens]=face_recognition_node[tokens]
model_server  |
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:297] Connecting pipeline: multiple_face_recognition, from: extract_node, to: response
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:48] Connecting from: extract_node, to: response
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:63] Links from:extract_node to:response:
model_server  |         response[face_images]=extract_node[face_images]
model_server  |         response[face_coordinates]=extract_node[face_coordinates]
model_server  |         response[confidence_levels]=extract_node[confidence_levels]
model_server  |
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:297] Connecting pipeline: multiple_face_recognition, from: extract_node, to: face_recognition_node
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:48] Connecting from: extract_node, to: face_recognition_node
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:63] Links from:extract_node to:face_recognition_node:
model_server  |         face_recognition_node[input.1]=extract_node[face_images]
model_server  |
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:297] Connecting pipeline: multiple_face_recognition, from: face_detection_node, to: extract_node
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:48] Connecting from: face_detection_node, to: extract_node
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:63] Links from:face_detection_node to:extract_node:
model_server  |         extract_node[detection]=face_detection_node[detection]
model_server  |
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:297] Connecting pipeline: multiple_face_recognition, from: request, to: extract_node
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:48] Connecting from: request, to: extract_node
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:63] Links from:request to:extract_node:
model_server  |         extract_node[image]=request[input]
model_server  |
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipelinedefinition.cpp:297] Connecting pipeline: multiple_face_recognition, from: request, to: face_detection_node
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:48] Connecting from: request, to: face_detection_node
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:63] Links from:request to:face_detection_node:
model_server  |         face_detection_node[input]=request[input]
model_server  |
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:90] Started execution of pipeline: multiple_face_recognition
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][node.cpp:172] Will create new session:  for node: request
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:132] Pipeline: multiple_face_recognition got message that node: request session:  finished.
model_server  | [2024-08-14 12:08:17.150][143][dag_executor][debug][pipeline.cpp:139] Fetching results of pipeline: multiple_face_recognition node: request session:
model_server  | [2024-08-14 12:08:17.150][143][serving][debug][predict_request_validation_utils.cpp:1035] [servable name: multiple_face_recognition version: 1] Validating request containing binary image input: name: input
model_server  | [2024-08-14 12:08:17.150][143][serving][debug][deserialization.hpp:449] Request contains input in native file format: input
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][node.cpp:75] Will remove node: request session:
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][pipeline.cpp:149] setting pipeline: multiple_face_recognition node: request session:  outputs as inputs for node: extract_node
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][node.cpp:91] node: extract_node set inputs from node: request
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][node.cpp:172] Will create new session:  for node: extract_node
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][node.cpp:132] node: extract_node setting required input from node: request, input name: image, dependency output name: input
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][nodeinputhandler.cpp:29] Setting input: image, shardId: 0
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][nodeinputhandler.cpp:58] Remaining dependencies count for input handler decreased from: 2 to: 1
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][pipeline.cpp:149] setting pipeline: multiple_face_recognition node: request session:  outputs as inputs for node: face_detection_node
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][node.cpp:91] node: face_detection_node set inputs from node: request
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][node.cpp:172] Will create new session:  for node: face_detection_node
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][node.cpp:132] node: face_detection_node setting required input from node: request, input name: input, dependency output name: input
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][nodeinputhandler.cpp:29] Setting input: input, shardId: 0
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][nodeinputhandler.cpp:58] Remaining dependencies count for input handler decreased from: 1 to: 0
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][node.cpp:196] Checking readiness of node: extract_node session:
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][nodesession.cpp:55] node: extract_node session:  isReady: false
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][node.cpp:196] Checking readiness of node: face_detection_node session:
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][nodesession.cpp:55] node: face_detection_node session:  isReady: true
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][pipeline.cpp:168] Started execution of pipeline: multiple_face_recognition node: face_detection_node session:
model_server  | [2024-08-14 12:08:17.152][143][serving][debug][modelmanager.cpp:1519] Requesting model: face_detection_0813; version: 0.
model_server  | [2024-08-14 12:08:17.152][143][serving][debug][model.hpp:89] Getting default version for model: face_detection_0813, 1
model_server  | [2024-08-14 12:08:17.152][143][serving][debug][modelinstance.cpp:1054] Model: face_detection_0813, version: 1 already loaded
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][dlnodesession.cpp:280] Setting completion callback for node name: face_detection_node
model_server  | [2024-08-14 12:08:17.152][143][dag_executor][debug][dlnodesession.cpp:290] Starting infer async for node name: face_detection_node
model_server  | [2024-08-14 12:08:17.161][70][dag_executor][debug][dlnodesession.cpp:284] Completion callback received for node name: face_detection_node
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][pipeline.cpp:132] Pipeline: multiple_face_recognition got message that node: face_detection_node session:  finished.
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][pipeline.cpp:139] Fetching results of pipeline: multiple_face_recognition node: face_detection_node session:
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][dl_node.cpp:92] Node: face_detection_node session:  Waiting for infer request to finish
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][dl_node.cpp:104] Node: face_detection_node session:  infer request finished
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][dl_node.cpp:105] Inference processing time for node face_detection_node; model name: face_detection_0813; session:  - 8.285 ms
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][dl_node.cpp:127] Node: face_detection_node session:  Getting tensor from model: face_detection_0813, inferRequestStreamId: , tensorName: output
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][dl_node.cpp:130] Node: face_detection_node session:  Creating copy of tensor from model: face_detection_0813, tensorName: output
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][dl_node.cpp:147] Node: face_detection_node session:  Tensor with name detection has been prepared
model_server  | [2024-08-14 12:08:17.161][143][serving][debug][nodestreamidguard.cpp:42] Returning streamId: 0
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][node.cpp:75] Will remove node: face_detection_node session:
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][pipeline.cpp:149] setting pipeline: multiple_face_recognition node: face_detection_node session:  outputs as inputs for node: extract_node
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][node.cpp:91] node: extract_node set inputs from node: face_detection_node
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][node.cpp:132] node: extract_node setting required input from node: face_detection_node, input name: detection, dependency output name: detection
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][nodeinputhandler.cpp:29] Setting input: detection, shardId: 0
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][nodeinputhandler.cpp:58] Remaining dependencies count for input handler decreased from: 1 to: 0
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][node.cpp:196] Checking readiness of node: extract_node session:
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][nodesession.cpp:55] node: extract_node session:  isReady: true
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][pipeline.cpp:168] Started execution of pipeline: multiple_face_recognition node: extract_node session:
model_server  | Processing input tensor image resolution: [300 x 300]; expected resolution: [300 x 300]
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][customnodesession.cpp:76] Custom node execution processing time for node extract_node; session:  - 0.028 ms
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][pipeline.cpp:132] Pipeline: multiple_face_recognition got message that node: extract_node session:  finished.
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][pipeline.cpp:139] Fetching results of pipeline: multiple_face_recognition node: extract_node session:
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][custom_node.cpp:82] Node: extract_node session:  Getting custom node output tensor with name: images
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][custom_node.cpp:94] Node: extract_node session:  Tensor with name images has been prepared under alias face_images
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][custom_node.cpp:82] Node: extract_node session:  Getting custom node output tensor with name: coordinates
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][custom_node.cpp:94] Node: extract_node session:  Tensor with name coordinates has been prepared under alias face_coordinates
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][custom_node.cpp:82] Node: extract_node session:  Getting custom node output tensor with name: confidences
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][custom_node.cpp:94] Node: extract_node session:  Tensor with name confidences has been prepared under alias confidence_levels
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][node.cpp:72] Will demultiply node: extract_node outputs with demultiplyCount: dynamic
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][node.cpp:218] Will demultiply node: extract_node outputs to: 0 shards
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][node.cpp:245] Node: extract_node has no results. Dynamic demultiplexer with demultiply == 0 is not supported yet.
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][debug][node.cpp:75] Will remove node: extract_node session:
model_server  | [2024-08-14 12:08:17.161][143][dag_executor][warning][pipeline.cpp:141] Executing pipeline: multiple_face_recognition node: extract_node session:  failed with ret code: 177, error message: Pipeline execution aborted due to no content from custom node

Configuration

  1. OVMS version: 2024.0.74d2a7cec

  2. OVMS config.json file

    {
    "model_config_list": [
    {
      "config": {
        "name": "face_detection_0813",
        "base_path": "/models/face-detection-xdl-0813/",
        "layout": "NHWC:NCHW",
        "target_device": "GPU"
      }
    },
    {
      "config": {
        "name": "arc_face",
        "base_path": "/models/arc_face/",
        "layout": "NHWC:NCHW",
        "target_device": "GPU"
      }
    }
    ],
    "custom_node_library_config_list": [
    {
      "name": "object_detection_image_extractor",
      "base_path": "/ovms/lib/custom_nodes/libcustom_node_model_zoo_intel_object_detection.so"
    }
    ],
    "pipeline_config_list": [
    {
      "name": "multiple_face_recognition",
      "inputs": [
        "input"
      ],
      "nodes": [
        {
          "name": "face_detection_node",
          "model_name": "face_detection_0813",
          "type": "DL model",
          "inputs": [
            {
              "input": {
                "node_name": "request",
                "data_item": "input"
              }
            }
          ],
          "outputs": [
            {
              "data_item": "output",
              "alias": "detection"
            }
          ]
        },
        {
          "name": "extract_node",
          "library_name": "object_detection_image_extractor",
          "type": "custom",
          "demultiply_count": -1,
          "params": {
            "original_image_width": "300",
            "original_image_height": "300",
            "target_image_width": "112",
            "target_image_height": "112",
            "original_image_layout": "NHWC",
            "target_image_layout": "NHWC",
            "convert_to_gray_scale": "false",
            "max_output_batch": "100",
            "confidence_threshold": "0.3",
            "debug": "true",
            "buffer_queue_size": "24"
          },
          "inputs": [
            {
              "image": {
                "node_name": "request",
                "data_item": "input"
              }
            },
            {
              "detection": {
                "node_name": "face_detection_node",
                "data_item": "detection"
              }
            }
          ],
          "outputs": [
            {
              "data_item": "images",
              "alias": "face_images"
            },
            {
              "data_item": "coordinates",
              "alias": "face_coordinates"
            },
            {
              "data_item": "confidences",
              "alias": "confidence_levels"
            }
          ]
        },
        {
          "name": "face_recognition_node",
          "model_name": "arc_face",
          "type": "DL model",
          "inputs": [
            {
              "input.1": {
                "node_name": "extract_node",
                "data_item": "face_images"
              }
            }
          ],
          "outputs": [
            {
              "data_item": "338",
              "alias": "tokens"
            }
          ]
        }
      ],
      "outputs": [
        {
          "face_images": {
            "node_name": "extract_node",
            "data_item": "face_images"
          }
        },
        {
          "face_coordinates": {
            "node_name": "extract_node",
            "data_item": "face_coordinates"
          }
        },
        {
          "confidence_levels": {
            "node_name": "extract_node",
            "data_item": "confidence_levels"
          }
        },
        {
          "face_tokens": {
            "node_name": "face_recognition_node",
            "data_item": "tokens"
          }
        }
      ]
    }
    ]
    }
  3. CPU, accelerator's versions if applicable: Intel N100

  4. Model metadata:

    {
    "modelSpec": {
    "name": "multiple_face_recognition",
    "signatureName": "",
    "version": "1"
    },
    "metadata": {
    "signature_def": {
    "@type": "type.googleapis.com/tensorflow.serving.SignatureDefMap",
    "signatureDef": {
    "serving_default": {
     "inputs": {
      "input": {
       "dtype": "DT_FLOAT",
       "tensorShape": {
        "dim": [
         {
          "size": "1",
          "name": ""
         },
         {
          "size": "300",
          "name": ""
         },
         {
          "size": "300",
          "name": ""
         },
         {
          "size": "3",
          "name": ""
         }
        ],
        "unknownRank": false
       },
       "name": "input"
      }
     },
     "outputs": {
      "face_tokens": {
       "dtype": "DT_FLOAT",
       "tensorShape": {
        "dim": [
         {
          "size": "-1",
          "name": ""
         },
         {
          "size": "1",
          "name": ""
         },
         {
          "size": "512",
          "name": ""
         }
        ],
        "unknownRank": false
       },
       "name": "face_tokens"
      },
      "confidence_levels": {
       "dtype": "DT_FLOAT",
       "tensorShape": {
        "dim": [
         {
          "size": "-1",
          "name": ""
         },
         {
          "size": "1",
          "name": ""
         },
         {
          "size": "1",
          "name": ""
         }
        ],
        "unknownRank": false
       },
       "name": "confidence_levels"
      },
      "face_images": {
       "dtype": "DT_FLOAT",
       "tensorShape": {
        "dim": [
         {
          "size": "-1",
          "name": ""
         },
         {
          "size": "1",
          "name": ""
         },
         {
          "size": "112",
          "name": ""
         },
         {
          "size": "112",
          "name": ""
         },
         {
          "size": "3",
          "name": ""
         }
        ],
        "unknownRank": false
       },
       "name": "face_images"
      },
      "face_coordinates": {
       "dtype": "DT_FLOAT",
       "tensorShape": {
        "dim": [
         {
          "size": "-1",
          "name": ""
         },
         {
          "size": "1",
          "name": ""
         },
         {
          "size": "4",
          "name": ""
         }
        ],
        "unknownRank": false
       },
       "name": "face_coordinates"
      }
     },
     "methodName": "",
     "defaults": {}
    }
    }
    }
    }
    }
atobiszei commented 3 weeks ago

Hi @jiekechoo, this is expected behavior; see https://github.com/openvinotoolkit/model_server/blob/main/docs/demultiplexing.md#dynamic-demultiply_count-parameter. There were plans to return an empty tensor such as [0,3,224,224] in this case, but due to other priorities they never materialized. Is this blocking your current workflow?

jiekechoo commented 3 weeks ago

Yes, it was blocking. I have fixed my inference client code so that it skips the 'log.Fatalf' call when the server returns the ABORTED status.
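
For readers hitting the same error, here is a minimal sketch of that kind of client-side workaround. It is not the poster's actual code. The helper names (`simulatedRPCError`, `isNoContentAbort`, `classifyInferError`) are hypothetical, and matching on the error text is an assumption made to keep the example standard-library only. A client built on grpc-go could instead compare `status.Code(err)` against `codes.Aborted`.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"strings"
)

// noContentMsg is the description attached to the ABORTED status in the logs above.
const noContentMsg = "Pipeline execution aborted due to no content from custom node"

// simulatedRPCError builds an error whose text mirrors the message the Go
// client printed in this issue. It stands in for a real gRPC error here.
func simulatedRPCError(code, desc string) error {
	return errors.New("rpc error: code = " + code + " desc = " + desc)
}

// isNoContentAbort reports whether err looks like the "custom node produced
// no content" abort. Assumption: text matching is used only so that this
// sketch needs no external modules.
func isNoContentAbort(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	return strings.Contains(msg, "code = Aborted") && strings.Contains(msg, noContentMsg)
}

// classifyInferError separates the expected "nothing detected" case from
// real failures. empty is true when the pipeline aborted because the custom
// node returned no content; fatal carries any other error.
func classifyInferError(err error) (empty bool, fatal error) {
	switch {
	case err == nil:
		return false, nil
	case isNoContentAbort(err):
		return true, nil
	default:
		return false, err
	}
}

func main() {
	// In a real client, err would come from the ModelInfer call.
	err := simulatedRPCError("Aborted", noContentMsg)

	empty, fatal := classifyInferError(err)
	if fatal != nil {
		log.Fatalf("Error processing InferRequest: %v", fatal)
	}
	if empty {
		fmt.Println("no faces detected; skipping this image")
		return
	}
	fmt.Println("inference succeeded; process outputs here")
}
```

Treating only this specific ABORTED message as a normal "empty result" keeps other failures, such as connection errors, fatal.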