openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

[Bug] TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it #16282

Closed xyangk closed 1 year ago

xyangk commented 1 year ago
Detailed description

I am currently using Go to call the Openvino C API and create several inference requests that I push to a channel in Go. While dealing with client requests, I obtain an idle inference request from the channel and use it to perform inference. After completing the inference, I send it back to the channel.

At first, everything behaves as expected under stress testing. However, after approximately 300 to 1000 requests, a TBB warning appears:

TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it

In addition, the model's output size differs from what I expect. For instance, the input shape is (36, 37), where 36 is the batch size and 37 is the sequence length. Since the model's hidden dimension is 146, the output size should be 36*37*146 = 194,472 elements. Instead, I am getting 138,116.
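The size arithmetic above can be checked with a small helper. This is only a sketch: the function names are hypothetical and nothing here calls OpenVINO. It computes the element count of a dense [batch, seq_len, hidden] tensor and, going the other way, how many rows of width hidden a flat size corresponds to.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a dense [batch, seq_len, hidden] tensor. */
size_t expected_elems(size_t batch, size_t seq_len, size_t hidden)
{
    return batch * seq_len * hidden;
}

/* Rows of width `hidden` in a flat buffer of `total` elements,
 * or 0 if `total` is not a multiple of `hidden`. */
size_t rows_from_elems(size_t total, size_t hidden)
{
    assert(hidden > 0);
    return (total % hidden == 0) ? total / hidden : 0;
}
```

For the reported case, expected_elems(36, 37, 146) is 194472, while rows_from_elems(138116, 146) is 946, which does not equal 36 * 37 = 1332. The observed size is therefore still a multiple of the hidden dimension, but for a different batch and sequence length.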

Steps to reproduce

Load model in C

ov_compiled_model_t *LoadModel(const char *IRModelPath)
{
    ov_core_t *core = NULL;
    ov_model_t *model = NULL;
    ov_partial_shape_t partial_shape;
    ov_compiled_model_t *compiled_model = NULL;

    // core
    CHECK_STATUS(ov_core_create(&core));

    // read model
    ov_core_read_model(core, IRModelPath, NULL, &model);

    // set property
    const char* key = ov_property_key_hint_performance_mode;
    // const char* value="THROUGHPUT";
    const char* value="LATENCY";

    // dynamic shape
    ov_dimension_t ddims[2] = {{1, 150}, {1, 512}};
    ov_partial_shape_create(2, ddims, &partial_shape);
    ov_model_reshape_input_by_name(model, "input_token", partial_shape);
    ov_model_reshape_input_by_name(model, "input_segment", partial_shape);

    // compile model
    ov_core_compile_model(core, model, "CPU", 0, &compiled_model, key, value);

    // free
    ov_partial_shape_free(&partial_shape);
    ov_model_free(model);
    ov_core_free(core);

    return compiled_model;
}
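The reshape in LoadModel declares both inputs as dynamic, with the batch dimension bounded by {1, 150} and the sequence length by {1, 512}. As a rough illustration of what those bounds permit, the sketch below checks a requested shape against them. The dim_range type and function names are hypothetical stand-ins, not part of the OpenVINO C API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a {min, max} dimension bound. */
typedef struct {
    int64_t min;
    int64_t max;
} dim_range;

/* True if v lies inside the closed interval [r.min, r.max]. */
bool in_range(dim_range r, int64_t v)
{
    assert(r.min <= r.max);
    return v >= r.min && v <= r.max;
}

/* Checks a [batch, seq_len] input against the bounds used in LoadModel. */
bool shape_allowed(int64_t batch, int64_t seq_len)
{
    const dim_range batch_range = {1, 150};
    const dim_range seq_range = {1, 512};
    return in_range(batch_range, batch) && in_range(seq_range, seq_len);
}
```

The reported input (36, 37) is inside these bounds, so the dynamic-shape declaration alone would not explain a wrong output size.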

Create infer request channel in Go

func LoadModel(model *FTModel, IRModePath string, num_infer int) {
    model_path := C.CString(IRModePath)
    compiled_model := C.LoadModel(model_path)
    C.free(unsafe.Pointer(model_path))

    for i := 0; i < num_infer; i++ {
        infer_request := OVInferQ{Id: i}
        C.ov_compiled_model_create_infer_request(compiled_model, &infer_request.OV_infer_request_t)
        model.OV_ireq_chan <- infer_request
    }
}

Inference in C

struct ner_tuple_result DoInference(ov_infer_request_t *infer_request, float *token_ids, float *senment_ids, int batch_size, int seq_len)
{
    ov_shape_t input_token_shape;
    ov_shape_t input_segment_shape;
    ov_tensor_t *token_tensor = NULL;
    ov_tensor_t *segment_tensor = NULL;
    ov_tensor_t *ner_dense_output_tensor = NULL;
    ov_tensor_t *tuple_dense_output_tensor = NULL;

    // input data type
    ov_element_type_e input_token_type = F32; // F32
    ov_element_type_e input_segment_type = F32;

    // create input tensor
    const int batch_size_ = batch_size;
    const int seq_len_ = seq_len;
    float ids[batch_size_][seq_len_];
    float segs[batch_size_][seq_len_];
    int i, j;
    for (i = 0; i < batch_size_; i++)
    {
        for (j = 0; j < seq_len_; j++)
        {
            ids[i][j] = token_ids[i * batch_size_ + j];
            segs[i][j] = senment_ids[i * batch_size_ + j];
        }
    };
    int64_t dims[2] = {batch_size_, seq_len_};
    ov_shape_create(2, dims, &input_token_shape);
    ov_shape_create(2, dims, &input_segment_shape);
    ov_tensor_create_from_host_ptr(input_token_type, input_token_shape, ids, &token_tensor);
    ov_tensor_create_from_host_ptr(input_segment_type, input_segment_shape, segs, &segment_tensor);

    // set input tensor to infer request
    ov_infer_request_set_tensor(infer_request, "input_token", token_tensor);
    ov_infer_request_set_tensor(infer_request, "input_segment", segment_tensor);

    // start
    // ov_infer_request_start_async(infer_request);
    // ov_infer_request_wait(infer_request);
    ov_infer_request_infer(infer_request);

    // get output tensor
    ov_infer_request_get_output_tensor_by_index(infer_request, 1, &ner_dense_output_tensor);
    ov_infer_request_get_output_tensor_by_index(infer_request, 2, &tuple_dense_output_tensor);

    // get output data
    void *ner_dense_data = NULL;
    ov_tensor_data(ner_dense_output_tensor, &ner_dense_data);
    void *tuple_dense_data = NULL;
    ov_tensor_data(tuple_dense_output_tensor, &tuple_dense_data);

    float *ner_dense_float_data = (float *)(ner_dense_data);
    float *tuple_dense_float_data = (float *)(tuple_dense_data);

    // get output data size
    size_t ner_dense_size;
    ov_tensor_get_size(ner_dense_output_tensor, &ner_dense_size);
    size_t tuple_dense_size;
    ov_tensor_get_size(tuple_dense_output_tensor, &tuple_dense_size);

    // return struct
    struct ner_tuple_result result;
    result.ner_dense = ner_dense_float_data;
    result.tuple_dense = tuple_dense_float_data;
    result.ner_flat_size = ner_dense_size;
    result.tuple_flat_size = tuple_dense_size;
    if (ner_dense_size != batch_size_ * seq_len_ * 146){
        printf("C bad ner_dense_size %zd, batch_size: %d, seq_len: %d \n", ner_dense_size, batch_size_, seq_len_);
    }

    // free
    ov_tensor_free(ner_dense_output_tensor);
    ov_tensor_free(tuple_dense_output_tensor);
    ov_tensor_free(segment_tensor);
    ov_tensor_free(token_tensor);
    ov_shape_free(&input_token_shape);
    ov_shape_free(&input_segment_shape);

    return result;
}

Inference in Go

func (model *FTModel) OVInference(tokenIds [][]float32, segIds [][]float32) ([][][]float32, [][][]float32) {
    ireq := <-model.OV_ireq_chan
    fmt.Println("idle_ireq_id", ireq.Id)

    batch_size := len(tokenIds)
    seq_len := len(tokenIds[0])
    rlt := DoInference(ireq.OV_infer_request_t, tokenIds, segIds, batch_size, seq_len)
    model.OV_ireq_chan <- ireq
    if len(rlt.ner_mat) != batch_size || len(rlt.ner_mat[0]) != seq_len || len(rlt.ner_mat[0][0]) != 146 {
        fmt.Println(batch_size, seq_len, len(rlt.ner_mat), len(rlt.ner_mat[0]), len(rlt.ner_mat[0][0]))
    }
    return rlt.ner_mat, rlt.tuple_mat
}
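In a pooled design like this one, a point a reader may want to rule out is result lifetime. DoInference returns raw float pointers, and the Go code hands the request back to the channel right after the call. Assuming those pointers refer to buffers owned by the infer request (an assumption, the thread does not confirm it), another goroutine could reuse the request while the previous results are still being read. A minimal sketch of copying results out first, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of n floats from src, or NULL if allocation fails.
 * The caller owns the copy and must free() it. */
float *copy_floats(const float *src, size_t n)
{
    assert(src != NULL && n > 0);
    float *dst = malloc(n * sizeof *dst);
    if (dst != NULL) {
        memcpy(dst, src, n * sizeof *dst);
    }
    return dst;
}
```

With a copy like this taken before the request goes back into the channel, later inferences on the same request cannot change data the caller is still using.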
riverlijunjie commented 1 year ago

@xyangk Are there any further logs after "TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it"? It seems there should be more output following it. Regarding the unusual output size, could you run "./benchmark_app -m -d CPU -t 1" and check the output info?

BTW, could you provide a pure C sample and the model to reproduce this issue?

xyangk commented 1 year ago

$ benchmark_app -m optimized_model/1/optimized_model.xml -d CPU -t 1

[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2022.3.0-9052-9752fafe8eb-releases/2022/3
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 51.84 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     Func/StatefulPartitionedCall/input/_1 , input_segment:0 , Func/StatefulPartitionedCall/input/_1:0 , input_segment (node: input_segment) : f32 / [...] / [1,256]
[ INFO ]     Func/StatefulPartitionedCall/input/_0:0 , input_token , Func/StatefulPartitionedCall/input/_0 , input_token:0 (node: input_token) : f32 / [...] / [1,256]
[ INFO ] Model outputs:
[ INFO ]     Identity:0 , StatefulPartitionedCall/model_4/ner_crf/add , StatefulPartitionedCall/Identity , Func/StatefulPartitionedCall/output/_27 , Identity , Func/StatefulPartitionedCall/output/_27:0 , StatefulPartitionedCall/model_4/ner_crf/add:0 , StatefulPartitionedCall/Identity:0 (node: StatefulPartitionedCall/model_4/ner_crf/add) : f32 / [...] / [1,256,146]
[ INFO ]     StatefulPartitionedCall/Identity_1:0 , StatefulPartitionedCall/Identity_1 , Func/StatefulPartitionedCall/output/_28:0 , StatefulPartitionedCall/model_4/ner_dense/BiasAdd , Identity_1 , Func/StatefulPartitionedCall/output/_28 , Identity_1:0 , StatefulPartitionedCall/model_4/ner_dense/BiasAdd:0 (node: StatefulPartitionedCall/model_4/ner_dense/BiasAdd) : f32 / [...] / [1,256,146]
[ INFO ]     StatefulPartitionedCall/model_4/tuple_dense/BiasAdd , Func/StatefulPartitionedCall/output/_29:0 , StatefulPartitionedCall/Identity_2:0 , Identity_2:0 , Identity_2 , StatefulPartitionedCall/Identity_2 , StatefulPartitionedCall/model_4/tuple_dense/BiasAdd:0 , Func/StatefulPartitionedCall/output/_29 (node: StatefulPartitionedCall/model_4/tuple_dense/BiasAdd) : f32 / [...] / [1,256,128]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     Func/StatefulPartitionedCall/input/_1 , input_segment:0 , Func/StatefulPartitionedCall/input/_1:0 , input_segment (node: input_segment) : f32 / [...] / [1,256]
[ INFO ]     Func/StatefulPartitionedCall/input/_0:0 , input_token , Func/StatefulPartitionedCall/input/_0 , input_token:0 (node: input_token) : f32 / [...] / [1,256]
[ INFO ] Model outputs:
[ INFO ]     Identity:0 , StatefulPartitionedCall/model_4/ner_crf/add , StatefulPartitionedCall/Identity , Func/StatefulPartitionedCall/output/_27 , Identity , Func/StatefulPartitionedCall/output/_27:0 , StatefulPartitionedCall/model_4/ner_crf/add:0 , StatefulPartitionedCall/Identity:0 (node: StatefulPartitionedCall/model_4/ner_crf/add) : f32 / [...] / [1,256,146]
[ INFO ]     StatefulPartitionedCall/Identity_1:0 , StatefulPartitionedCall/Identity_1 , Func/StatefulPartitionedCall/output/_28:0 , StatefulPartitionedCall/model_4/ner_dense/BiasAdd , Identity_1 , Func/StatefulPartitionedCall/output/_28 , Identity_1:0 , StatefulPartitionedCall/model_4/ner_dense/BiasAdd:0 (node: StatefulPartitionedCall/model_4/ner_dense/BiasAdd) : f32 / [...] / [1,256,146]
[ INFO ]     StatefulPartitionedCall/model_4/tuple_dense/BiasAdd , Func/StatefulPartitionedCall/output/_29:0 , StatefulPartitionedCall/Identity_2:0 , Identity_2:0 , Identity_2 , StatefulPartitionedCall/Identity_2 , StatefulPartitionedCall/model_4/tuple_dense/BiasAdd:0 , Func/StatefulPartitionedCall/output/_29 (node: StatefulPartitionedCall/model_4/tuple_dense/BiasAdd) : f32 / [...] / [1,256,128]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 270.89 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: TensorFlow_Frontend_IR
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 4
[ INFO ]   NUM_STREAMS: 4
[ INFO ]   AFFINITY: Affinity.CORE
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: False
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: PerformanceMode.THROUGHPUT
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'Func/StatefulPartitionedCall/input/_1'!. This input will be filled with random values!
[ WARNING ] No input files were given for input 'Func/StatefulPartitionedCall/input/_0'!. This input will be filled with random values!
[ INFO ] Fill input 'Func/StatefulPartitionedCall/input/_1' with random values
[ INFO ] Fill input 'Func/StatefulPartitionedCall/input/_0' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests, limits: 1000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 25.75 ms
[Step 11/11] Dumping statistics report
[ INFO ] Count:            740 iterations
[ INFO ] Duration:         1007.30 ms
[ INFO ] Latency:
[ INFO ]    Median:        5.13 ms
[ INFO ]    Average:       5.32 ms
[ INFO ]    Min:           4.69 ms
[ INFO ]    Max:           11.77 ms
[ INFO ] Throughput:   734.63 FPS
riverlijunjie commented 1 year ago

If the input is [1,256] [1,256], then the outputs will be [1,256,146] [1,256,146] [1,256,128], i.e. batch = 1.

In your case the inputs are [36, 37] [36, 37], so the expected outputs are [36, 37, 146] (194472 elements), [36, 37, 146] and [36, 37, 128]. But the actual output has 138116 = 146 * 946 elements (not sure what shape that is). Could you add the call below to get the actual shape:

ov_tensor_get_shape(const ov_tensor_t* tensor, ov_shape_t* shape);

Please also try the batch = 1 case and check the output shape.
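Until the real shape is printed, the flat size alone narrows the possibilities. As a sketch (hypothetical names, plain C, no OpenVINO calls), the function below lists every (batch, seq_len) pair whose product equals a given row count and that fits within given upper bounds, such as the {1, 150} and {1, 512} limits set in LoadModel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int64_t batch;
    int64_t seq_len;
} shape2d;

/* Finds (batch, seq_len) pairs with batch * seq_len == rows,
 * 1 <= batch <= max_batch and 1 <= seq_len <= max_seq.
 * Stores at most `cap` pairs in `out`, in increasing batch order,
 * and returns the total number of pairs found. */
size_t candidate_shapes(int64_t rows, int64_t max_batch, int64_t max_seq,
                        shape2d *out, size_t cap)
{
    assert(out != NULL || cap == 0);
    size_t found = 0;
    for (int64_t b = 1; b <= max_batch && b <= rows; b++) {
        if (rows % b != 0) {
            continue;
        }
        int64_t s = rows / b;
        if (s > max_seq) {
            continue;
        }
        if (found < cap) {
            out[found].batch = b;
            out[found].seq_len = s;
        }
        found++;
    }
    return found;
}
```

For 946 rows with bounds 150 and 512, this yields five candidates: (2, 473), (11, 86), (22, 43), (43, 22) and (86, 11). None of them is (36, 37), so the output belongs to some other input shape.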

xyangk commented 1 year ago

Pure C failed after several steps:

$ gcc hello_openvino_t.c -lopenvino_c -o hello_openvino_t -g
$ ./hello_openvino_t

---- OpenVINO INFO----
Description : OpenVINO Runtime
Build number: 2022.3.0-9052-9752fafe8eb-releases/2022/3
ov_core_create success, address: 0x898c40
ov_core_read_model success from /data/xiaoyang/learn_go/optimized_model/1/optimized_model.xml, address: 0x898ee0
ov_core_compile_model success, address: 0x9122c0
ner_dense_size 4746168, batch_size: 84, seq_len: 387, right: 1
ner_dense_size 1287136, batch_size: 58, seq_len: 152, right: 1
ner_dense_size 490560, batch_size: 42, seq_len: 80, right: 1
ner_dense_size 308790, batch_size: 47, seq_len: 45, right: 1
ner_dense_size 671892, batch_size: 39, seq_len: 118, right: 1
ner_dense_size 1374736, batch_size: 22, seq_len: 428, right: 1
ner_dense_size 3944336, batch_size: 88, seq_len: 307, right: 1
ner_dense_size 4505268, batch_size: 74, seq_len: 417, right: 1
ner_dense_size 476982, batch_size: 9, seq_len: 363, right: 1
ner_dense_size 3597440, batch_size: 55, seq_len: 448, right: 1
ner_dense_size 1867194, batch_size: 49, seq_len: 261, right: 1
ner_dense_size 31536, batch_size: 36, seq_len: 6, right: 1
ner_dense_size 1651260, batch_size: 39, seq_len: 290, right: 1
ner_dense_size 3202656, batch_size: 48, seq_len: 457, right: 1
ner_dense_size 863298, batch_size: 73, seq_len: 81, right: 1
ner_dense_size 4521328, batch_size: 79, seq_len: 392, right: 1
ner_dense_size 2569600, batch_size: 80, seq_len: 220, right: 1
ner_dense_size 494940, batch_size: 10, seq_len: 339, right: 1
ner_dense_size 3740958, batch_size: 73, seq_len: 351, right: 1
ner_dense_size 1836388, batch_size: 38, seq_len: 331, right: 1
ner_dense_size 4602650, batch_size: 97, seq_len: 325, right: 1
ner_dense_size 432744, batch_size: 57, seq_len: 52, right: 1
ner_dense_size 93002, batch_size: 49, seq_len: 13, right: 1
ner_dense_size 3836442, batch_size: 57, seq_len: 461, right: 1
ner_dense_size 4336200, batch_size: 99, seq_len: 300, right: 1
ner_dense_size 114172, batch_size: 17, seq_len: 46, right: 1
ner_dense_size 583854, batch_size: 31, seq_len: 129, right: 1
ner_dense_size 1557090, batch_size: 27, seq_len: 395, right: 1
ner_dense_size 430700, batch_size: 59, seq_len: 50, right: 1
ner_dense_size 1593152, batch_size: 32, seq_len: 341, right: 1
ner_dense_size 256960, batch_size: 5, seq_len: 352, right: 1
ner_dense_size 1290640, batch_size: 34, seq_len: 260, right: 1
ner_dense_size 1242022, batch_size: 47, seq_len: 181, right: 1
ner_dense_size 2757356, batch_size: 38, seq_len: 497, right: 1
ner_dense_size 1592276, batch_size: 41, seq_len: 266, right: 1
ner_dense_size 263676, batch_size: 42, seq_len: 43, right: 1
ner_dense_size 4118222, batch_size: 67, seq_len: 421, right: 1
ner_dense_size 3951636, batch_size: 78, seq_len: 347, right: 1
ner_dense_size 2933140, batch_size: 82, seq_len: 245, right: 1
ner_dense_size 144540, batch_size: 6, seq_len: 165, right: 1
ner_dense_size 411866, batch_size: 13, seq_len: 217, right: 1
ner_dense_size 3582986, batch_size: 97, seq_len: 253, right: 1
ner_dense_size 250974, batch_size: 9, seq_len: 191, right: 1
ner_dense_size 1306700, batch_size: 25, seq_len: 358, right: 1
ner_dense_size 2854592, batch_size: 47, seq_len: 416, right: 1
ner_dense_size 1854492, batch_size: 58, seq_len: 219, right: 1
ner_dense_size 2417760, batch_size: 80, seq_len: 207, right: 1
ner_dense_size 1459416, batch_size: 28, seq_len: 357, right: 1
ner_dense_size 3642700, batch_size: 50, seq_len: 499, right: 1
ner_dense_size 4088, batch_size: 2, seq_len: 14, right: 1
ner_dense_size 2980444, batch_size: 59, seq_len: 346, right: 1
ner_dense_size 52560, batch_size: 1, seq_len: 360, right: 1
ner_dense_size 3699640, batch_size: 70, seq_len: 362, right: 1
ner_dense_size 6546640, batch_size: 95, seq_len: 472, right: 1
ner_dense_size 4299700, batch_size: 95, seq_len: 310, right: 1
ner_dense_size 659190, batch_size: 21, seq_len: 215, right: 1
ner_dense_size 554800, batch_size: 19, seq_len: 200, right: 1
ner_dense_size 1217932, batch_size: 43, seq_len: 194, right: 1
ner_dense_size 5732398, batch_size: 79, seq_len: 497, right: 1
ner_dense_size 508810, batch_size: 17, seq_len: 205, right: 1
ner_dense_size 284700, batch_size: 39, seq_len: 50, right: 1
ner_dense_size 1467300, batch_size: 30, seq_len: 335, right: 1
ner_dense_size 4298240, batch_size: 92, seq_len: 320, right: 1
ner_dense_size 5360244, batch_size: 87, seq_len: 422, right: 1
ner_dense_size 2045022, batch_size: 29, seq_len: 483, right: 1
Segmentation fault

Here is my toy C file, hello_openvino_t.c:


// #include <opencv_c_wrapper.h>
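#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h> // gettimeofday, struct timeval
#include <time.h>     // time, used to seed srand
#include "hello_openvino.h"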

struct infer_result
{
    size_t class_id;
    float probability;
};

ov_compiled_model_t *LoadModel(const char *IRModelPath)
{
    ov_core_t *core = NULL;
    ov_model_t *model = NULL;
    ov_partial_shape_t partial_shape;
    ov_compiled_model_t *compiled_model = NULL;

    // core
    ov_core_create(&core);
    printf("ov_core_create success, address: %p\n", core);

    // read model
    ov_core_read_model(core, IRModelPath, NULL, &model);
    printf("ov_core_read_model success from %s, address: %p\n", IRModelPath, model);

    // set property
    const char* key = ov_property_key_hint_performance_mode;
    // const char* value="THROUGHPUT";
    const char* value="LATENCY";

    // dynamic shape
    ov_dimension_t ddims[2] = {{1, 150}, {1, 512}};
    ov_partial_shape_create(2, ddims, &partial_shape);
    ov_model_reshape_input_by_name(model, "input_token", partial_shape);
    ov_model_reshape_input_by_name(model, "input_segment", partial_shape);

    // compile model
    ov_core_compile_model(core, model, "CPU", 0, &compiled_model, key, value);
    printf("ov_core_compile_model success, address: %p\n", compiled_model);

    // free
    ov_partial_shape_free(&partial_shape);
    ov_model_free(model);
    ov_core_free(core);

    return compiled_model;
}

struct ner_tuple_result DoInference(ov_infer_request_t *infer_request, float *token_ids, float *senment_ids, int batch_size, int seq_len)
{
    ov_shape_t input_token_shape;
    ov_shape_t input_segment_shape;
    ov_tensor_t *token_tensor = NULL;
    ov_tensor_t *segment_tensor = NULL;
    ov_tensor_t *ner_dense_output_tensor = NULL;
    ov_tensor_t *tuple_dense_output_tensor = NULL;

    // input data type
    ov_element_type_e input_token_type = F32; // F32
    ov_element_type_e input_segment_type = F32;

    // create input tensor
    const int batch_size_ = batch_size;
    const int seq_len_ = seq_len;
    float ids[batch_size_][seq_len_];
    float segs[batch_size_][seq_len_];
    int i, j;
    for (i = 0; i < batch_size_; i++)
    {
        for (j = 0; j < seq_len_; j++)
        {
            ids[i][j] = token_ids[i * batch_size_ + j];
            segs[i][j] = senment_ids[i * batch_size_ + j];
        }
    };
    int64_t dims[2] = {batch_size_, seq_len_};
    ov_shape_create(2, dims, &input_token_shape);
    ov_shape_create(2, dims, &input_segment_shape);
    ov_tensor_create_from_host_ptr(input_token_type, input_token_shape, ids, &token_tensor);
    ov_tensor_create_from_host_ptr(input_segment_type, input_segment_shape, segs, &segment_tensor);

    // set input tensor to infer request
    ov_infer_request_set_tensor(infer_request, "input_token", token_tensor);
    ov_infer_request_set_tensor(infer_request, "input_segment", segment_tensor);

    // start
    // ov_infer_request_start_async(infer_request);
    // ov_infer_request_wait(infer_request);
    ov_infer_request_infer(infer_request);

    // get output tensor
    ov_infer_request_get_output_tensor_by_index(infer_request, 1, &ner_dense_output_tensor);
    ov_infer_request_get_output_tensor_by_index(infer_request, 2, &tuple_dense_output_tensor);

    // get output data
    void *ner_dense_data = NULL;
    ov_tensor_data(ner_dense_output_tensor, &ner_dense_data);
    void *tuple_dense_data = NULL;
    ov_tensor_data(tuple_dense_output_tensor, &tuple_dense_data);

    float *ner_dense_float_data = (float *)(ner_dense_data);
    float *tuple_dense_float_data = (float *)(tuple_dense_data);

    // get output data size
    size_t ner_dense_size;
    ov_tensor_get_size(ner_dense_output_tensor, &ner_dense_size);
    size_t tuple_dense_size;
    ov_tensor_get_size(tuple_dense_output_tensor, &tuple_dense_size);

    // return struct
    struct ner_tuple_result result;
    result.ner_dense = ner_dense_float_data;
    result.tuple_dense = tuple_dense_float_data;
    result.ner_flat_size = ner_dense_size;
    result.tuple_flat_size = tuple_dense_size;
    printf("ner_dense_size %zd, batch_size: %d, seq_len: %d, right: %d\n", ner_dense_size, batch_size_, seq_len_, ner_dense_size == batch_size_ * seq_len_ * 146);

    // free
    ov_tensor_free(ner_dense_output_tensor);
    ov_tensor_free(tuple_dense_output_tensor);
    ov_tensor_free(segment_tensor);
    ov_tensor_free(token_tensor);
    ov_shape_free(&input_token_shape);
    ov_shape_free(&input_segment_shape);

    return result;
}

void create_data_and_infer(ov_infer_request_t *infer_request, int batch_size, int seq_len){
    srand((unsigned)time( NULL ) );
    float ids[batch_size][seq_len];
    float segs[batch_size][seq_len];
    int i0, j0;
    for (i0 = 0; i0 < batch_size; i0++)
    {
        for (j0 = 0; j0 < seq_len; j0++)
        {
            segs[i0][j0] = 0.0;
        }
    };

    for (i0 = 1; i0 < batch_size; i0++)
    {
        for (j0 = 0; j0 < seq_len; j0++)
        {
            ids[i0][j0] = (float)(rand()%7000);
        }
    };
    DoInference(infer_request, (float *) ids, (float *) segs, batch_size, seq_len);
}

void main()
{
    ov_version_t version;
    ov_get_openvino_version(&version);
    printf("---- OpenVINO INFO----\n");
    printf("Description : %s \n", version.description);
    printf("Build number: %s \n", version.buildNumber);
    ov_version_free(&version);

    // printf("%d\n", OpenVINORun());
    ov_compiled_model_t * compiled_model = LoadModel("/data/xiaoyang/learn_go/optimized_model/1/optimized_model.xml");
    // testRun(compiled_model);

    ov_infer_request_t *infer_request = NULL;
    ov_compiled_model_create_infer_request(compiled_model, &infer_request);

    struct timeval stop, start;
    gettimeofday(&start, NULL);
    //do stuff
    int i0, j0;
    int ct = 1000;
    int batch_size, seq_len;
    for (i0 = 0; i0 < ct; i0++){
        batch_size = rand()%100+1;
        seq_len = rand()%500+1;
        create_data_and_infer(infer_request, batch_size, seq_len);
    }
    gettimeofday(&stop, NULL);
    printf("took %lu ms, count: %d\n", (stop.tv_sec - start.tv_sec) * 1000 + (stop.tv_usec - start.tv_usec)/1000, ct); 

    ov_infer_request_free(infer_request);
}

My model: optimized_model.zip

riverlijunjie commented 1 year ago

Could you also provide header file "hello_openvino.h", so I can reproduce and debug in my machine?

xyangk commented 1 year ago

Here is the actual output shape, obtained with ov_tensor_get_shape as suggested:

TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it
ner_dense_shape rank: 3, dims: [27, 48, 146]
C bad ner_dense_size 189216, batch_size: 36, seq_len: 37

The input shape, however, is (36, 37).

xyangk commented 1 year ago

Sure, here is hello_openvino.h:

#include "openvino/c/openvino.h"
int OpenVINORun();

ov_compiled_model_t *LoadModel(const char* IRModelPath);

struct ner_tuple_result
{
    float *ner_dense;
    float *tuple_dense;
    size_t ner_flat_size;
    size_t tuple_flat_size;
};

void create_data_and_infer(ov_infer_request_t *infer_request, int batch_size, int seq_len);

struct ner_tuple_result DoInference(ov_infer_request_t *infer_request, float *token_ids, float *senment_ids, int batch_size, int seq_len);
xyangk commented 1 year ago

With batch_size = 1, everything works fine for 5000 inferences.
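One possible explanation for why batch size 1 behaves differently lies in the copy loop of DoInference, which indexes the flat input with i * batch_size_ + j. For a row-major [batch, seq_len] buffer the offset would normally be i * seq_len + j. The sketch below (hypothetical function name) checks whether the two expressions ever disagree for a given shape.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if using `batch` as the row stride visits exactly the same flat
 * offsets as using `seq_len`, for every (i, j) in a [batch, seq_len] grid. */
bool strides_agree(size_t batch, size_t seq_len)
{
    assert(batch >= 1 && seq_len >= 1);
    for (size_t i = 0; i < batch; i++) {
        for (size_t j = 0; j < seq_len; j++) {
            if (i * batch + j != i * seq_len + j) {
                return false;
            }
        }
    }
    return true;
}
```

With batch 1 the only row is i = 0, so both expressions reduce to j and the loop copies the right data. The same holds whenever batch equals seq_len. For (36, 37) the offsets already differ at i = 1.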

riverlijunjie commented 1 year ago

@xyangk I reproduced the issue, and a quick debugging session showed that the bug is in the example code you provided: there is an out-of-bounds memory access. Here is my fix:

    for (i = 0; i < batch_size_; i++)
    {
        for (j = 0; j < seq_len_; j++)
        {
            ids[i][j] = token_ids[i * batch_size_ + j];
            segs[i][j] = senment_ids[i * batch_size_ + j];
        }
    };

change to:

    for (i = 0; i < batch_size_; i++)
    {
        for (j = 0; j < seq_len_; j++)
        {
            ids[i][j] = token_ids[i * seq_len_ + j];
            segs[i][j] = senment_ids[i * seq_len_ + j];
        }
    };
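The corrected index follows the usual row-major rule: element (i, j) of a [rows, row_len] array stored flat lives at offset i * row_len + j, so the stride is the length of a row, not the number of rows. A minimal sketch with a hypothetical accessor:

```c
#include <assert.h>
#include <stddef.h>

/* Reads element (i, j) from a flat row-major buffer whose rows
 * each hold `row_len` floats. */
float row_major_at(const float *flat, size_t row_len, size_t i, size_t j)
{
    assert(flat != NULL && j < row_len);
    return flat[i * row_len + j];
}
```

For a 2 x 3 buffer {0, 1, 2, 3, 4, 5}, row_major_at(flat, 3, 1, 0) returns 3, the first element of the second row. Using the row count 2 as the stride instead would read flat[2], which is still part of the first row.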

After fixing the issue above, the test example runs successfully:

...
ner_dense_size 821688, batch_size: 42, seq_len: 134, right: 1
ner_dense_size 2705526, batch_size: 71, seq_len: 261, right: 1
ner_dense_size 273312, batch_size: 18, seq_len: 104, right: 1
ner_dense_size 3579336, batch_size: 54, seq_len: 454, right: 1
ner_dense_size 2336, batch_size: 2, seq_len: 8, right: 1
ner_dense_size 3016944, batch_size: 56, seq_len: 369, right: 1
ner_dense_size 2538210, batch_size: 57, seq_len: 305, right: 1
ner_dense_size 296088, batch_size: 12, seq_len: 169, right: 1
ner_dense_size 57816, batch_size: 3, seq_len: 132, right: 1
ner_dense_size 138116, batch_size: 2, seq_len: 473, right: 1
ner_dense_size 277400, batch_size: 5, seq_len: 380, right: 1
ner_dense_size 11826, batch_size: 9, seq_len: 9, right: 1
ner_dense_size 375804, batch_size: 78, seq_len: 33, right: 1
ner_dense_size 1193988, batch_size: 47, seq_len: 174, right: 1
ner_dense_size 2976210, batch_size: 45, seq_len: 453, right: 1
ner_dense_size 462528, batch_size: 22, seq_len: 144, right: 1
ner_dense_size 2182554, batch_size: 99, seq_len: 151, right: 1
ner_dense_size 2420096, batch_size: 64, seq_len: 259, right: 1
ner_dense_size 82928, batch_size: 4, seq_len: 142, right: 1
ner_dense_size 352590, batch_size: 35, seq_len: 69, right: 1
took 296574 ms, count: 1000

So I think there is no bug in OpenVINO, right?
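Whether the original loop actually left the source buffer depends on the shape. The largest offset it touches is (batch - 1) * stride + (seq_len - 1), and the buffer holds batch * seq_len elements. The sketch below (hypothetical names) compares the two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Largest flat offset read when rows are stepped by `stride`. */
size_t max_offset(size_t batch, size_t seq_len, size_t stride)
{
    assert(batch >= 1 && seq_len >= 1);
    return (batch - 1) * stride + (seq_len - 1);
}

/* True if stepping rows by `stride` reads past a buffer that
 * holds batch * seq_len elements. */
bool reads_past_end(size_t batch, size_t seq_len, size_t stride)
{
    return max_offset(batch, seq_len, stride) >= batch * seq_len;
}
```

With the batch size as stride, a shape such as (36, 6) from the pure C log reaches offset 1265 in a 216-element buffer. A shape such as (84, 387) stays in bounds and copies the wrong elements instead. With seq_len as the stride the last offset is always batch * seq_len - 1.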

xyangk commented 1 year ago

Yeah, it was my mistake, thank you so much. It works in C now. But the problem still exists in Go (after fixing this bug):

TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it
ner_dense_shape rank: 3, dims: [49, 26, 146]
C bad ner_dense_size 186004, batch_size: 26, seq_len: 48

I will review my code first. Here is how I import the C code in Go (ov_engine is actually built from hello_openvino):

/*
#cgo CFLAGS: -g -Wall
#cgo CFLAGS: -I${SRCDIR}/openvino_c
#cgo LDFLAGS: -L${SRCDIR}/../../lib/openvino -lopenvino_c -lopenvino -lov_engine -ltbb
#include <stdlib.h>
#include "ov_engine.h"
#include "openvino/c/openvino.h"
*/
xyangk commented 1 year ago

I printed the shapes of ids and token_tensor, which are fed into the model:

    // create input tensor
    const int batch_size_ = batch_size;
    const int seq_len_ = seq_len;
    float ids[batch_size_][seq_len_];
    float segs[batch_size_][seq_len_];
    int i, j;
    for (i = 0; i < batch_size_; i++)
    {
        for (j = 0; j < seq_len_; j++)
        {
            ids[i][j] = token_ids[i * seq_len_ + j];
            segs[i][j] = senment_ids[i * seq_len_ + j];
        }
    };
    int64_t dims[2] = {batch_size_, seq_len_};
    ov_shape_create(2, dims, &input_token_shape);
    ov_shape_create(2, dims, &input_segment_shape);
    ov_tensor_create_from_host_ptr(input_token_type, input_token_shape, ids, &token_tensor);
    ov_tensor_create_from_host_ptr(input_segment_type, input_segment_shape, segs, &segment_tensor);

    ov_shape_t token_tensor_shape;
    ov_tensor_get_shape(token_tensor, &token_tensor_shape);

    // set input tensor to infer request
    ov_infer_request_set_tensor(infer_request, "input_token", token_tensor);
    ov_infer_request_set_tensor(infer_request, "input_segment", segment_tensor);

    // start
    // ov_infer_request_start_async(infer_request);
    // ov_infer_request_wait(infer_request);
    ov_infer_request_infer(infer_request);

    // get output tensor
    ov_infer_request_get_output_tensor_by_index(infer_request, 1, &ner_dense_output_tensor);
    ov_infer_request_get_output_tensor_by_index(infer_request, 2, &tuple_dense_output_tensor);

    // get output data
    void *ner_dense_data = NULL;
    ov_tensor_data(ner_dense_output_tensor, &ner_dense_data);
    void *tuple_dense_data = NULL;
    ov_tensor_data(tuple_dense_output_tensor, &tuple_dense_data);

    float *ner_dense_float_data = (float *)(ner_dense_data);
    float *tuple_dense_float_data = (float *)(tuple_dense_data);

    // get output data size
    size_t ner_dense_size;
    ov_shape_t ner_dense_shape;
    ov_tensor_get_size(ner_dense_output_tensor, &ner_dense_size);
    ov_tensor_get_shape(ner_dense_output_tensor, &ner_dense_shape);
    size_t tuple_dense_size;
    ov_tensor_get_size(tuple_dense_output_tensor, &tuple_dense_size);

    // return struct
    struct ner_tuple_result result;
    result.ner_dense = ner_dense_float_data;
    result.tuple_dense = tuple_dense_float_data;
    result.ner_flat_size = ner_dense_size;
    result.tuple_flat_size = tuple_dense_size;
    // compare against the expected element count; size_t values use %zu, int64_t fields are cast for %lld
    if (ner_dense_size != (size_t)batch_size_ * seq_len_ * 146){
        printf("C array shape:  [%zu, %zu] \n", sizeof(ids)/sizeof(ids[0]), sizeof(ids[0])/sizeof(ids[0][0]));
        printf("token_tensor_shape rank: %lld, dims: [%lld, %lld]\n", (long long)token_tensor_shape.rank, (long long)token_tensor_shape.dims[0], (long long)token_tensor_shape.dims[1]);
        printf("ner_dense_shape rank: %lld, dims: [%lld, %lld, %lld]\n", (long long)ner_dense_shape.rank, (long long)ner_dense_shape.dims[0], (long long)ner_dense_shape.dims[1], (long long)ner_dense_shape.dims[2]);
        printf("C bad ner_dense_size %zu, parameter batch_size: %d, parameter seq_len: %d \n", ner_dense_size, batch_size_, seq_len_);
    }

But I get:

TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it
C array shape:  [36, 37]
token_tensor_shape rank: 2, dims: [36, 37]
ner_dense_shape rank: 3, dims: [40, 33, 146]
C bad ner_dense_size 192720, parameter batch_size: 36, parameter seq_len: 37
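
A quick way to sanity-check logs like the one above is to recompute the element count from the printed dims. The sketch below is only an illustration; `shape_elements` is a hypothetical helper, not part of the OpenVINO C API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of elements in a tensor with the given dims.
   Returns 0 if dims is NULL, rank is 0, or any dimension is negative. */
size_t shape_elements(const int64_t *dims, size_t rank) {
    if (dims == NULL || rank == 0) return 0;
    size_t count = 1;
    for (size_t i = 0; i < rank; i++) {
        if (dims[i] < 0) return 0; /* treat a negative dimension as invalid */
        count *= (size_t)dims[i];
    }
    return count;
}
```

With the numbers from this log, [40, 33, 146] gives 192720, which matches the reported ner_dense_size, while the expected [36, 37, 146] gives 194472. So the size and the shape agree with each other; they just do not belong to this input.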
riverlijunjie commented 1 year ago

The expected output shape is [36, 37, 146], but the actual output shape is [40, 33, 146]. The actual shape does not look like corruption, so my guess is that it is data left over from the previous iteration and that the current iteration hit an error you did not catch. Could you check whether the return value of each function below is ov_status_e::OK?

ov_infer_request_get_output_tensor_by_index(infer_request, 1, &ner_dense_output_tensor);
ov_tensor_get_size(ner_dense_output_tensor, &ner_dense_size);
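
A minimal sketch of that kind of check, assuming nothing about the real API: `demo_status`, `fake_infer`, and `fake_get_output_size` are stand-ins invented for illustration, and only the two status values quoted in this thread are modelled. The point is to return at the first failing call so that later calls never read a tensor left over from an earlier request.

```c
#include <stdio.h>

/* Stand-in for ov_status_e; only the values quoted in this thread. */
typedef enum { DEMO_OK = 0, DEMO_UNKNOW_EXCEPTION = -17 } demo_status;

/* Return from the enclosing function at the first non-OK status. */
#define RETURN_IF_ERROR(expr)                                         \
    do {                                                              \
        demo_status status_ = (expr);                                 \
        if (status_ != DEMO_OK) {                                     \
            fprintf(stderr, "[ERROR] return status %d, line %d\n",    \
                    (int)status_, __LINE__);                          \
            return status_;                                           \
        }                                                             \
    } while (0)

/* Hypothetical stand-ins for the infer and get-size calls. */
static demo_status fake_infer(int should_fail) {
    return should_fail ? DEMO_UNKNOW_EXCEPTION : DEMO_OK;
}

static demo_status fake_get_output_size(size_t *size) {
    *size = (size_t)36 * 37 * 146;
    return DEMO_OK;
}

/* One request: the output size is only filled in if inference succeeded. */
demo_status run_once(int should_fail, size_t *out_size) {
    *out_size = 0;
    RETURN_IF_ERROR(fake_infer(should_fail));
    RETURN_IF_ERROR(fake_get_output_size(out_size));
    return DEMO_OK;
}
```

When the inference step fails, `run_once` reports the status and leaves the size at 0 instead of handing back whatever the previous request produced.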
xyangk commented 1 year ago

I checked these two functions and no error occurred, but the input values (ids) look strange, with too many zeros:

TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it
C array shape:  [36, 37]
token_tensor_shape rank: 2, dims: [36, 37]
segment_tensor_shape rank: 2, dims: [36, 37]
ner_dense_shape rank: 3, dims: [10, 92, 146]
C bad ner_dense_size 134320, parameter batch_size: 36, parameter seq_len: 37
101 2552 4518 113 1963 1745 114 155 145 154 134 129 119 126 1330 5101 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 101 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 511 102 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 101 100 100
100 100 3198 7313 131 123 121 122 129 118 122 121 118 122 123 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 101 100 100 100 100 4680 1184 3633 1762 3780 4545 4638 4565 4567 1350 4500 5790 131 511 102 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 101 100 100 100 100 3198
7313 131 123 121 122 129 118 122 121 118 122 129 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 101 100 100 100 100 3198 7313 131 123 121 122 129 118 122 121 118 121 130 511 102 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 101 7755 6574 4541 3351 4568 117 5384 4667
1059 121 119 123 126 220 149 100 159 146 132 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 101 7770 2228 7000 6117 4568 117 1166 1650 7003 121 119 121 126 149 100 159 157 146 132 102 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Actual input in Go was:

[[101 2552 4518 113 1963 1745 114 155 145 154 134 129 119 126 1330 5101 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [101 100 100 100 100 3198 7313 131 123 121 122 129 118 122 121 118 122 123 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 100 100 100 100 4680 1184 3633 1762 3780 4545 4638 4565 4567 1350 4500 5790 131 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 100 100 100 100 3198 7313 131 123 121 122 129 118 122 121 118 122 129 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[101 100 100 100 100 3198 7313 131 123 121 122 129 118 122 121 118 121 130 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[101 7755 6574 4541 3351 4568 117 5384 4667 1059 121 119 123 126 220 149 100 159 146 132 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 7770 2228 7000 6117 4568 117 1166 1650 7003 121 119 121 126 149 100 159 157 146 132 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 100 100 100 100 5041 1399 131 100 100 100 100 100 100 100 100 100 100 100 100 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 100 100 100 100 5041 1399 131 100 100 100 100 100 100 100 100 100 100 100 100 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 155 163 160 158 150 167 2519 7346 2595 117 5498 7568 7474 5549 1726 3837 2519 7346 2595 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 100 100 100 100 5041 1399 131 100 100 100 100 100 100 100 100 100 100 100 100 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 2714 2595 1495 1644 117 4491 5770 4275 158 160 156 510 2487 1213 3355 3349 7463 158 160 156 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 1352 678 5501 6768 2428 1377 1138 2595 3717 5514 117 1352 904 6639 5520 1220 5549 3011 1220 6772 2483 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 5498 3843 7509 4518 2100 1762 117 5498 677 4518 855 754 1381 7219 7755 704 5296 5018 164 5490 7313 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 4868 2562 3926 3251 117 5632 712 860 855 117 7481 2159 3187 2460 2382 117 680 1278 4495 1394 868 511 102 0 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 1352 5511 1461 1429 7509 3926 117 3313 7319 1350 2397 3969 2595 1574 7509 117 3187 5541 5606 3040 3092 7509 132 102 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 1987 1987 2642 3300 7770 6117 1327 117 1415 6371 2157 3184 6890 837 4567 1380 1350 5102 849 4565 4567 1380 511 102 0 0 0 0 0 0 0 0 0 0 0 0] 
[101 3187 5491 5489 5848 5367 117 3187 7474 5549 3289 2476 117 3300 1352 678 5501 6768 2428 1377 1138 2595 3717 5514 511 102 0 0 0 0 0 0 0 0 0 0 0] 
[101 1415 6371 5498 4142 510 5310 3417 510 161 143 160 161 510 4896 3837 2697 1380 1350 2166 1147 2970 6239 1380 511 102 0 0 0 0 0 0 0 0 0 0 0] 
[101 5550 3393 4495 4415 2482 3289 2100 1762 117 3833 1220 3187 1358 7361 117 3187 1327 4578 117 3187 1371 1140 4578 511 102 0 0 0 0 0 0 0 0 0 0 0] 
[101 5541 2444 131 3187 4535 2501 117 6817 1220 2190 4917 117 3187 5541 7755 1327 4578 117 3187 4649 678 3698 5514 511 102 0 0 0 0 0 0 0 0 0 0 0] 
[101 3867 1265 2595 3971 4550 510 5517 7608 5052 6819 3837 4567 117 3821 6612 1046 123 121 155 149 100 159 146 132 102 0 0 0 0 0 0 0 0 0 0 0] 
[101 1355 5509 3633 2382 117 5852 1075 5679 1962 117 6716 7770 122 127 121 1330 5101 117 860 7028 129 122 1062 3165 511 102 0 0 0 0 0 0 0 0 0 0] 
[101 1724 5501 3187 4535 2501 117 1068 5688 3187 5273 5514 1350 1327 4578 117 712 1220 3833 1220 3633 2382 117 3187 3348 4307 2900 6644 117 511 102 0 0 0 0 0 0] 
[101 100 100 100 100 100 100 100 5466 689 131 5466 1447 100 100 100 100 100 100 100 100 100 100 1377 7479 2595 131 1377 7479 511 102 0 0 0 0 0 0] 
[101 1928 7565 131 1912 2501 3187 4535 2501 117 3187 1259 1779 117 3688 1355 1146 2357 1772 1258 117 7565 7755 3187 1327 4578 117 3187 4605 4575 511 102 0 0 0 0 0] 
[101 5455 131 5455 2444 1912 2501 3633 2382 117 1912 5455 6887 3187 1146 3789 4289 117 745 4960 3187 1327 4578 117 1420 1213 5110 3844 3633 2382 511 102 0 0 0 0 0] 
[101 6818 122 3299 865 2642 5442 6117 5131 2971 1169 679 4007 2692 117 4958 5592 1285 5635 130 172 122 121 155 155 157 154 120 154 2340 1381 511 102 0 0 0 0] 
[101 100 100 100 100 100 100 100 3696 3184 131 4007 3184 100 100 100 100 100 100 100 100 100 100 100 4567 1380 1360 6835 782 131 3315 782 511 102 0 0 0] 
[101 4868 3926 510 5125 4868 1377 117 3187 4007 3299 5567 117 3187 4706 4413 4960 1139 510 1138 7379 117 3313 1350 4508 4307 5593 5514 1920 1350 5310 5688 132 102 0 0 0] 
[101 100 100 100 100 100 100 100 2399 7977 131 129 128 2259 100 100 100 100 100 100 100 100 100 100 100 2042 2012 4307 1105 131 2347 2042 511 102 0 0 0] 
[101 100 100 100 100 100 100 100 1139 4495 1765 131 1266 776 2356 100 100 100 100 100 100 100 100 100 7357 6835 782 1068 5143 131 3315 782 511 102 0 0 0] 
[101 1498 131 1498 3187 1041 6117 117 2647 7420 1795 2233 704 117 2793 3425 860 1352 904 3187 5514 1920 117 3187 5555 2595 1146 3789 4289 117 1355 7509 3926 3251 511 102 0] 
[101 3634 1400 2642 5442 5632 6401 4958 5592 6117 5131 7360 5635 126 172 127 155 155 157 154 120 154 3717 2398 510 3313 6226 2526 4664 3844 7623 1400 6117 5131 511 102 0] 
[101 2552 4372 126 129 3613 120 1146 510 2526 7970 117 1392 4480 5606 1420 6402 1277 3313 7319 1350 3325 7509 1350 7583 1912 2552 7509 117 3187 2552 1259 3040 3092 7509 132 102]]

Maybe something goes wrong when I pass the Go data to C. I'm going to debug it.
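
One observation from the two dumps: in the C output, printed 37 values per line, the second Go row starts at flat offset 128 (line 4, column 18), the third at 256, and the fourth at 384. That pattern is what one would see if the source buffer stored rows 128 elements apart while the C loop indexed it as i * seq_len_ + j with seq_len_ = 37. The thread does not confirm that this was the cause, so the sketch below is only an assumption-based illustration; `copy_rows_dense` and `src_stride` are hypothetical names.

```c
#include <stddef.h>

/* Copy a dense [batch, seq_len] block out of a buffer whose rows are
   src_stride elements apart. Returns 0 on success, -1 on bad arguments. */
int copy_rows_dense(const float *src, size_t src_stride,
                    float *dst, size_t batch, size_t seq_len) {
    if (src == NULL || dst == NULL || seq_len > src_stride) return -1;
    for (size_t i = 0; i < batch; i++) {
        for (size_t j = 0; j < seq_len; j++) {
            /* read with the source stride, write with the dense stride */
            dst[i * seq_len + j] = src[i * src_stride + j];
        }
    }
    return 0;
}
```

If the producer and consumer agree on the stride (for example by passing it explicitly from Go), the rows land where the tensor shape says they should.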

riverlijunjie commented 1 year ago

Good finding! It looks like a memory pointer issue during the memory copy, which leads to memory being skipped and overwritten.

xyangk commented 1 year ago

Thanks for your time, I will keep this issue open until I fix it.

xyangk commented 1 year ago

@riverlijunjie Sorry to bother you again, but there is still a problem. Inference failed in Go with error code -17. I checked the inference status with CHECK_STATUS(ov_infer_request_infer(infer_request)); and got [ERROR] return status -17, line 124, so the strange output shape may have come from the previous inference.

TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it
TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it
[ERROR] return status -17, line 124
infer_request, address: 0xc75bfa0
C array shape:  [26, 48]
token_tensor_shape rank: 2, dims: [26, 48]
segment_tensor_shape rank: 2, dims: [26, 48]
ner_dense_shape rank: 3, dims: [1, 128, 146]
C bad ner_dense_size 18688, parameter batch_size: 26, parameter seq_len: 48

Line 124:

image

The memory alignment problem has been resolved and the input seems to be fine:

image

-17 means UNKNOW_EXCEPTION. I don't know where to look it up, and I can't reproduce it with the same input in C.
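
For logging, a small lookup that turns the numeric status into a name can make lines like "[ERROR] return status -17" easier to read. This is a sketch covering only the codes that appear in this thread; `demo_status_name` is a hypothetical helper, and the name given for -1 is an assumption, since the thread only shows the number.

```c
#include <stddef.h>

/* Hypothetical helper: map the status codes seen in this thread to names.
   The real status enum has more entries than are listed here. */
const char *demo_status_name(int status) {
    switch (status) {
    case 0:
        return "OK";
    case -1:
        return "GENERAL_ERROR";    /* assumption: name not stated in the thread */
    case -17:
        return "UNKNOW_EXCEPTION"; /* spelling as quoted in the thread */
    default:
        return "UNRECOGNIZED_STATUS";
    }
}
```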

riverlijunjie commented 1 year ago

@xyangk UNKNOW_EXCEPTION means an unknown exception that was not thrown by OpenVINO itself, so you may need to add some debug code to catch and print the exception information. BTW, have you built OpenVINO yourself?

One suggestion: add CHECK_STATUS to every OpenVINO API call to find the first point of failure.

xyangk commented 1 year ago

Thanks, I'm working on it. I just downloaded the OpenVINO archive and did not build it. CHECK_STATUS has been added to all OpenVINO API calls, but only inference fails.

riverlijunjie commented 1 year ago

You can apply the patch below to print exception information:

diff --git a/src/bindings/c/src/common.h b/src/bindings/c/src/common.h
index dda5513cae..06edcd5a29 100644
--- a/src/bindings/c/src/common.h
+++ b/src/bindings/c/src/common.h
@@ -39,7 +39,8 @@
     CATCH_IE_EXCEPTION(INFER_NOT_STARTED, InferNotStarted)    \
     CATCH_IE_EXCEPTION(NETWORK_NOT_READ, NetworkNotRead)      \
     CATCH_IE_EXCEPTION(INFER_CANCELLED, InferCancelled)       \
-    catch (...) {                                             \
+    catch (const std::exception& ex) {                        \
+        std::cout << "Exception: " << ex.what() << std::endl; \
         return ov_status_e::UNKNOW_EXCEPTION;                 \
     }

If you need it, I can build an engineering test version for you with this patch.

xyangk commented 1 year ago

Can you help me compile it? The version I compiled myself fails at runtime; line 103 is ov_infer_request_infer(infer_request):

---- OpenVINO INFO----
Description : OpenVINO Runtime
Build number: 2023.0.0-10168-b2a2266f603
ov_core_create success, address: 0xbe5f30
ov_core_read_model success from /data/xiaoyang/learn_go/optimized_model/1/optimized_model.xml, address: 0xbe61f0
ov_core_compile_model success, address: 0x1006da0
load model success
ov_compiled_model_create_infer_request success, address: 0xbe5f30
TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it
TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it
[ERROR] return status -1, line 103
ner_dense_shape rank: 3, dims: [0, 0, 0]
ner_dense_size 0, batch_size: 84, seq_len: 387, right: 0
TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it
TBB Warning: Exact exception propagation is requested by application but the linked library is built without support for it
[ERROR] return status -1, line 103
ner_dense_shape rank: 3, dims: [0, 0, 0]
ner_dense_size 0, batch_size: 36, seq_len: 37, right: 0

Here is how I built openvino:

cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_INTEL_GPU=OFF -DPYTHON_LIBRARY=/home/xiaoyang/.conda/envs/env/lib/ -DENABLE_SYSTEM_TBB=OFF ..
make DESTDIR=./install  install

My environment variables:

$echo $CPATH
/data/xiaoyang/openvino/build/install/usr/local/runtime/include
$echo $LIBRARY_PATH
/data/xiaoyang/openvino/build/install/usr/local/runtime/lib/intel64:/data/xiaoyang/openvino/build/install/usr/local/runtime/3rdparty/tbb/lib:/data/xiaoyang/tf211_lib/lib
$echo $LD_LIBRARY_PATH
opt/rh/devtoolset-8/root/usr/lib64:/opt/rh/devtoolset-8/root/usr/lib:/opt/rh/devtoolset-8/root/usr/lib64/dyninst:/opt/rh/devtoolset-8/root/usr/lib/dyninst:/opt/rh/devtoolset-8/root/usr/lib64:/opt/rh/devtoolset-8/root/usr/lib:/data/xiaoyang/openvino/build/install/usr/local/runtime/lib/intel64:/data/xiaoyang/openvino/build/install/usr/local/runtime/lib/intel64:/data/xiaoyang/openvino/build/install/usr/local/runtime/3rdparty/tbb/lib:/data/xiaoyang/tf211_lib/lib
riverlijunjie commented 1 year ago

OK, I can build it for you. In the meantime, could you also print the exception content in: https://github.com/openvinotoolkit/openvino/blob/6bf2fe11aeb891eb66db37932df281a982f90369/src/bindings/c/src/common.h#L18-L24

xyangk commented 1 year ago

Sorry, I'm a novice in C language. Could you explain how to use this?

riverlijunjie commented 1 year ago

You can modify it like this:

#include <iostream>

#include "openvino/core/except.hpp"
#include "openvino/openvino.hpp"

// Print the exception message before mapping it to a status code.
#define CATCH_IE_EXCEPTION(StatusCode, ExceptionType)         \
    catch (const InferenceEngine::ExceptionType& ex) {        \
        std::cout << "Exception: " << ex.what() << std::endl; \
        return ov_status_e::StatusCode;                       \
    }
#define CATCH_OV_EXCEPTION(StatusCode, ExceptionType)         \
    catch (const ov::ExceptionType& ex) {                     \
        std::cout << "Exception: " << ex.what() << std::endl; \
        return ov_status_e::StatusCode;                       \
    }

#define CATCH_OV_EXCEPTIONS                                   \
    /* ... earlier CATCH_* entries unchanged ... */           \
    CATCH_IE_EXCEPTION(INFER_NOT_STARTED, InferNotStarted)    \
    CATCH_IE_EXCEPTION(NETWORK_NOT_READ, NetworkNotRead)      \
    CATCH_IE_EXCEPTION(INFER_CANCELLED, InferCancelled)       \
    catch (const std::exception& ex) {                        \
        std::cout << "Exception: " << ex.what() << std::endl; \
        return ov_status_e::UNKNOW_EXCEPTION;                 \
    }                                                         \
    catch (...) {                                             \
        /* keep a fallback so non-std exceptions still map to a status */ \
        return ov_status_e::UNKNOW_EXCEPTION;                 \
    }
riverlijunjie commented 1 year ago

@xyangk any update on this issue? If there is any problem, let's solve it!

avitial commented 1 year ago

Closing this. I hope the previous responses were sufficient to help you proceed. Feel free to reopen and provide additional information or ask any questions related to this topic.