openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

When I run BERT inference, the following error occurs #1246

Closed YongtaoHuang1994 closed 4 years ago

YongtaoHuang1994 commented 4 years ago
root@lanjunc:/home/lanjunc/hyongtao# python run-openvino-mrpc.py 
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.7 of module 'openvino.inference_engine.ie_api' does not match runtime version 3.6
  return f(*args, **kwds)
/home/lanjunc/hyongtao/bert_mrpc/inference_mrpc.xml
/home/lanjunc/hyongtao/bert_mrpc/inference_mrpc.bin
Loading network files:
    /home/lanjunc/hyongtao/bert_mrpc/inference_mrpc.xml
    /home/lanjunc/hyongtao/bert_mrpc/inference_mrpc.bin
run-openvino-mrpc.py:25: DeprecationWarning: Reading network using constructor is deprecated. Please, use IECore.read_network() method instead
  net = IENetwork(model=model_xml, weights=model_bin)
Traceback (most recent call last):
  File "run-openvino-mrpc.py", line 82, in <module>
    res1=test_openvino(length)
  File "run-openvino-mrpc.py", line 60, in test_openvino
    model = VinoModel()
  File "run-openvino-mrpc.py", line 11, in __init__
    self._load_model()
  File "run-openvino-mrpc.py", line 25, in _load_model
    net = IENetwork(model=model_xml, weights=model_bin)
  File "ie_api.pyx", line 1099, in openvino.inference_engine.ie_api.IENetwork.__cinit__
RuntimeError: Check 'indices_et.is_dynamic() || indices_et.is_integral()' failed at /home/jenkins/agent/workspace/private-ci/ie/build-linux-ubuntu16/b/repos/closed-dldt/ngraph/src/ngraph/op/one_hot.cpp:125:
While validating node 'v1::OneHot OneHot_30(Reshape_26[0]:f32{128}, Constant_27[0]:i64{}, Constant_28[0]:f32{}, Constant_29[0]:f32{}) -> (dynamic?)':
Indices must be integral element type.

Segmentation fault (core dumped)
myshevts commented 4 years ago

@jane-intel or @lazarevevgeny, is this somehow related to the graph freezing that has been documented here: https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_BERT_From_Tensorflow.html#convert_tensorflow_bert_model_to_ir

YongtaoHuang1994 commented 4 years ago

@jane-intel or @lazarevevgeny, is this somehow related to the graph freezing that has been documented here: https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_BERT_From_Tensorflow.html#convert_tensorflow_bert_model_to_ir

Thank you. I have read the documentation. The conversion process is OK; the error happens during inference.

lazarevevgeny commented 4 years ago

@YongtaoHuang1994, did you use the latest version of the Model Optimizer to generate the model? Also, did you specify the conversion string as described in the documentation? Please pay attention to the data type (i32) specification in the command-line parameters: --input Placeholder{i32},Placeholder_1{i32},Placeholder_2{i32}

YongtaoHuang1994 commented 4 years ago

@YongtaoHuang1994, did you use the latest version of the Model Optimizer to generate the model? Also, did you specify the conversion string as described in the documentation? Please pay attention to the data type (i32) specification in the command-line parameters: --input Placeholder{i32},Placeholder_1{i32},Placeholder_2{i32}

Yes, I did it according to the documentation. The IR generation process is OK.

Model Optimizer arguments:
Common parameters:
    - Path to the Input Model:  /opt/intel/openvino_2020.3.194/deployment_tools/model_optimizer/inference_mrpc.pb
    - Path for generated IR:    /opt/intel/openvino_2020.3.194/deployment_tools/model_optimizer/.
    - IR output name:   inference_mrpc
    - Log level:    ERROR
    - Batch:    Not specified, inherited from the model
    - Input layers:     IteratorGetNext:0[1 128],IteratorGetNext:1[1 128],IteratorGetNext:4[1 128]
    - Output layers:    Not specified, inherited from the model
    - Input shapes:     Not specified, inherited from the model
    - Mean values:  Not specified
    - Scale values:     Not specified
    - Scale factor:     Not specified
    - Precision of IR:  FP32
    - Enable fusing:    True
    - Enable grouped convolutions fusing:   True
    - Move mean values to preprocess section:   False
    - Reverse input channels:   False
TensorFlow specific parameters:
    - Input model in text protobuf format:  False
    - Path to model dump for TensorBoard:   None
    - List of shared libraries with TensorFlow custom layers implementation:    None
    - Update the configuration file with input/output node names:   None
    - Use configuration file used to generate the model with Object Detection API:  None
    - Use the config file:  None
Model Optimizer version:    

[ SUCCESS ] Generated IR version 10 model.
[ SUCCESS ] XML file: /opt/intel/openvino_2020.3.194/deployment_tools/model_optimizer/./inference_mrpc.xml
[ SUCCESS ] BIN file: /opt/intel/openvino_2020.3.194/deployment_tools/model_optimizer/./inference_mrpc.bin
[ SUCCESS ] Total execution time: 59.39 seconds. 
[ SUCCESS ] Memory consumed: 2304 MB. 

But the inference process raises an error.

root@lanjunc:/home/lanjunc/hyongtao# python run-openvino-mrpc.py 
/usr/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.7 of module 'openvino.inference_engine.ie_api' does not match runtime version 3.6
  return f(*args, **kwds)
/home/lanjunc/hyongtao/bert_mrpc/inference_mrpc.xml
/home/lanjunc/hyongtao/bert_mrpc/inference_mrpc.bin
Loading network files:
    /home/lanjunc/hyongtao/bert_mrpc/inference_mrpc.xml
    /home/lanjunc/hyongtao/bert_mrpc/inference_mrpc.bin
run-openvino-mrpc.py:25: DeprecationWarning: Reading network using constructor is deprecated. Please, use IECore.read_network() method instead
  net = IENetwork(model=model_xml, weights=model_bin)
Traceback (most recent call last):
  File "run-openvino-mrpc.py", line 82, in <module>
    res1=test_openvino(length)
  File "run-openvino-mrpc.py", line 60, in test_openvino
    model = VinoModel()
  File "run-openvino-mrpc.py", line 11, in __init__
    self._load_model()
  File "run-openvino-mrpc.py", line 25, in _load_model
    net = IENetwork(model=model_xml, weights=model_bin)
  File "ie_api.pyx", line 1099, in openvino.inference_engine.ie_api.IENetwork.__cinit__
RuntimeError: Check 'indices_et.is_dynamic() || indices_et.is_integral()' failed at /home/jenkins/agent/workspace/private-ci/ie/build-linux-ubuntu16/b/repos/closed-dldt/ngraph/src/ngraph/op/one_hot.cpp:125:
While validating node 'v1::OneHot OneHot_30(Reshape_26[0]:f32{128}, Constant_27[0]:i64{}, Constant_28[0]:f32{}, Constant_29[0]:f32{}) -> (dynamic?)':
Indices must be integral element type.

Segmentation fault (core dumped)

The code of "run-openvino-mrpc.py" is as follows:

import os
from openvino.inference_engine import IENetwork, IECore
import sys
import numpy as np
import time

CUR_DIR = os.path.dirname(os.path.abspath(__file__))

class VinoModel:
    def __init__(self):
        self._load_model()

    def _load_model(self):
        model_dir = os.path.join(CUR_DIR, 'bert_model')
        model_xml = os.path.join(model_dir,'inference_graph.xml')
        print(model_xml)
        model_bin = os.path.join(model_dir,'inference_graph.bin')
        print(model_bin)

        device = "CPU"

        ie = IECore()

        print("Loading network files:\n\t{}\n\t{}".format(model_xml, model_bin))
        net = IENetwork(model=model_xml, weights=model_bin)
        #net = ie.read_network(model=model_xml, weights=model_bin)

        supported_layers = ie.query_network(net, device)
        #print("1==============================================")
        #print("supported_layers")
        #print(supported_layers)
        #print("2==============================================")
        not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
        #print("not_supported_layers")
        #print(not_supported_layers)
        #print("3==============================================")
        if len(not_supported_layers) != 0:
            print("Following layers are not supported by the plugin for specified device {}:\n {}".
                        format('cpu', ', '.join(not_supported_layers)))
            print("Please try to specify cpu extensions library path in sample's command line parameters using -l "
                        "or --cpu_extension command line argument")
            sys.exit(1)

        print('input_layers:',net.inputs.keys())
        print('output_layers:',net.outputs)

        print("Loading model to the plugin")
        self.exec_net = ie.load_network(network=net, device_name=device)

    def encode(self, input_ids, input_masks, segment_ids):
        print("Starting inference")
        inputs = {
            'IteratorGetNext/placeholder_out_port_0' : input_ids,
            'IteratorGetNext/placeholder_out_port_1' : input_masks,
            'IteratorGetNext/placeholder_out_port_4' : segment_ids
        }
        return self.exec_net.infer(inputs=inputs)

def test_openvino(length=128):
    model = VinoModel()
    # Use int32 blobs to match the model's i32 inputs (np.int is deprecated in NumPy)
    input_ids = np.ones([1, length], dtype=np.int32)
    input_masks = np.ones([1, length], dtype=np.int32)
    segment_ids = np.zeros([1, length], dtype=np.int32)

    model.encode(input_ids, input_masks, segment_ids)

    total_time = []
    for i in range(3):
        t1 = time.time()
        input_ids = np.random.randint(0, 20000, [1, length], dtype=np.int32)
        input_masks = np.ones([1, length], dtype=np.int32)
        res = model.encode(input_ids, input_masks, segment_ids)
        # print(res)
        cost_time = time.time()-t1
        total_time.append(cost_time)
        print("{} : {}".format(i,cost_time))
    print("avg time:{}".format(sum(total_time)/len(total_time)))
    return sum(total_time)/len(total_time)

if __name__ == "__main__":
    length = 128
    res1 = test_openvino(length)
    print(res1)
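
As the DeprecationWarning in the log above suggests, the IR can also be read through IECore.read_network() instead of the IENetwork constructor. This does not affect the OneHot error, but it is the non-deprecated loading path. A minimal sketch, using the paths from this thread:

from openvino.inference_engine import IECore

ie = IECore()
# read_network() replaces the deprecated IENetwork(model=..., weights=...) constructor
net = ie.read_network(model="bert_mrpc/inference_mrpc.xml", weights="bert_mrpc/inference_mrpc.bin")
exec_net = ie.load_network(network=net, device_name="CPU")
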
lazarevevgeny commented 4 years ago

You are using the 2020.3 release of OpenVINO. Could you take the sources from GitHub, compile them, and try again? If that doesn't work, could you share the model that you are trying to convert?

YongtaoHuang1994 commented 4 years ago

You are using the 2020.3 release of OpenVINO. Could you take the sources from GitHub, compile them, and try again? If that doesn't work, could you share the model that you are trying to convert?

Thanks. Please download my pb model here: https://drive.google.com/file/d/1kFPGi4tgV9JQI72df0aQjww0bpBbX-Mi/view?usp=sharing. Creating the inference_mrpc.pb file is based on https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_DeepSpeech_From_Tensorflow.html. If necessary, my training code and ckpt model are also shared with you: https://drive.google.com/drive/folders/1AJWGwDtzNL0hkdiUq98tQunJPLbNmMz7?usp=sharing. It is a model for GLUE MRPC. The OpenVINO inference code is in run-openvino-mrpc.py: https://drive.google.com/file/d/1TPWrACGaAE9bLpTL2I1QWYZWM80bgFXJ/view?usp=sharing. Thanks a lot.

lazarevevgeny commented 4 years ago

@YongtaoHuang1994, please take OpenVINO from the releases/2020/4 branch on GitHub and compile it. I checked that conversion and inference work fine with that version.

YongtaoHuang1994 commented 4 years ago

@YongtaoHuang1994, please take OpenVINO from the releases/2020/4 branch on GitHub and compile it. I checked that conversion and inference work fine with that version.

Thank you. But when I installed OpenVINO releases/2020/4, another error occurred.

root@lanjunc:/home/lanjunc/hyongtao# source /home/lanjunc/hyongtao/openvino/scripts/setupvars/setupvars.sh
[setupvars.sh] OpenVINO environment initialized
root@lanjunc:/home/lanjunc/hyongtao# python run-openvino-mrpc.py
Traceback (most recent call last):
  File "run-openvino-mrpc.py", line 2, in <module>
    from openvino.inference_engine import IENetwork, IECore
ModuleNotFoundError: No module named 'openvino.inference_engine'
root@lanjunc:/home/lanjunc/hyongtao# 

Please help me, sir.
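
For reference, a source build of OpenVINO does not install the Python bindings into site-packages; the interpreter has to find them through PYTHONPATH or sys.path. A hedged sketch, assuming the build placed the Python API under bin/intel64/Release/lib/python_api (the exact path is an assumption and depends on the build configuration and Python version):

import sys

# Hypothetical path: adjust to wherever your source build placed the Python API
sys.path.append("/home/lanjunc/hyongtao/openvino/bin/intel64/Release/lib/python_api/python3.6")
from openvino.inference_engine import IENetwork, IECore
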

YongtaoHuang1994 commented 4 years ago

In fact, when I use OpenVINO release 2020.3, your official BERT demo infers successfully. But my own trained model (MRPC) only gets as far as IR generation.

ilyachur commented 4 years ago

@akuporos Could you give some advice about the Python API error?

akuporos commented 4 years ago

Hi @YongtaoHuang1994,

Can you give more information about how you built OpenVINO? Let's start with the cmake string.

Thanks, Anastasia

YongtaoHuang1994 commented 4 years ago

Thank you all. I have built OpenVINO 2020.4 successfully. I think the runtime is OK now.

The inference code for the pretrained BERT model runs normally. The process is based on https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_tf_specific_Convert_DeepSpeech_From_Tensorflow.html and it works well.

root@bdc9e3efb496:/workspace/openvino/inferencefile/openvino/public/bert# python3 run-openvino.py 
/workspace/openvino/inferencefile/openvino/public/bert/bert_model/inference_graph.xml
/workspace/openvino/inferencefile/openvino/public/bert/bert_model/inference_graph.bin
Loading network files:
    /workspace/openvino/inferencefile/openvino/public/bert/bert_model/inference_graph.xml
    /workspace/openvino/inferencefile/openvino/public/bert/bert_model/inference_graph.bin
run-openvino.py:44: DeprecationWarning: 'inputs' property of IENetwork class is deprecated. To access DataPtrs user need to use 'input_data' property of InputInfoPtr objects which can be accessed by 'input_info' property.
  print('input_layers:',net.inputs.keys())
input_layers: dict_keys(['IteratorGetNext/placeholder_out_port_0', 'IteratorGetNext/placeholder_out_port_1', 'IteratorGetNext/placeholder_out_port_4'])
output_layers: {'bert/pooler/dense/Tanh': <openvino.inference_engine.ie_api.DataPtr object at 0x7f916b660fd0>, 'bert/pooler/strided_slice/Split.1': <openvino.inference_engine.ie_api.DataPtr object at 0x7f916b660fb0>}
Loading model to the plugin
Starting inference
Starting inference
0 : 0.1441342830657959
Starting inference
1 : 0.14930963516235352
Starting inference
2 : 0.15104913711547852
avg time:0.14816435178120932
0.14816435178120932

But when I run inference on my own BERT model, the error occurs.

root@bdc9e3efb496:/workspace/openvino/inferencefile/openvino/public/bert# python3 run-openvino-mrpc.py 
/workspace/openvino/inferencefile/openvino/public/bert/bert_mrpc/inference_mrpc.xml
/workspace/openvino/inferencefile/openvino/public/bert/bert_mrpc/inference_mrpc.bin
Loading network files:
    /workspace/openvino/inferencefile/openvino/public/bert/bert_mrpc/inference_mrpc.xml
    /workspace/openvino/inferencefile/openvino/public/bert/bert_mrpc/inference_mrpc.bin
Traceback (most recent call last):
  File "run-openvino-mrpc.py", line 82, in <module>
    res1=test_openvino(length)
  File "run-openvino-mrpc.py", line 60, in test_openvino
    model = VinoModel()
  File "run-openvino-mrpc.py", line 11, in __init__
    self._load_model()
  File "run-openvino-mrpc.py", line 26, in _load_model
    net = ie.read_network(model=model_xml, weights=model_bin)
  File "ie_api.pyx", line 261, in openvino.inference_engine.ie_api.IECore.read_network
  File "ie_api.pyx", line 293, in openvino.inference_engine.ie_api.IECore.read_network
RuntimeError: Check 'indices_et.is_dynamic() || indices_et.is_integral()' failed at /home/jenkins/agent/workspace/private-ci/ie/build-linux-ubuntu16/b/repos/openvino/ngraph/src/ngraph/op/one_hot.cpp:125:
While validating node 'v1::OneHot OneHot_29(Reshape_25[0]:f32{128}, Constant_26[0]:i64{}, Constant_27[0]:f32{}, Constant_28[0]:f32{}) -> (dynamic?)':
Indices must be integral element type.

Could you help me fix the bug? Thanks a lot.

YongtaoHuang1994 commented 4 years ago

The python code "run-openvino-mrpc.py" is as follows:

import os
from openvino.inference_engine import IENetwork, IECore
import sys
import numpy as np
import time

CUR_DIR = os.path.dirname(os.path.abspath(__file__))

class VinoModel:
    def __init__(self):
        self._load_model()

    def _load_model(self):
        model_dir = os.path.join(CUR_DIR, 'bert_mrpc')
        model_xml = os.path.join(model_dir,'inference_mrpc.xml')
        print(model_xml)
        model_bin = os.path.join(model_dir,'inference_mrpc.bin')
        print(model_bin)

        device = "CPU"

        ie = IECore()

        print("Loading network files:\n\t{}\n\t{}".format(model_xml, model_bin))
        #net = IENetwork(model=model_xml, weights=model_bin)
        net = ie.read_network(model=model_xml, weights=model_bin)

        supported_layers = ie.query_network(net, device)
        #print("1==============================================")
        #print("supported_layers")
        #print(supported_layers)
        #print("2==============================================")
        not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
        #print("not_supported_layers")
        #print(not_supported_layers)
        #print("3==============================================")
        if len(not_supported_layers) != 0:
            print("Following layers are not supported by the plugin for specified device {}:\n {}".
                        format('cpu', ', '.join(not_supported_layers)))
            print("Please try to specify cpu extensions library path in sample's command line parameters using -l "
                        "or --cpu_extension command line argument")
            sys.exit(1)

        print('input_layers:',net.inputs.keys())
        print('output_layers:',net.outputs)

        print("Loading model to the plugin")
        self.exec_net = ie.load_network(network=net, device_name=device)

    def encode(self, input_ids, input_masks, segment_ids):
        print("Starting inference")
        inputs = {
            'IteratorGetNext/placeholder_out_port_0' : input_ids,
            'IteratorGetNext/placeholder_out_port_1' : input_masks,
            'IteratorGetNext/placeholder_out_port_2' : segment_ids
        }
        return self.exec_net.infer(inputs=inputs)

def test_openvino(length=128):
    model = VinoModel()
    # Use int32 blobs to match the model's i32 inputs (np.int is deprecated in NumPy)
    input_ids = np.ones([1, length], dtype=np.int32)
    input_masks = np.ones([1, length], dtype=np.int32)
    segment_ids = np.zeros([1, length], dtype=np.int32)

    model.encode(input_ids, input_masks, segment_ids)

    total_time = []
    for i in range(3):
        t1 = time.time()
        input_ids = np.random.randint(0, 20000, [1, length], dtype=np.int32)
        input_masks = np.ones([1, length], dtype=np.int32)
        res = model.encode(input_ids, input_masks, segment_ids)
        # print(res)
        cost_time = time.time()-t1
        total_time.append(cost_time)
        print("{} : {}".format(i,cost_time))
    print("avg time:{}".format(sum(total_time)/len(total_time)))
    return sum(total_time)/len(total_time)

if __name__ == "__main__":
    length = 128
    res1 = test_openvino(length)
    print(res1)
YongtaoHuang1994 commented 4 years ago

Thanks. I have solved this problem. When converting my model to IR format, the command should look like this:

python ./mo_tf.py --input_model inference_mrpc.pb --disable_nhwc_to_nchw --input IteratorGetNext:0{i32},IteratorGetNext:1{i32},IteratorGetNext:4{i32} --input_shape [1,128],[1,128],[1,128]

Previously, I omitted "{i32}".
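
With the {i32} hints in place, the placeholders are converted as integer inputs, so the OneHot indices validation passes. A quick way to double-check the regenerated IR, using the input_info property that the deprecation warning above points to, is a small sketch like this:

import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="inference_mrpc.xml", weights="inference_mrpc.bin")

# Every input should now report I32 instead of FP32
for name, info in net.input_info.items():
    print(name, info.precision)

# Feed int32 blobs so the data matches the declared input precision
input_ids = np.ones([1, 128], dtype=np.int32)
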

lazarevevgeny commented 4 years ago

@YongtaoHuang1994, great to hear that you have solved the issue! I am closing the ticket.