openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

[Bug] Different inference result every time #12966

Closed Haiboku233 closed 11 months ago

Haiboku233 commented 1 year ago
System information (version)
Detailed description

Hello, I converted a PyTorch model to ONNX and it works fine. But when I load it with OpenVINO, the inference results are different every time for the same input. This may be caused by the scatter_add op of PyTorch, which is supported starting from ONNX opset 16.

Steps to reproduce

The PyTorch model includes a module like this:

    # trans holds sparse transform data: row/col indices and per-element weights
    row, col, value = trans[0].to(x.device), trans[1].to(x.device), trans[2].to(x.device)
    value = value.unsqueeze(-1).requires_grad_(False)
    # gather source features along `dim` and scale them by the weights
    out = torch.index_select(x, dim, col)
    out = out * value
    # scatter-add the scaled features into the (smaller) output tensor
    out2 = torch.zeros(x.size(0), torch.div(row.size(0), 3, rounding_mode='floor'), x.size(-1)).to(x.device)
    idx = row.unsqueeze(0).unsqueeze(-1).expand(-1, -1, out.size(-1))
    out2 = torch.scatter_add(out2, dim, idx, out)
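
For reference, a deterministic ground truth for this scatter_add can be computed in numpy (a minimal sketch, assuming dim == 1 as in the snippet above; np.add.at accumulates sequentially, so repeated indices are always summed in the same order):

import numpy as np

def scatter_add_dim1(base, index, src):
    """Reference for torch.scatter_add along dim=1:
    result[b, index[b, i, c], c] += src[b, i, c]."""
    result = base.copy()
    b, _, c = np.indices(src.shape)         # element coordinates of src
    np.add.at(result, (b, index, c), src)   # unbuffered, deterministic accumulation
    return result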

I tested by removing torch.scatter_add and outputting 'out' instead. If I give the same image input to the model, 'out' is the same every time. But with scatter_add, the output ('out2') changes every time.
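
A condensed sketch of this determinism check against the converted IR (the file names below are placeholders): feed one fixed random input repeatedly and compare the chosen output bit-for-bit.

import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model(model="model.xml", weights="model.bin")   # placeholder IR paths
compiled = core.compile_model(model=model, device_name="CPU")
verts_out = compiled.outputs[1]   # the second output ('verts3d' here)

x = np.random.rand(1, 3, 128, 128).astype(np.float32)
ref = compiled([x])[verts_out]
for i in range(10):
    cur = compiled([x])[verts_out]
    if not np.array_equal(ref, cur):
        print(f"run {i}: output differs, max abs diff = {np.abs(ref - cur).max():.3e}")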

When I converted the PyTorch model to ONNX, I got some warnings. These warnings are not caused by scatter_add:

/dfs/data/pytorch/torch/onnx/_patch_torch.py:70: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at /dfs/data/pytorch/torch/csrc/jit/passes/onnx/shape_type_inference.cpp:1880.)
  _C._jit_pass_onnx_node_shape_type_inference(
/dfs/data/pytorch/torch/onnx/utils.py:652: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at /dfs/data/pytorch/torch/csrc/jit/passes/onnx/shape_type_inference.cpp:1880.)
  _C._jit_pass_onnx_graph_shape_type_inference(
/dfs/data/pytorch/torch/onnx/utils.py:1106: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at /dfs/data/pytorch/torch/csrc/jit/passes/onnx/shape_type_inference.cpp:1880.)
  _C._jit_pass_onnx_graph_shape_type_inference(

Then I converted the ONNX file to OpenVINO IR (xml/bin/mapping), and the inference code is:

import cv2
import numpy as np
from openvino.runtime import Core

class HandMesh():
    def __init__(self, model_xml=None, model_bin=None, mano_path=None):
        self.model_xml = model_xml
        self.model_bin = model_bin
        self.ie = Core()
        self.model = self.ie.read_model(model=self.model_xml, weights=self.model_bin)
        self.compiled_model_ir = self.ie.compile_model(model=self.model, device_name="CPU")
        self.infer_request = self.compiled_model_ir.create_infer_request()
        self.input_layer_ir = next(iter(self.compiled_model_ir.inputs))
        self.output_layer_ir_1 = self.compiled_model_ir.outputs[0]
        self.output_layer_ir_2 = self.compiled_model_ir.outputs[1]

    def predict(self, image):
        assert image is not None
        # preprocessing: resize to 128x128, HWC -> CHW, scale to [-1, 1]
        image = cv2.resize(image, (128, 128))
        image = np.transpose(image, (2, 0, 1))
        image = image.reshape(1, 3, 128, 128)
        image = image.astype(np.float32)
        image = image / 255.0
        image = image - 0.5
        image = image / 0.5
        image = image.astype(np.float32)
        # note: each call below runs a separate inference (see the note after this block)
        joint2d = self.compiled_model_ir([image])[self.output_layer_ir_1]
        verts3d = self.compiled_model_ir([image])[self.output_layer_ir_2]
        return joint2d, verts3d

The output 'verts3d' varies every time. The scatter_add node shown in Netron: (screenshot attached)
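
As a side note on predict(): each compiled_model_ir([image]) call runs a separate inference, so joint2d and verts3d above come from two different runs. A single call returns all outputs at once; a minimal sketch of that pattern (same attributes as in the class above):

    def predict(self, image):
        # ... same preprocessing as above ...
        results = self.compiled_model_ir([image])       # one inference for all outputs
        joint2d = results[self.output_layer_ir_1]
        verts3d = results[self.output_layer_ir_2]
        return joint2d, verts3d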

zulkifli-halim commented 1 year ago

Hi @Haiboku233, could you share the following information with us for replication purposes:

Haiboku233 commented 1 year ago

Thanks for your reply. The PyTorch to ONNX code is:

import torch
from collections import OrderedDict
import os
import sys
import numpy as np
from torch import nn
import cv2
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
from mobrecon.build import build_model, build_dataset
from mobrecon.configs.config import get_cfg
from options.cfg_options import CFGOptions
from mobrecon.runner import Runner
import os.path as osp
from utils import utils
from utils.writer import Writer
import torch.backends.cudnn as cudnn
from torch.utils.data import DataLoader
from tensorboardX import SummaryWriter

def setup(args):
    """
    Create configs and perform basic setups.
    """
    cfg = get_cfg()
    cfg.merge_from_file(args.config_file)
    cfg.merge_from_list(args.opts)
    cfg.freeze()
    # default_setup(cfg, args)
    return cfg

def load_weights_from_directory(model, weight_path) -> int:
    if weight_path.endswith('.pth'):
        wp = weight_path
    else:
        wps = sorted(os.listdir(weight_path), key=lambda x: int(x.split('_')[0]))
        if wps:
            wp = wps[-1]
        else:
            return 0

    print(f"Loading weights from {wp}...")
    model.load_state_dict(torch.load(os.path.join(weight_path, wp)))
    return int(wp.split('/')[-1].split('_')[0])

args = CFGOptions().parse()

cfg = setup(args)

# device
args.rank = 0
args.world_size = 1
args.n_threads = 4
if -1 in cfg.TRAIN.GPU_ID or not torch.cuda.is_available():
    device = torch.device('cpu')
    print('CPU mode')
elif len(cfg.TRAIN.GPU_ID) == 1:
    device = torch.device('cuda', cfg.TRAIN.GPU_ID[0])
    print('CUDA ' + str(cfg.TRAIN.GPU_ID) + ' Used')
else:
    raise Exception('Do not support multi-GPU training')
cudnn.benchmark = True
cudnn.deterministic = False  #FIXME

# print config
if args.rank == 0:
    print(cfg)
    print(args.exp_name)
exec('from mobrecon.models.{} import {}'.format(cfg.MODEL.NAME.lower(), cfg.MODEL.NAME))
exec('from mobrecon.datasets.{} import {}'.format(cfg.TRAIN.DATASET.lower(), cfg.TRAIN.DATASET))
exec('from mobrecon.datasets.{} import {}'.format(cfg.VAL.DATASET.lower(), cfg.VAL.DATASET))

# model
model = build_model(cfg).to(device)

# optim
optimizer = torch.optim.Adam(model.parameters(), lr=cfg.TRAIN.LR, weight_decay=cfg.TRAIN.WEIGHT_DECAY)

# resume
model_path = '/dfs/data/code/HandMesh/mobrecon/out/MultipleDatasets/mrc_ds/checkpoint_last.pt'
checkpoint = torch.load(model_path, map_location=device)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch'] + 1
print('Resume from: {}, start epoch: {}'.format(model_path, epoch))

model.eval()

input_names = ["input_0"]
output_names = ["output_0","output_1"]

x=torch.randn((1,3,128,128)).to(device)

torch.onnx.export(model,(x),'model_16_aimaster_custom_grid_sample.onnx',opset_version=16,input_names=input_names,output_names=output_names )

I copied some lines from the original code and didn't check whether every line is necessary. The model I used is MobRecon, which is defined in https://github.com/SeanChenxy/HandMesh/tree/main/mobrecon. I trained it on the FreiHAND dataset.
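
Before running the Model Optimizer, a quick sanity check of the exported file and its opset can help (a minimal sketch; the filename matches the export call above):

import onnx

m = onnx.load("model_16_aimaster_custom_grid_sample.onnx")
onnx.checker.check_model(m)     # structural validation of the exported graph
print(m.opset_import)           # should report version 16 for the default domain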

I don't know what exactly you mean by 'Model Optimizer Command'; I just tried openvino-dev 2022.2.0.dev20220829 with mo_onnx.py to convert it into IR (xml/bin).

The inference code is

from __future__ import print_function
import cv2
import numpy as np
import time
import logging as log
#from openvino.inference_engine import IECore
from openvino.runtime import Core
import pickle
import glob
from kinematics import mano_to_mpii
from registration import registration
import json

def get_calib(K,scale):
    '''
    input:
        K: camera intrinsic matrix
        scale: scale factor (unused in this function)
    '''
    K = np.array(K)
    princpt = K[0:2, 2].astype(np.float32)
    focal = np.array( [K[0, 0], K[1, 1]], dtype=np.float32)
    bbox = [128//2-50, 128//2-50, 100, 100]
    center = [bbox[0]+bbox[2]*0.5, bbox[1]+bbox[3]*0.5]
    w, h = bbox[2], bbox[3]
    bbox = [center[0]-0.5 * max(w, h), center[1]-0.5 * max(w, h), max(w, h), max(w, h)]
    focal = focal * 128 / (bbox[2]*1.3)
    calib = np.eye(4)
    calib[0, 0] = focal[0]
    calib[1, 1] = focal[1]
    calib[:2, 2:3] = princpt[:, None]
    return calib

class HandMesh():
    def __init__(self, use_onnx=False,model_path=None,model_xml=None,model_bin=None,mano_path=None):
        assert model_path is not None
        self.model_path = model_path
        self.model_xml = model_xml
        self.model_bin = model_bin
        self.ie = Core()
        if use_onnx:
            self.model = self.ie.read_model(model=self.model_path)
        else:
            self.model = self.ie.read_model(model=self.model_xml, weights=self.model_bin)
        self.compiled_model_ir = self.ie.compile_model(model=self.model, device_name="CPU")
        self.infer_request = self.compiled_model_ir.create_infer_request()
        self.input_layer_ir = next(iter(self.compiled_model_ir.inputs))
        self.output_layer_ir_1 = self.compiled_model_ir.outputs[0]
        self.output_layer_ir_2 = self.compiled_model_ir.outputs[1]

        assert mano_path is not None
        with open(mano_path, 'rb') as f:
            mano = pickle.load(f, encoding='latin1')
        self.j_regressor = np.zeros([21, 778])
        self.j_regressor[:16] = mano['J_regressor'].toarray()
        for k, v in {16: 333, 17: 444, 18: 672, 19: 555, 20: 744}.items():
            self.j_regressor[k, v] = 1
        self.std = 0.20

    def predict(self, image):
        assert image is not None
        image = cv2.resize(image, (128, 128))
        image = np.transpose(image, (2, 0, 1))
        image = image.reshape(1, 3, 128, 128)
        image = image.astype(np.float32)
        image = image / 255.0
        image = image - 0.5
        image = image / 0.5
        image = image.astype(np.float32)
        joint2d = self.compiled_model_ir([image])[self.output_layer_ir_1]
        verts3d = self.compiled_model_ir([image])[self.output_layer_ir_2]
        return joint2d,verts3d

def eval_and_save(model,img_dir):
    assert model is not None
    assert img_dir is not None
    img_list = glob.glob(img_dir+"/*.jpg")
    xyz_pred_list, verts_pred_list = list(), list()

    K_file = '/workspace/datasets/HandPoseDatasets/FreiHAND_pub_v2/evaluation_K.json'
    with open (K_file, 'r') as f:
        K_list = json.load(f)
    s_file = '/workspace/datasets/HandPoseDatasets/FreiHAND_pub_v2/evaluation_scale.json'
    with open (s_file, 'r') as f:
        s_list = json.load(f)

    for i,img_path in enumerate(sorted(img_list)):
        img = cv2.imread(img_path)
        print(img_path)
        K = np.array(K_list[i])
        s =  s_list[i]
        calib = get_calib(K,s)
        assert img is not None
        j2d,v3d = model.predict(img)
        for z in range(10):
            j2d_,v3d_ = model.predict(img)
            if not (v3d==v3d_).all():
                print('*****************')
                print(v3d_)
        j2d = (j2d * 128)[0]
        v3d = v3d[0]

        v3d_pred, align_state = registration(v3d, j2d, model.j_regressor, calib, 128, poly=None)
        j2d_cam = mano_to_mpii(np.matmul(model.j_regressor, v3d_pred))
        return j2d_cam

if __name__ == '__main__':

    model = HandMesh(use_onnx=False,model_path = "",model_xml = "/workspace/HandMesh/model_16_aimaster_torch_grid_sample.xml",model_bin = "/workspace/HandMesh/model_16_aimaster_torch_grid_sample.bin",mano_path="/workspace/HandMesh/mano_v1_2/models/MANO_RIGHT.pkl")
    eval_and_save(model,"/workspace/HandMesh/data/FreiHAND_pub_v2/evaluation/rgb")
zulkifli-halim commented 1 year ago

Hi @Haiboku233,

I tried to retrain the MobRecon model, but the setup and training are time-consuming. It would be great if you could provide me with the model (.onnx and IR format).

Haiboku233 commented 1 year ago

Hi @zulkifli-halim
I sent my files to your e-mail zulkiflix.bin.abdul.halim@intel.com last Thursday from my address jw.elliot@foxmail.com and haven't gotten any reply. Please tell me if you received the files, or whether I should resend them or try another way.

zulkifli-halim commented 1 year ago

Hi @Haiboku233, I ran a test on your model using the Benchmark Python Tool and encountered this error:

[Step 2/11] Loading OpenVINO
[ WARNING ] PerformanceMode was not explicitly specified in command line. Device CPU performance hint will be set to THROUGHPUT.
[ INFO ] OpenVINO:
         API version............. 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ] Device info
         CPU
         openvino_intel_cpu_plugin version 2022.1
         Build................... 2022.1.0-7019-cdb9bec7210-releases/2022/1

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for CPU device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ ERROR ] Cannot create GridSample layer /decoder3d/GridSample id:752 from unsupported opset: opset9
Traceback (most recent call last):
  File "c:\users\zbinab5x\downloads\github\openvino_env\lib\site-packages\openvino\tools\benchmark\main.py", line 256, in run
    model = benchmark.read_model(args.path_to_model)
  File "c:\users\zbinab5x\downloads\github\openvino_env\lib\site-packages\openvino\tools\benchmark\benchmark.py", line 62, in read_model
    return self.core.read_model(model_filename, weights_filename)
RuntimeError: Cannot create GridSample layer /decoder3d/GridSample id:752 from unsupported opset: opset9

It seems an unsupported opset version was used. For your info, the latest opset version supported by this OpenVINO release (2022.1) is opset8.

Could you share your full inference code for us to test on our side?

Haiboku233 commented 1 year ago

Thanks a lot. It seems that in your test the grid_sample op from torch is unsupported. I didn't get the same error, but I also tried a custom grid sample op. By the way, the OpenVINO version I used is 2022.2, and I'm not sure whether that makes a difference. I'll send my files and code later this week. Thanks again.

zulkifli-halim commented 1 year ago

Hi @Haiboku233, the latest opset version supported by OpenVINO 2022.2 is opset9.

I ran a test on your model_16_torch_grid_sample model with benchmark_app, and the previous error is resolved when using the latest OpenVINO (v2022.2).

[Step 9/11] Creating infer requests and preparing input data
[ INFO ] Create 4 infer requests took 0.00 ms
[ WARNING ] No input files were given for input 'input_0'!. This input will be filled with random values!
[ INFO ] Fill input 'input_0' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests using 4 streams for CPU, inference only: True, limits: 60000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 32.19 ms
[Step 11/11] Dumping statistics report
Count:          5832 iterations
Duration:       60079.32 ms
Latency:
    Median:     40.62 ms
    AVG:        41.08 ms
    MIN:        23.76 ms
    MAX:        117.55 ms
Throughput: 97.07 FPS

I executed your inference script but no output is shown (screenshot attached).

Can you share the steps you used to get a display of the inference results?

Haiboku233 commented 1 year ago

Hi @zulkifli-halim, the scatter_add op I used is supported from ONNX opset 16; does that mean it is not supported by OpenVINO?

I checked the openvino_inference_test.py I shared and found that I forgot to change the file paths to the image and model files. This test code just reads .jpg images and runs inference on one image 10 times. Or you can try this one:

from __future__ import print_function
import cv2
import numpy as np
from openvino.runtime import Core
import glob

class HandMesh():
    def __init__(self, use_onnx=False,model_path=None,model_xml=None,model_bin=None):
        assert model_path is not None
        self.model_path = model_path
        self.model_xml = model_xml
        self.model_bin = model_bin
        self.ie = Core()
        if use_onnx:
            self.model = self.ie.read_model(model=self.model_path)
        else:
            self.model = self.ie.read_model(model=self.model_xml, weights=self.model_bin)
        self.compiled_model_ir = self.ie.compile_model(model=self.model, device_name="CPU")
        self.infer_request = self.compiled_model_ir.create_infer_request()
        self.input_layer_ir = next(iter(self.compiled_model_ir.inputs))
        self.output_layer_ir_1 = self.compiled_model_ir.outputs[0]
        self.output_layer_ir_2 = self.compiled_model_ir.outputs[1]

    def predict(self, image):
        assert image is not None
        image = cv2.resize(image, (128, 128))
        image = np.transpose(image, (2, 0, 1))
        image = image.reshape(1, 3, 128, 128)
        image = image.astype(np.float32)
        image = image / 255.0
        image = image - 0.5
        image = image / 0.5
        image = image.astype(np.float32)
        joint2d = self.compiled_model_ir([image])[self.output_layer_ir_1]
        verts3d = self.compiled_model_ir([image])[self.output_layer_ir_2]
        return joint2d,verts3d

def eval_and_save(model,img_dir):
    assert model is not None
    assert img_dir is not None
    img_list = glob.glob(img_dir+"/*.jpg")
    assert len(img_list) > 0
    for i,img_path in enumerate(sorted(img_list)):
        img = cv2.imread(img_path)
        assert img is not None
        j2d,v3d = model.predict(img)
        for z in range(10):
            j2d_,v3d_ = model.predict(img)
            if not (v3d==v3d_).all():
                print('*****************')
                print(v3d_.shape)
                print(v3d_)    

if __name__ == '__main__':

    model = HandMesh(use_onnx=False,model_path = "",model_xml = "./model_16_sim_custom_gridsample.xml",model_bin = "./model_16_sim_custom_gridsample.bin")
    eval_and_save(model,"path_to_jpg_img_dir")

If you set the right file paths and the output is the same every time, then it seems to be some other problem.

zulkifli-halim commented 1 year ago

Hi @Haiboku233, I ran a test on my side and these are the results:

Inference result on ONNX model:

(1, 778, 3)
[[[ 0.00056796  0.00086615 -0.00049026]
  [-0.00184045  0.00092172 -0.00035382]
  [-0.00281669  0.00185258 -0.00010592]
  ...
  [ 0.00068144 -0.00278673  0.00359513]
  [ 0.00200607 -0.00046002  0.00147544]
  [ 0.00053961 -0.00127195  0.00279348]]]
*****************
(1, 778, 3)
[[[ 0.00056796  0.00086615 -0.00049026]
  [-0.00184045  0.00092172 -0.00035382]
  [-0.00281669  0.00185258 -0.00010592]
  ...
  [ 0.00068144 -0.00278673  0.00359513]
  [ 0.00200607 -0.00046002  0.00147544]
  [ 0.00053961 -0.00127195  0.00279348]]]
*****************
(1, 778, 3)
[[[ 5.6796207e-04  8.6614920e-04 -4.9025775e-04]
  [-1.8404517e-03  9.2171814e-04 -3.5381911e-04]
  [-2.8166911e-03  1.8525772e-03 -1.0592118e-04]
  ...
  [ 7.5678725e-04 -2.5494955e-03  3.5739953e-03]
  [ 2.2375169e-03  2.8182170e-05  1.7327364e-03]
  [ 5.3960935e-04 -1.2719485e-03  2.7934834e-03]]]

Inference result on IR model:

(1, 778, 3)
[[[-2.5344116e-04  1.7710326e-03 -3.0294806e-04]
  [-1.3561994e-03  1.3739488e-03  3.3429824e-05]
  [-2.8502282e-03  2.5464855e-03  1.0203267e-03]
  ...
  [ 1.2690985e-03  3.9094877e-03  5.6340634e-03]
  [ 2.1015289e-03  2.3890054e-03  3.9047855e-03]
  [ 1.8179026e-03  2.3238617e-03  4.2560250e-03]]]
*****************
(1, 778, 3)
[[[-1.1728547e-03  6.9052726e-04 -1.2574936e-03]
  [-1.8297451e-03  9.5951516e-04 -5.7171099e-04]
  [-2.1471300e-03  1.9803189e-03  1.2230361e-04]
  ...
  [-4.4262136e-04  7.7918274e-05  3.1734083e-04]
  [-1.1621662e-03  2.2938068e-04  1.6079702e-03]
  [-1.3403315e-04  4.5344059e-06  3.9588934e-04]]]
*****************
(1, 778, 3)
[[[-4.4556631e-04  1.1699977e-03 -4.3296756e-04]
  [-1.3738963e-03  7.2025601e-04 -7.1002985e-05]
  [-2.0341345e-03  2.2115421e-03  2.7357216e-04]
  ...
  [ 1.6978555e-03  4.0211072e-03  5.9065642e-03]
  [ 1.8161191e-03  2.2866412e-03  4.0267226e-03]
  [ 1.7080385e-03  2.5018803e-03  4.1645262e-03]]]

As we can see, the IR inference outputs are different each time. We are checking this out and will get back to you soon.

hbalasu1 commented 1 year ago

Hi @Haiboku233

I converted your ONNX model to an IR file with the following command: mo --input_model model_16_sim_custom_gridsample.onnx --input_shape [1,3,128,128] --scale 255 --reverse_input_channels --data_type FP16 --use_new_frontend

When I test both the ONNX and the IR file repeatedly, both give roughly the expected results, except that repeating the test with the same model and the same single image produces different values each time. Is this something expected on your end?

Below are the results:

**>python inference_onnx.py**
*****************
(1, 778, 3)
[[[ 0.00117834  0.00011092 -0.001201  ]
  [ 0.00047875 -0.00114322 -0.00158392]
  [ 0.00035195 -0.00095018 -0.00190924]
  ...
  [-0.00096638 -0.00448504  0.00205565]
  [ 0.00035037 -0.00169349  0.00107424]
  [-0.0002833  -0.00253445  0.00129564]]]
*****************
(1, 778, 3)
[[[ 0.00117834  0.00011092 -0.001201  ]
  [ 0.00047875 -0.00114322 -0.00158392]
  [ 0.00035195 -0.00095018 -0.00190924]
  ...
  [-0.00096181 -0.00460397  0.00208835]
  [ 0.00033991 -0.00200991  0.000898  ]
  [-0.0002833  -0.00253445  0.00129564]]]
*****************
(1, 778, 3)
[[[ 1.4940881e-04 -4.2245848e-04 -1.3887269e-03]
  [ 2.7033631e-04 -7.8861037e-04 -1.4924422e-03]
  [ 8.0322317e-04 -4.7992836e-04 -1.3703353e-03]
  ...
  [ 2.1480573e-05 -8.0045714e-04 -1.6870211e-04]
  [-2.2932159e-04 -9.5473073e-04 -4.4732730e-04]
  [ 4.4911072e-04 -9.3884382e-04  5.7122408e-04]]]
*****************
(1, 778, 3)
[[[ 0.00117834  0.00011092 -0.001201  ]
  [ 0.00047875 -0.00114322 -0.00158392]
  [ 0.00035195 -0.00095018 -0.00190924]
  ...
  [-0.00096181 -0.00460397  0.00208835]
  [ 0.00033991 -0.00200991  0.000898  ]
  [-0.0002833  -0.00253445  0.00129564]]]
*****************
(1, 778, 3)
[[[ 0.00126598  0.00029918 -0.00187551]
  [ 0.0006474  -0.00054866 -0.0025639 ]
  [ 0.00072418 -0.00060448 -0.00289274]
  ...
  [-0.00071078 -0.00454852 -0.00078819]
  [ 0.00103178 -0.00209885 -0.0010004 ]
  [-0.00021117 -0.00274698 -0.00063311]]]
*****************
(1, 778, 3)
[[[ 0.00117834  0.00011092 -0.001201  ]
  [ 0.00047875 -0.00114322 -0.00158392]
  [ 0.00035195 -0.00095018 -0.00190924]
  ...
  [-0.00096181 -0.00460397  0.00208835]
  [ 0.00033991 -0.00200991  0.000898  ]
  [-0.0002833  -0.00253445  0.00129564]]]
*****************
(1, 778, 3)
[[[ 0.00117834  0.00011092 -0.001201  ]
  [ 0.00047875 -0.00114322 -0.00158392]
  [ 0.00035195 -0.00095018 -0.00190924]
  ...
  [-0.00097791 -0.00455515  0.00207817]
  [ 0.00031192 -0.00188786  0.00102242]
  [-0.0002833  -0.00253445  0.00129564]]]
*****************
(1, 778, 3)
[[[ 0.00117834  0.00011092 -0.001201  ]
  [ 0.00047875 -0.00114322 -0.00158392]
  [ 0.00035195 -0.00095018 -0.00190924]
  ...
  [-0.00096826 -0.00448382  0.00205822]
  [ 0.00034969 -0.00169181  0.00107673]
  [-0.00028417 -0.00253444  0.0012961 ]]]
*****************
(1, 778, 3)
[[[ 0.00139788  0.00029908 -0.00097753]
  [ 0.00061316 -0.00098916 -0.00102203]
  [ 0.00067279 -0.00130768 -0.00163862]
  ...
  [-0.00116835 -0.00387665  0.00192199]
  [ 0.00021227 -0.00147009  0.00099383]
  [-0.00047681 -0.00179707  0.00107562]]]
*****************
(1, 778, 3)
[[[ 0.00117834  0.00011092 -0.001201  ]
  [ 0.00047875 -0.00114322 -0.00158392]
  [ 0.00035195 -0.00095018 -0.00190924]
  ...
  [-0.0009718  -0.00450003  0.00205283]
  [ 0.00033339 -0.00170837  0.00109009]
  [-0.0002833  -0.00253445  0.00129564]]]

**>python inference_xml.py**
*****************
(1, 778, 3)
[[[ 2.7662446e-04  5.7261321e-04  7.5354497e-04]
  [ 4.1505195e-05  5.9455005e-04  1.3179055e-03]
  [ 1.1650788e-04  1.3296932e-03  2.3534812e-03]
  ...
  [-1.6604476e-04 -5.3719763e-04  1.6929815e-03]
  [-9.4967126e-04 -3.1675937e-04  2.1235079e-03]
  [-1.1679401e-05 -2.7240766e-04  8.1743562e-04]]]
*****************
(1, 778, 3)
[[[ 7.2146089e-05  2.7044659e-04  2.4623584e-04]
  [ 2.1964441e-04  1.1672803e-04  5.8634352e-04]
  [ 1.1445570e-05  1.2613133e-03  1.9134757e-03]
  ...
  [ 4.1263952e-04 -7.0542644e-04  1.7466511e-03]
  [ 7.8088586e-04  3.9543072e-04  1.2834583e-03]
  [ 2.9366399e-04 -5.8332348e-04  5.6395523e-04]]]
*****************
(1, 778, 3)
[[[-9.4955343e-05  1.4695518e-04  3.3534772e-04]
  [-6.8157329e-04 -1.0950181e-04  9.9923485e-04]
  [-3.1467935e-04  4.3296831e-04  1.2473905e-03]
  ...
  [ 9.0491743e-04  1.2492849e-03  7.2114165e-03]
  [ 5.2616984e-04  1.2193528e-03  5.7539679e-03]
  [ 8.2659291e-04  1.2413110e-03  5.3215730e-03]]]
*****************
(1, 778, 3)
[[[ 3.0101649e-04  2.5016503e-04  3.0054315e-04]
  [ 4.1631295e-04  9.9016397e-05  7.2899851e-04]
  [ 7.4297182e-05  1.2051177e-03  1.8472320e-03]
  ...
  [ 1.5887951e-04 -3.6324613e-04  8.5485232e-04]
  [-8.6414971e-04  4.2256821e-04  1.8575588e-03]
  [ 6.4432003e-05 -4.3413925e-04  7.5682532e-04]]]
*****************
(1, 778, 3)
[[[ 1.7370455e-04 -1.5292111e-04 -8.3383311e-05]
  [ 2.5746488e-04 -3.2176092e-04  2.7347618e-04]
  [-6.3048309e-04  2.0727137e-04  4.1535663e-04]
  ...
  [ 5.0259574e-04 -5.7715079e-04  6.9113239e-04]
  [-7.2669028e-04 -2.0715715e-05  1.6282055e-03]
  [ 3.0390004e-04 -4.7769950e-04  5.4282858e-04]]]
*****************
(1, 778, 3)
[[[-6.7343804e-05  2.4919247e-04 -4.5944084e-04]
  [-4.0032073e-06 -1.3321427e-04  3.0498530e-04]
  [-6.2254003e-05  6.2119163e-04  1.4126803e-03]
  ...
  [ 1.4634181e-03  1.6173349e-03  3.6573454e-03]
  [ 1.1743099e-03  1.6406793e-03  2.4440070e-03]
  [ 2.3556813e-03  1.0372078e-03  2.5743518e-03]]]
*****************
(1, 778, 3)
[[[ 1.9258210e-04  3.1604162e-05  9.4349743e-05]
  [ 8.4798638e-05 -1.9009564e-04  5.1905366e-04]
  [ 6.3295382e-05  2.0201365e-04  1.2552148e-03]
  ...
  [ 8.8003471e-05  6.2389779e-05  6.6724600e-04]
  [-8.5091131e-04  7.3247321e-04  1.3806368e-03]
  [-1.4349338e-04  9.7831893e-05  1.2720609e-04]]]
*****************
(1, 778, 3)
[[[ 3.4303556e-04  6.0284312e-04 -1.5424415e-04]
  [-7.6941185e-05  4.7793183e-05  4.2824759e-04]
  [-6.0634740e-04  6.5495138e-04  1.0131660e-03]
  ...
  [ 3.9383955e-04 -5.4320385e-04  6.8783405e-04]
  [-8.5087150e-04  2.7421964e-04  1.6888909e-03]
  [ 1.9851731e-04 -5.5722112e-04  6.2757841e-04]]]
*****************
(1, 778, 3)
[[[ 7.8857433e-05 -4.6606202e-04  2.9818722e-04]
  [-7.7330123e-04 -4.5505504e-04  9.6907379e-04]
  [-3.9085709e-05  1.1921298e-03  1.6166430e-03]
  ...
  [ 3.6988207e-04  6.1647868e-04  5.3588641e-03]
  [ 2.1067134e-04  3.7457555e-04  4.9227029e-03]
  [-5.8447313e-04  6.5027905e-04  3.8211604e-03]]]
*****************
(1, 778, 3)
[[[ 7.7220314e-04  3.4331737e-04  5.8286922e-04]
  [ 4.5966645e-04  2.2011764e-05  6.0962158e-04]
  [-5.1400723e-04  1.1725683e-03  1.8486324e-03]
  ...
  [ 4.2624885e-04 -2.7036096e-04  3.5484619e-03]
  [-3.1115260e-04  2.0864731e-04  3.4628119e-03]
  [ 5.8124273e-04  1.1216276e-04  2.5108203e-03]]]
Haiboku233 commented 1 year ago

Hi @hbalasu1, thanks for your time. No, the predictions are expected to be the same with the same model and the same single image. (1) I tested the .onnx file with onnxruntime==1.12.1 and it works fine (same model, same input, same predictions every time); you can check this easily. (2) If the scatter_add op and the subsequent layers are removed, the output is correct and stays the same in a loop with the same model and input image (the same .onnx file tested with both onnxruntime and OpenVINO). So I am confused.
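
For reference, the onnxruntime check described in (1) can be reproduced with a few lines (a sketch; the filename matches the model discussed above, and 'input_0' is the input name from the export script):

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model_16_sim_custom_gridsample.onnx",
                            providers=["CPUExecutionProvider"])
x = np.random.rand(1, 3, 128, 128).astype(np.float32)
ref = sess.run(None, {"input_0": x})
for i in range(10):
    cur = sess.run(None, {"input_0": x})
    identical = all(np.array_equal(r, c) for r, c in zip(ref, cur))
    print(f"run {i}: outputs identical = {identical}")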

akladiev commented 1 year ago

This issue will be closed in 2 weeks in case of no activity.

avitial commented 11 months ago

Closing this as it seems no further action is needed. Feel free to reopen to ask any questions related to this topic.