apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0
20.78k stars, 6.79k forks

MKL-DNN QuantizedFullyConnectedOp Error #14467

Closed Soonhwan-Kwon closed 5 years ago

Soonhwan-Kwon commented 5 years ago

Description

When using FusedRNNCell with the MKLDNN backend (graph optimization and quantization, experimental), quantization fails with the QuantizedFullyConnectedOp error below:

MXNetError: Error in operator quantized_fusedrnn_t134_i2h: [11:40:16] src/operator/quantization/quantized_fully_connected.cc:41: Check failed: !shape_is_none(in_shape->at(0)) QuantizedFullyConnectedOp input data shape must be given

Below is pseudocode for the network architecture:

stack = mx.rnn.FusedRNNCell(1760, num_layers=num_layers,
                            mode=fused_rnn_mode, prefix='',
                            bidirectional=bidirectional).unfuse()
net, _ = stack.unroll(length=seq_lengths_references[-1],
                      inputs=net,
                      merge_outputs=False,
                      layout='TNC')

Quantization code:

net = net.get_backend_symbol('MKLDNN')

qnet, qarg_params, qaux_params = quantize_model(sym=net, arg_params={}, aux_params={}, ctx=mx.cpu(0), calib_mode='none', quantized_dtype='int8')

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x3e95ea) [0x7faf97f2b5ea]
[bt] (1) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x3e9c11) [0x7faf97f2bc11]
[bt] (2) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x9e351c) [0x7faf9852551c]
[bt] (3) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x2deed5a) [0x7faf9a930d5a]
[bt] (4) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x2df1704) [0x7faf9a933704]
[bt] (5) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(MXSymbolInferShape+0x15ba) [0x7faf9a89e40a]
[bt] (6) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7fafff188ec0]
[bt] (7) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call+0x22d) [0x7fafff18887d]
[bt] (8) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x4de) [0x7fafff39f8de]
[bt] (9) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(+0x9b31) [0x7fafff395b31]

commit head mxnet-cu90mkl 1.4.0.post0

mxnet-label-bot commented 5 years ago

Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Build

TaoLv commented 5 years ago

@Soonhwan-Kwon Could you provide a simple reproducer?

Also, have you tried sym = sym.get_backend_symbol('MKLDNN_FC') together with quantized_dtype='uint8' when calling quantize_model? See https://github.com/apache/incubator-mxnet/blob/master/example/quantization/imagenet_gen_qsym_mkldnn.py#L183

Soonhwan-Kwon commented 5 years ago

@TaoLv We had already tried your suggestion:

$ echo $MXNET_SUBGRAPH_BACKEND
MKLDNN

sym = sym.get_backend_symbol('MKLDNN')
sym = sym.get_backend_symbol('MKLDNN_FC')

and it produces the error below:

MXNetError Traceback (most recent call last)

in ()
      1 sym = sym.get_backend_symbol('MKLDNN')
----> 2 sym = sym.get_backend_symbol('MKLDNN_FC')

/home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/symbol/symbol.pyc in get_backend_symbol(self, backend)
   2454         """
   2455         out = SymbolHandle()
-> 2456         check_call(_LIB.MXGenBackendSubgraph(self.handle, c_str(backend), ctypes.byref(out)))
   2457         return Symbol(out)
   2458

/home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/base.pyc in check_call(ret)
    250     """
    251     if ret != 0:
--> 252         raise MXNetError(py_str(_LIB.MXGetLastError()))
    253
    254

MXNetError: [16:35:10] src/c_api/../operator/subgraph/subgraph_property.h:165: Check failed: it != prop_fn_map_.end() SubgraphProperty MKLDNN_FC is not found in SubgraphPropertyRegistry

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x3e95ea) [0x7fc8dad265ea]
[bt] (1) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x3e9c11) [0x7fc8dad26c11]
[bt] (2) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(MXGenBackendSubgraph+0x40b) [0x7fc8dd6911fb]
[bt] (3) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7fc94220bec0]
[bt] (4) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call+0x22d) [0x7fc94220b87d]
[bt] (5) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x4de) [0x7fc9424228de]
[bt] (6) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(+0x9b31) [0x7fc942418b31]
[bt] (7) /home/ubuntu/anaconda2/envs/mxnet_1_4/bin/../lib/libpython2.7.so.1.0(PyObject_Call+0x43) [0x7fc944fe2973]
[bt] (8) /home/ubuntu/anaconda2/envs/mxnet_1_4/bin/../lib/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x3bb9) [0x7fc945078d49]
[bt] (9) /home/ubuntu/anaconda2/envs/mxnet_1_4/bin/../lib/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7e9) [0x7fc94507e6c9]

We are working on a simple reproducer now and will post the code as soon as possible.

Soonhwan-Kwon commented 5 years ago

@TaoLv Also, quantized_dtype='uint8' produces the same error message as originally reported:

MXNetError: Error in operator quantized_fusedrnn_t134_i2h: [11:40:16] src/operator/quantization/quantized_fully_connected.cc:41: Check failed: !shape_is_none(in_shape->at(0)) QuantizedFullyConnectedOp input data shape must be given

TaoLv commented 5 years ago

May I know which version of MXNet you are using? MKL-DNN QFC was merged into master only recently. PR here: https://github.com/apache/incubator-mxnet/pull/14128

Soonhwan-Kwon commented 5 years ago

@TaoLv We tried version 1.4.0.post0, which predates that commit. We'll try the latest version as you suggested right away. Thank you.

pengzhao-intel commented 5 years ago

@Soonhwan-Kwon Thanks for reporting the issue.

Amagong commented 5 years ago

Hi. I ran into the same problem using "mxnet-cu90mkl 1.5.0b20190314".

First, I converted and saved a trained fused-rnn model.

import argparse
import os
import logging
import mxnet as mx
import gluoncv
from mxnet import gluon, nd, image
from gluoncv import utils
from gluoncv.model_zoo import get_model
from mxnet.contrib.quantization import *
from mxnet.base import SymbolHandle, check_call, _LIB, mx_uint, c_str_array
import ctypes

def save_symbol(fname, sym, logger=None):
    if logger is not None:
        logger.info('Saving symbol into file at %s' % fname)
    sym.save(fname)

def save_params(fname, arg_params, aux_params, logger=None):
    if logger is not None:
        logger.info('Saving params into file at %s' % fname)
    # Use mx.cpu() explicitly instead of relying on the star import above to expose cpu()
    save_dict = {('arg:%s' % k): v.as_in_context(mx.cpu()) for k, v in arg_params.items()}
    save_dict.update({('aux:%s' % k): v.as_in_context(mx.cpu()) for k, v in aux_params.items()})
    mx.nd.save(fname, save_dict)

logging.basicConfig()
logger = logging.getLogger('logger')
logger.setLevel(logging.INFO)

prefix = 'fused_rnn'
dir_path = './checkpoints/'
prefix = os.path.join(dir_path, prefix)
epoch = 173
batch_size = 900

ctx = mx.cpu(0)

# load and convert
sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)

sym = sym.get_backend_symbol('MKLDNN')
sym = sym.get_backend_symbol('MKLDNN_FC')

excluded_sym_names = []
excluded_sym_names += ['conv0']

logger.info('Quantizing FP32 model %s' % prefix)
qsym, qarg_params, aux_params = quantize_model(sym=sym, arg_params=arg_params, aux_params=aux_params, excluded_sym_names=excluded_sym_names, 
                                                ctx=ctx, calib_mode='none', quantized_dtype='uint8', logger=logger)

qsym = qsym.get_backend_symbol('MKLDNN_POST_QUANTIZE')
qsym = qsym.get_backend_symbol('MKLDNN_POST_FC_QUANTIZE')

sym_name = '%s-symbol.json' % (prefix + '-quantized')
param_name = '%s-%04d.params' % (prefix + '-quantized', epoch)

save_symbol(sym_name, qsym, logger)
save_params(param_name, qarg_params, aux_params, logger)

Then I loaded the converted symbol and params files.

import numpy as np
import mxnet as mx
import os

q_prefix = 'fused_rnn-quantized'
dir_path = './checkpoints/'
q_prefix = os.path.join(dir_path, q_prefix)
epoch = 173
batch_size = 900

contexts = [mx.context.Context('cpu')]

q_symbol_file = q_prefix + '-symbol.json'

q_symbol = mx.sym.load(q_symbol_file)

q_symbol.simple_bind(ctx=mx.cpu(), data=(900, 137, 9), category=(900, 2))

Calling simple_bind then fails with the error below:

RuntimeError: simple_bind error. Arguments:
category: (900, 2)
data: (900, 137, 9)
[20:25:39] src/executor/../common/exec_utils.h:392: InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x421cd2) [0x7ff6c90dfcd2]
[bt] (1) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x4222b8) [0x7ff6c90e02b8]
[bt] (2) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(+0x31a10f1) [0x7ff6cbe5f0f1]
[bt] (3) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(mxnet::exec::GraphExecutor::Init(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::less, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::unordered_map<std::string, mxnet::TShape, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor, std::unordered_map<nnvm::NodeEntry, mxnet::NDArray, nnvm::NodeEntryHash, nnvm::NodeEntryEqual, std::allocator<std::pair<nnvm::NodeEntry const, mxnet::NDArray> > > const&)+0x481) [0x7ff6cbe83101]
[bt] (4) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(mxnet::Executor::SimpleBind(nnvm::Symbol, mxnet::Context const&, std::map<std::string, mxnet::Context, std::less, std::allocator<std::pair<std::string const, mxnet::Context> > > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::vector<mxnet::Context, std::allocator > const&, std::unordered_map<std::string, mxnet::TShape, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::TShape> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::unordered_map<std::string, int, std::hash, std::equal_to, std::allocator<std::pair<std::string const, int> > > const&, std::vector<mxnet::OpReqType, std::allocator > const&, std::unordered_set<std::string, std::hash, std::equal_to, std::allocator > const&, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::vector<mxnet::NDArray, std::allocator >, std::unordered_map<std::string, mxnet::NDArray, std::hash, std::equal_to, std::allocator<std::pair<std::string const, mxnet::NDArray> > >, mxnet::Executor)+0x1d5) [0x7ff6cbe85835]
[bt] (5) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/site-packages/mxnet/libmxnet.so(MXExecutorSimpleBind+0x2260) [0x7ff6cbdd42f0]
[bt] (6) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c) [0x7ff736c98ec0]
[bt] (7) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/../../libffi.so.6(ffi_call+0x22d) [0x7ff736c9887d]
[bt] (8) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x4de) [0x7ff736eaf8de]
[bt] (9) /home/ubuntu/anaconda2/envs/mxnet_1_4/lib/python2.7/lib-dynload/_ctypes.so(+0x9b31) [0x7ff736ea5b31]

pengzhao-intel commented 5 years ago

@mxnet-label-bot add [MKLDNN, Quantization]

ciyongch commented 5 years ago

@Soonhwan-Kwon Is there any 0 dimension in the shape of the input data? Quantized FullyConnected requires that all dimensions of the input data be given.

pengzhao-intel commented 5 years ago

@Soonhwan-Kwon @Amagong could you provide a mini reproducible case so that we can help to resolve the issue?

Maybe you also need to patch https://github.com/apache/incubator-mxnet/pull/14466 after excluding the 0-dim layers.

Check failed: !shape_is_none(in_shape->at(0))

pengzhao-intel commented 5 years ago

The PR #14466 is merged. Please sync up the latest MXNet and build again.

Soonhwan-Kwon commented 5 years ago

@pengzhao-intel Thank you for the update. I'm rebuilding MXNet now; @Amagong and I are working on the same project. @ciyongch We excluded the embedding layer (which seems to have a 0 dimension), but it had no effect.

ciyongch commented 5 years ago

@Soonhwan-Kwon Are you still facing the "Check failed: !shape_is_none(in_shape->at(0)) QuantizedFullyConnectedOp input data shape must be given" error, or the "InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:" error? Could you provide a reproducer so that we can take a look? :)

Amagong commented 5 years ago

@ciyongch Thank you for your quick response. I'll check with the newly built version. And, I'll prepare a simple reproducer.

Amagong commented 5 years ago

There is currently an "InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:" error.

ciyongch commented 5 years ago

@Amagong This is still the same problem as the original one. Some layers have an input whose shape contains a 0 dimension, which is currently not supported by the quantized FullyConnected operator. Please check your current model and exclude all such layers; I guess they all come from the first time step. We're going to enhance the error message to help identify which operator is reporting this error.

anirudh2290 commented 5 years ago

can you try

q_symbol.infer_shape_partial(data=(900, 137, 9), category=(900, 2))

The first list should correspond to q_symbol.list_arguments(), the second to q_symbol.list_outputs(), and the third to q_symbol.list_auxiliary_states(). This should indicate which shape is missing.
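As a rough sketch of how that result could be read: assuming the three lists come back in the order described above, a small helper can pair names with shapes and report the entries that are still unknown. The helper `find_unknown_shapes` is hypothetical (not part of MXNet), and the sample names and shapes are made up for illustration.

```python
def find_unknown_shapes(names, shapes):
    """Return (name, shape) pairs whose shape is missing or contains a 0 dimension."""
    unknown = []
    for name, shape in zip(names, shapes):
        if shape is None or len(shape) == 0 or 0 in shape:
            unknown.append((name, tuple(shape) if shape is not None else None))
    return unknown


# Illustrative values only. With a real symbol these would come from
#   arg_shapes, out_shapes, aux_shapes = q_symbol.infer_shape_partial(...)
#   names = q_symbol.list_arguments()
names = ['data', 'gru_l0l0_begin_state_0', 'gru_l0l0_h2h_weight']
shapes = [(75, 10, 5), (0, 1024), (1024, 1024)]

for name, shape in find_unknown_shapes(names, shapes):
    print('%s: %s' % (name, shape))
```

The same helper can be applied to the output and auxiliary-state lists by passing the matching name list.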

Amagong commented 5 years ago

@ciyongch @anirudh2290 Thank you for your reply. @anirudh2290 'infer_shape_partial' runs without error, but I still cannot bind.

The error above was due to the use of fused-rnn. Below is a simple reproducer.

import math
import mxnet as mx
from mxnet.contrib.quantization import *

channel_num = 10
conv_layer_filter_dims = [2, 3]
conv_layer_strides = [1,1]
dimension = 5

data_len = 10

data = mx.sym.Variable('data')
label = mx.sym.Variable('label')

# layer stacking
net = mx.sym.Reshape(data=data, shape=(-4, -1, 1, 0, 0))

net = mx.sym.Convolution(data=net,
                         num_filter=channel_num,
                         kernel=tuple(conv_layer_filter_dims),
                         stride=tuple(conv_layer_strides),
                         weight=None,
                         bias=None,
                         no_bias=True,
                         cudnn_tune="fastest",
                         name="conv0")

net = mx.sym.BatchNorm(data=net,
                       eps=0.001,
                       momentum=0.9,
                       fix_gamma=False,
                       use_global_stats=False,
                       output_mean_var=False,
                       name="conv0_batchnorm"
                       )

data_lengths_references = int(math.floor((data_len - conv_layer_filter_dims[0]) / conv_layer_strides[0])) + 1

net = mx.sym.transpose(data=net, axes=(2, 0, 1, 3))
net = mx.sym.Reshape(data=net, shape=(0, 0, -3))

# Fused rnn :
stack = mx.rnn.FusedRNNCell(1024, num_layers=2, mode='rnn_relu', prefix='%s_l0' % ('gru'), bidirectional=False).unfuse()

# lstm : 
'''
stack = mx.rnn.SequentialRNNCell()
cell = mx.rnn.LSTMCell(num_hidden=1760, prefix='%s_l0l0_' % ('gru'))
stack.add(cell)
'''

# gru :
'''
stack = mx.rnn.SequentialRNNCell()
cell = mx.rnn.GRUCell(num_hidden=1760, prefix='%s_l0l0_' % ('gru'))
stack.add(cell)
'''

net, _ = stack.unroll(length=data_lengths_references,
                      inputs=net,
                      merge_outputs=False,
                      layout='TNC'
                     )

net = net[data_lengths_references-1]

net = mx.sym.FullyConnected(data=net, num_hidden=10, no_bias=False, name="classification_fc_layer")

net = mx.sym.SoftmaxOutput(data=net, label=label)

mod = net.simple_bind(ctx=mx.cpu(0), data=(75, data_len, dimension))

# convert to quantize model
net = net.get_backend_symbol('MKLDNN')
net = net.get_backend_symbol('MKLDNN_FC')

excluded_sym_names = []
excluded_sym_names += ['conv0']

arg_dict = mod.arg_dict
aux_dict = mod.aux_dict

arg_params = {}
aux_params = {}

for k, v in arg_dict.items():
    arg_params[k] = v
for k, v in aux_dict.items():
    aux_params[k] = v

qnet, qarg_params, qaux_params = quantize_model(sym=net, arg_params=arg_params, aux_params=aux_params,
         excluded_sym_names=excluded_sym_names, ctx=mx.cpu(0), calib_mode='none', quantized_dtype='uint8')

qnet = qnet.get_backend_symbol('MKLDNN_POST_QUANTIZE')
qnet = qnet.get_backend_symbol('MKLDNN_POST_FC_QUANTIZE')

print(qnet.infer_shape(data=(75, data_len, dimension)))
qnet.simple_bind(ctx=mx.cpu(0), data=(75, data_len, dimension))

When the lstm or fused-rnn block is used (uncommented), the following UserWarning occurs:

UserWarning: Cannot decide shape for the following arguments (0s in shape means unknown dimensions). Consider providing them as input:

And at bind time, the following error occurs:

InferShape pass cannot decide shapes for the following arguments (0s means unknown dimensions). Please consider providing them as inputs:

Amagong commented 5 years ago

The same error occurs when using RNNCell. There is no error when using GRUCell. Is there a problem with how I am using it?

stu1130 commented 5 years ago

@anirudh2290 and I tried to debug the issue. The shape of gru_l0l0_begin_state_0 in the graph is (0, 1024), and it is followed by quantized_fully_connected. The zero dimension of gru_l0l0 is not being inferred, and we need to dive deeper.

ciyongch commented 5 years ago

@Amagong Excluding the layers with a 0-dimension input will resolve this error. In your sample, the input to h2h in the first timestep (0) of every layer has a 0 in its shape, so just exclude these layers as below.

For Fused-rnn block:

 excluded_sym_names += ['conv0']
+excluded_sym_names += [
+  'gru_l0l0_t0_h2h',
+  'gru_l0l1_t0_h2h',
+  ]

For lstm block:

 excluded_sym_names += ['conv0']
+excluded_sym_names += [
+  'gru_l0l0_t0_h2h',
+  ]

For gru block, I noticed that there's another '_' in gru naming:

 excluded_sym_names += ['conv0']
+excluded_sym_names += [
+  'gru_l0l0_t0__h2h',
+  ]

Besides that, please change simple_bind() to bind(), since the quantized symbol requires quantized params (int8), while simple_bind() allocates default params, which are in fp32.

-qnet.simple_bind(ctx=mx.cpu(0), data=(75, data_len, dimension))
+mod = mx.mod.Module(symbol=qnet, context=mx.cpu(0), label_names=None)
+mod.bind(data_shapes=[('data', (75, data_len, dimension))], grad_req='null')
+mod.set_params(qarg_params, qaux_params)

Hope this helps you enable the case :)
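For models with many layers, the exclusion lists above could be built programmatically instead of by hand. The following is only a sketch: it assumes the layer names follow the patterns shown above ('..._t0_h2h', or '..._t0__h2h' with the extra underscore), and the helper name and the sample name list are hypothetical. With a real symbol, the candidate names would have to be collected from the graph first.

```python
import re

# Matches first-timestep hidden-to-hidden layers such as 'gru_l0l0_t0_h2h'
# and the double-underscore variant 'gru_l0l0_t0__h2h' noted for the gru block.
FIRST_STEP_H2H = re.compile(r'_t0_+h2h$')


def first_step_h2h_layers(layer_names):
    """Pick the layer names that look like timestep-0 h2h layers."""
    return [name for name in layer_names if FIRST_STEP_H2H.search(name)]


# Illustrative names only, following the naming seen in this thread.
layer_names = [
    'conv0',
    'gru_l0l0_t0_i2h', 'gru_l0l0_t0_h2h',
    'gru_l0l0_t1_i2h', 'gru_l0l0_t1_h2h',
    'gru_l0l1_t0_h2h', 'gru_l0l1_t1_h2h',
]

excluded_sym_names = ['conv0'] + first_step_h2h_layers(layer_names)
print(excluded_sym_names)
```

Anchoring the pattern on '_t0_' keeps later timesteps such as 't1' or 't10' quantized, so only the layers fed by the zero-shaped begin state are excluded.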

anirudh2290 commented 5 years ago

Thanks @ciyongch. Can you please let me know why quantized_fully_connected doesn't infer data dimension 0 from the output shape? For example, the following runs fine on fp32:

import mxnet as mx
qdtype="float32"
num_hidden=100
no_bias=False
flatten=True
x = mx.sym.var("x", dtype=qdtype)
qdata = mx.sym.Variable(name='qdata')#, shape=data_shape, dtype=qdtype)
qbias = mx.sym.Variable(name='qbias')#, shape=(10, 100), dtype=qdtype)
y = mx.sym.exp(x)
fc_fp32 = mx.sym.FullyConnected(data=qdata, num_hidden=num_hidden, no_bias=no_bias, flatten=flatten)
sum_first = mx.sym.elemwise_add(y, fc_fp32)
sum_first_1 = mx.sym.Group([sum_first, x, y])
ex = sum_first_1.simple_bind(mx.cpu(), qdata=(0, 1024), fullyconnected0_weight=(100, 1024), fullyconnected0_bias=(100,), x=(10, 100))
print(ex.arg_dict["qdata"].shape)

The expectation is that it should also run fine after quantization, but it fails at this check. Is there any reason why we can't remove the check here: https://github.com/apache/incubator-mxnet/blob/master/src/operator/quantization/quantized_fully_connected.cc#L50 and add inference from output to input, as in the non-quantized fully connected operator here: https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/fully_connected.cc#L78
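To make the request concrete, here is a simplified pure-Python illustration of inferring the unknown batch dimension of a 2-D fully connected input from a known output shape. It is not MXNet's shape-inference code, and the function name is hypothetical. It only mirrors the idea that a 0 in the data shape can be filled in backward when the output shape is fixed elsewhere in the graph.

```python
def infer_fc_shapes(data_shape, num_hidden, out_shape=None):
    """Infer (data, weight, out) shapes for a 2-D fully connected layer.

    A 0 in data_shape means "unknown". If the batch dimension is unknown but
    the output shape is known, the batch size is copied back from the output.
    """
    batch, in_units = data_shape
    if in_units == 0:
        raise ValueError('input feature dimension must be known')
    if batch == 0 and out_shape is not None and out_shape[0] != 0:
        batch = out_shape[0]  # backward inference: output -> input
    return (batch, in_units), (num_hidden, in_units), (batch, num_hidden)


# Forward only: the batch dimension stays unknown (0).
print(infer_fc_shapes((0, 1024), 100))
# With the output shape known, e.g. fixed by elemwise_add with x of shape (10, 100).
print(infer_fc_shapes((0, 1024), 100, out_shape=(10, 100)))
```

In the fp32 example above, the elemwise_add with x is what pins the output shape, which is why qdata=(0, 1024) can still be resolved there.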

ciyongch commented 5 years ago

@anirudh2290 The behavior has not changed since the initial version, and it looks like it throws many errors in the RNN domain. We will figure out the reason and see how to improve this :)

Amagong commented 5 years ago

Thanks @ciyongch. I followed your guide and no more errors occur. However, there are still some problems in successfully applying quantization to my code. I'll try various ways to apply it. Thank you.

ciyongch commented 5 years ago

@Amagong Glad to hear you're able to run quantization on the sample code. Please let us know if you meet other errors or failures in your real case. We're working on an enhancement for this limitation.

Amagong commented 5 years ago

@ciyongch In my case, inference is slow when using quantization. Originally it took 2 minutes 40 seconds; after quantization it takes 24 minutes.

I generate a network like the sample code above and use the 'quantize_model' function.

# generate symbol
net = gen_sym(data_len)

net = net.get_backend_symbol('MKLDNN')
net = net.get_backend_symbol('MKLDNN_FC')

excluded_sym_names = []

excluded_sym_names += ['conv0']
excluded_sym_names += ['gru_l0l0_t0_h2h']
excluded_sym_names += ['gru_l0l1_t0_h2h']

save_dict = mx.nd.load('original_model.params')

arg_params = {}
aux_params = {}

for k, v in save_dict.items():
    tp, name = k.split(':', 1)
    if tp == 'arg':
        arg_params[name] = v
    if tp == 'aux':
        aux_params[name] = v

qnet, qarg_params, qaux_params = quantize_model(sym=net, arg_params=arg_params, aux_params=aux_params,
 excluded_sym_names=excluded_sym_names, ctx=mx.cpu(0), calib_mode='none', quantized_dtype='uint8')

qnet = qnet.get_backend_symbol('MKLDNN_POST_QUANTIZE')
qnet = qnet.get_backend_symbol('MKLDNN_POST_FC_QUANTIZE')

return qnet

And set parameters as below.

_, arg_params, aux_params = mx.model.load_checkpoint('quantized model path', model_epoch_num)
model.set_params(arg_params, aux_params)

I use this structure because input data length is variable.

When I run the inference code as above, it runs without any problem, but it is too slow. I'm looking for a problem in my code. Can I get some advice?

Amagong commented 5 years ago

I'm using 'FusedRNNCell'

ZhennanQin commented 5 years ago

@Amagong The main reason is that you're using a quantized model without calibration information. This results in online calibration, which slows down performance dramatically. To get the full speed of a quantized model, we suggest using one of the calib_mode options (naive or entropy).
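To illustrate why missing calibration is costly, here is a minimal pure-Python sketch of the idea. It is not MXNet's implementation, and all function names are hypothetical. Without calibration, the min/max range of each tensor has to be found at run time on every forward pass. A naive-style calibration computes the range once from sample data and reuses it.

```python
def minmax_range(batches):
    """Naive-style calibration: one min/max over all calibration batches."""
    values = [v for batch in batches for v in batch]
    return min(values), max(values)


def quantize_uint8(values, lo, hi):
    """Map floats in [lo, hi] to integers in [0, 255], clipping outliers."""
    scale = 255.0 / (hi - lo) if hi > lo else 1.0
    return [int(round((min(max(v, lo), hi) - lo) * scale)) for v in values]


# Offline: compute the range once from (made-up) calibration data.
calib_batches = [[0.0, 0.5, 1.0], [0.25, 2.0]]
lo, hi = minmax_range(calib_batches)

# Online: each inference batch reuses the stored range, so no extra
# pass over the data is needed to find min/max at run time.
print(quantize_uint8([0.0, 1.0, 2.0, 3.0], lo, hi))
```

Values outside the calibrated range are clipped, which is the usual trade-off: the calibration data should be representative of the inputs seen at inference time.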

Amagong commented 5 years ago

@ZhennanQin Thank you for your advice! I'll try the way you told me.

pengzhao-intel commented 5 years ago

@Amagong @Soonhwan-Kwon Did you get the expected results? We'd like to get some feedback so that we can keep improving the INT8 flow and quality :)

pengzhao-intel commented 5 years ago

PR #15031 will fix this issue

pengzhao-intel commented 5 years ago

Closing the issue since the PR is merged. Feel free to reopen if you see the issue again.