Zhen-Dong / HAWQ

Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.
MIT License

Could HAWQ's result be imported into tensorRT? #3

Closed: leiwen83 closed this issue 2 years ago

leiwen83 commented 3 years ago

Hi,

Has the inference speed of the TVM result been compared with its TensorRT counterpart? TensorRT's CNN kernels are known to reach close to the hardware's peak speed.

Thx, Lei

zachzzc commented 3 years ago

Hi Lei,

Our TVM implementation is meant to show the overall speed-up and accuracy we can get. The overall speed for int8 is still slower than TensorRT because the TVM Tensor Core convolution schedule is suboptimal. Currently int4 convolution is not available in TensorRT, and TensorRT is not open source. That's why we chose TVM to measure the speed-up of int4 over int8. When int4 becomes available in TensorRT, we will import the trained weights and benchmark it.

Thanks, Zach

leiwen83 commented 3 years ago

I see. I could help port int8 to TensorRT as a first step. But when I tried to run the TVM inference, I hit the error below. Which version of TVM are you using to run the benchmark?

python hawq_utils_resnet50.py --model-dir data/
Traceback (most recent call last):
  File "hawq_utils_resnet50.py", line 9, in <module>
    from mixed_precision_models.layers import QConfig, QuantizeContext
  File "/data/dev/quant/hawq/tvm_benchmark/mixed_precision_models/__init__.py", line 1, in <module>
    from . import layers
  File "/data/dev/quant/hawq/tvm_benchmark/mixed_precision_models/layers.py", line 12, in <module>
    defaults=('int32', 65.0, 0.0, 'int8', 8.0, 0.0, 'int8', 8.0, 0.0, 'int32', 74.0, 0.0))
TypeError: namedtuple() got an unexpected keyword argument 'defaults'
zachzzc commented 3 years ago

Hi Lei,

What Python version are you using? It looks like a mismatch in the namedtuple declaration: the defaults keyword argument of namedtuple() was added in Python 3.7. I am using Python 3.7.4 and don't see this error.

Zach
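
For anyone hitting the same TypeError: the defaults keyword of collections.namedtuple exists only on Python 3.7 and later. Below is a minimal sketch of a declaration that tolerates both cases. The field names and values are hypothetical and are not the actual QConfig definition in layers.py.

```python
import sys
from collections import namedtuple

# Hypothetical fields and defaults, for illustration only;
# the real QConfig in layers.py has more fields.
FIELDS = ("dtype", "bits", "offset")
DEFAULTS = ("int8", 8.0, 0.0)


def make_namedtuple(name, fields, defaults):
    """Create a namedtuple with default values on old and new Python 3."""
    if sys.version_info >= (3, 7):
        # Python 3.7+ accepts the defaults keyword directly.
        return namedtuple(name, fields, defaults=defaults)
    # Older Python 3: attach the defaults to the generated __new__ instead.
    cls = namedtuple(name, fields)
    cls.__new__.__defaults__ = tuple(defaults)
    return cls


ExampleConfig = make_namedtuple("ExampleConfig", FIELDS, DEFAULTS)
print(ExampleConfig())          # every field takes its default
print(ExampleConfig(bits=4.0))  # override a single field
```

Upgrading to Python 3.7 or later, as suggested above, avoids the need for the fallback branch altogether.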

leiwen83 commented 3 years ago

Hi Zach,

I switched to Python 3.7, but hit a new error:

(512, 256, 1, 1)
module.stage4.unit2.quant_convbn1.weight_integer (512, 512, 3, 3)
module.stage4.unit2.quant_convbn2.weight_integer (512, 512, 3, 3)
module.quant_output.weight_integer (1000, 512)
Traceback (most recent call last):
  File "hawq_utils_resnet50.py", line 499, in <module>
    save_weights(save_path, kernel_dtype, num_stages, units)
  File "hawq_utils_resnet50.py", line 136, in save_weights
    renamed_params['conv0_weight'] = params['module.quant_init_convbn.weight_integer']
KeyError: 'module.quant_init_convbn.weight_integer'

zachzzc commented 3 years ago

Which weight checkpoint are you using?
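
A KeyError like the one above suggests that the checkpoint's parameter names differ from what the export script expects. Here is a small sketch for checking this before conversion. The params dictionary is a hand-built stand-in for a loaded checkpoint, and one key name in it is made up for illustration.

```python
def missing_keys(params, required):
    """Return the required keys that are absent from a parameter dict."""
    return [key for key in required if key not in params]


# Stand-in for a loaded checkpoint dict (real values would be tensors).
params = {
    "module.quant_output.weight_integer": None,
    "module.some_other_layer.weight_integer": None,  # made-up name
}

required = [
    "module.quant_init_convbn.weight_integer",  # key named in the traceback
    "module.quant_output.weight_integer",
]

absent = missing_keys(params, required)
if absent:
    print("Checkpoint is missing keys:", absent)
```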

leiwen83 commented 3 years ago

I was using a checkpoint created by local training. After downloading the checkpoint from the model zoo, it seems to work now.

However, there is still a problem in inference:

  File "test_resnet_inference.py", line 23, in <module>
    import hawq_utils
ModuleNotFoundError: No module named 'hawq_utils'

I haven't found this module anywhere in the repo. Does it come from another repository?

zachzzc commented 3 years ago

I've updated test_resnet_inference.py. Please pull the updates and try again.

leiwen83 commented 3 years ago

I get a new error...

python3.7 test_resnet_inference_time.py 
Traceback (most recent call last):

  File "test_resnet_inference_time.py", line 178, in <module>
    debug_unit=args.debug_unit)

  File "/data/dev/quant/hawq/tvm_benchmark/mixed_precision_models/quantized_resnet_v1.py", line 614, in get_workload
    **kwargs)

  File "/data/dev/quant/hawq/tvm_benchmark/mixed_precision_models/quantized_resnet_v1.py", line 557, in get_net
    with_softmax=with_softmax)

  File "/data/dev/quant/hawq/tvm_benchmark/mixed_precision_models/quantized_resnet_v1.py", line 362, in qnn_resnet_v1
    data_layout=_data_layout, kernel_layout=kernel_layout)

  File "/data/dev/quant/hawq/tvm_benchmark/mixed_precision_models/layers.py", line 122, in quantized_conv2d
    kernel_size=kernel_size, channels=output_channels, data_layout=data_layout, kernel_layout=kernel_layout, strides=strides, padding=padding, **kwargs)

  File "/data/dev/inference/tvm/python/tvm/relay/qnn/op/qnn.py", line 278, in conv2d
    data_layout, kernel_layout, out_layout, out_dtype)

  File "/data/dev/inference/tvm/python/tvm/_ffi/_ctypes/function.py", line 207, in __call__
    raise get_last_ffi_error()

tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (4) /data/dev/inference/tvm/build/libtvm.so(TVMFuncCall+0x61) [0x7f517f78ea51]
  [bt] (3) /data/dev/inference/tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), void tvm::runtime::TypedPackedFunc<tvm::relay::Expr (tvm::relay::Expr, tvm::relay::Expr, int, int, double, double, tvm::Array<tvm::Expr, void>, tvm::Array<tvm::Expr, void>, tvm::Array<tvm::Expr, void>, int, tvm::Expr, tvm::Array<tvm::Expr, void>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::DataType)>::AssignTypedLambda<tvm::relay::Expr (*)(tvm::relay::Expr, tvm::relay::Expr, int, int, double, double, tvm::Array<tvm::Expr, void>, tvm::Array<tvm::Expr, void>, tvm::Array<tvm::Expr, void>, int, tvm::Expr, tvm::Array<tvm::Expr, void>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::DataType)>(tvm::relay::Expr (*)(tvm::relay::Expr, tvm::relay::Expr, int, int, double, double, tvm::Array<tvm::Expr, void>, tvm::Array<tvm::Expr, void>, tvm::Array<tvm::Expr, void>, int, tvm::Expr, tvm::Array<tvm::Expr, void>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::DataType))::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0x25d) [0x7f517f6efcdd]
  [bt] (2) /data/dev/inference/tvm/build/libtvm.so(void tvm::runtime::detail::unpack_call_dispatcher<tvm::relay::Expr, 0, 16, tvm::relay::Expr (*)(tvm::relay::Expr, tvm::relay::Expr, int, int, double, double, tvm::Array<tvm::Expr, void>, tvm::Array<tvm::Expr, void>, tvm::Array<tvm::Expr, void>, int, tvm::Expr, tvm::Array<tvm::Expr, void>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::DataType)>::run<tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue, tvm::runtime::TVMArgValue>(tvm::relay::Expr (* const&)(tvm::relay::Expr, tvm::relay::Expr, int, int, double, double, tvm::Array<tvm::Expr, void>, tvm::Array<tvm::Expr, void>, tvm::Array<tvm::Expr, void>, int, tvm::Expr, tvm::Array<tvm::Expr, void>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::DataType), tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&, 
tvm::runtime::TVMArgValue&&, tvm::runtime::TVMArgValue&&)+0x19e) [0x7f517f6ef71e]
  [bt] (1) /data/dev/inference/tvm/build/libtvm.so(tvm::runtime::TVMPODValue_::operator double() const+0x159) [0x7f517eff02d9]
  [bt] (0) /data/dev/inference/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x32) [0x7f517ef9fe72]
  File "/data/dev/inference/tvm/include/tvm/runtime/packed_func.h", line 447
TVMError: Check failed: type_code_ == kDLFloat (8 vs. 2) : expected float but get ObjectCell
leiwen83 commented 3 years ago

When I tried to modify test_resnet_inference.py directly, the way you did for test_resnet_inference_time.py, it reported that ./data/input_image_batch_1.npy was missing. Generating that file requires passing --with-featuremap to hawq_utils_resnet50.py.

But --with-featuremap requires ./data/input_image.pth.tar, which is missing from the original model zoo...

zachzzc commented 3 years ago

Are you using the TVM under the HAWQ repo? You need to use that one.

The checkpoint doesn't contain an input image now. If you want to use test_resnet_inference.py, you can create your own image and save it as input_image_batch_1.npy. I will check in an image as a demo.

leiwen83 commented 3 years ago

After switching to the TVM in the HAWQ repo, the previous error seems to have gone away, but a new one appears... How about creating a Dockerfile to describe your working environment, for example with nvcr.io/nvidia/pytorch:20.12-py3 or similar as the base image?

...100%, 0.40 MB, 463 KB/s, 0 seconds passed
Traceback (most recent call last):

  File "test_resnet_inference_time.py", line 232, in <module>
    graph, lib, params = relay.build(func, target=TARGET_NAME, params=params)

  File "/data/dev/quant/hawq/tvm/python/tvm/relay/build_module.py", line 251, in build
    graph_json, mod, params = bld_mod.build(mod, target, target_host, params)

  File "/data/dev/quant/hawq/tvm/python/tvm/relay/build_module.py", line 120, in build
    self._build(mod, target, target_host)

  File "/data/dev/quant/hawq/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 219, in __call__
    raise get_last_ffi_error()

tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /data/dev/quant/hawq/tvm/build/libtvm.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::relay::backend::RelayBuildModule::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#3}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0x17) [0x7f76570d2527]
  [bt] (7) /data/dev/quant/hawq/tvm/build/libtvm.so(tvm::relay::backend::RelayBuildModule::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#3}::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const+0x191) [0x7f76570d2431]
  [bt] (6) /data/dev/quant/hawq/tvm/build/libtvm.so(tvm::relay::backend::RelayBuildModule::BuildRelay(tvm::IRModule, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::NDArray, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, tvm::runtime::NDArray> > > const&)+0x7b9) [0x7f76570d1849]
  [bt] (5) /data/dev/quant/hawq/tvm/build/libtvm.so(tvm::build(tvm::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::IRModule, void, void> const&, tvm::Target const&, tvm::BuildConfig const&)+0x4e9) [0x7f7656c668f9]
  [bt] (4) /data/dev/quant/hawq/tvm/build/libtvm.so(tvm::build(tvm::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&, tvm::BuildConfig const&)+0x275) [0x7f7656c653c5]
  [bt] (3) /data/dev/quant/hawq/tvm/build/libtvm.so(tvm::codegen::Build(tvm::IRModule, tvm::Target const&)+0x239) [0x7f7656ca8a89]
  [bt] (2) /data/dev/quant/hawq/tvm/build/libtvm.so(void tvm::runtime::detail::unpack_call<tvm::runtime::Module, 2, tvm::runtime::Module (*)(tvm::IRModule, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)>(tvm::runtime::Module (* const&)(tvm::IRModule, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >), tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)+0x18b) [0x7f7656ccb7eb]
  [bt] (1) /data/dev/quant/hawq/tvm/build/libtvm.so(tvm::codegen::BuildCUDA(tvm::IRModule, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)+0xd21) [0x7f7657173011]
  [bt] (0) /data/dev/quant/hawq/tvm/build/libtvm.so(+0xd7e69b) [0x7f76571f369b]
  File "/data/dev/quant/hawq/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 78, in cfun
    rv = local_pyfunc(*pyargs)
  File "/data/dev/quant/hawq/tvm/python/tvm/autotvm/measure/measure_methods.py", line 599, in tvm_callback_cuda_compile
    ptx = nvcc.compile_cuda(code, target=target, arch=AutotvmGlobalScope.current.cuda_target_arch)
  File "/data/dev/quant/hawq/tvm/python/tvm/contrib/nvcc.py", line 103, in compile_cuda
    raise RuntimeError(msg)
RuntimeError: Compilation error:
/tmp/tmpstr8srtm/my_kernel.cu(18): error: name followed by "::" must be a class or namespace name

/tmp/tmpstr8srtm/my_kernel.cu(19): error: name followed by "::" must be a class or namespace name

/tmp/tmpstr8srtm/my_kernel.cu(38): error: incomplete type is not allowed

/tmp/tmpstr8srtm/my_kernel.cu(40): error: name followed by "::" must be a class or namespace name

/tmp/tmpstr8srtm/my_kernel.cu(40): error: incomplete type is not allowed

/tmp/tmpstr8srtm/my_kernel.cu(41): error: name followed by "::" must be a class or namespace name

/tmp/tmpstr8srtm/my_kernel.cu(41): error: incomplete type is not allowed
zachzzc commented 3 years ago

What you suggest is good. We will create a Dockerfile to make life easier.

I suspect the error is caused by the CUDA version. We are using CUDA 10.2. The TVM instructions here describe the detailed environment.
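
To compare a local toolchain against the CUDA 10.2 mentioned above, one option is to read the release number that nvcc reports. This is a rough sketch. It assumes nvcc is on PATH and that its output contains a "release X.Y" string, as current toolkits print. The helper names are illustrative and not part of HAWQ or TVM.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_text):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Return the local nvcc release as (major, minor), or None if unavailable."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # no CUDA compiler found on PATH
    result = subprocess.run(
        [nvcc, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_nvcc_release(result.stdout)


# A line in the format nvcc prints, used here so the parser can be tried offline.
sample = "Cuda compilation tools, release 10.2, V10.2.89"
print(parse_nvcc_release(sample))
print(installed_cuda_release())
```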

haibao-yu commented 3 years ago

> Are you using the TVM under the HAWQ repo? You need to use that one.
>
> The checkpoint doesn't contain an input image now. If you want to use test_resnet_inference.py, you can create your own image and save it as input_image_batch_1.npy. I will check in an image as a demo.

Hi, how should the input image be processed? I see that quant_train.py normalizes the input image as a preprocessing step.

haibao-yu commented 3 years ago

Hi, there is an error when running "python hawq_utils_resnet50.py --model-dir ./data/resnet50_uniform8 --with-featuremap":

dict_keys(['convbn_scaling_factor', 'fc_scaling_factor', 'weight_integer', 'bias_integer', 'act_scaling_factor'])
(886, 1604, 3)
Traceback (most recent call last):

  File "hawq_utils_resnet50.py", line 494, in <module>
    feature_map = torch.load(featuremap_name)['featuremap']

  File "/home/yuhaibao94/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 584, in load
    with _open_file_like(f, 'rb') as opened_file:

  File "/home/yuhaibao94/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 234, in _open_file_like
    return _open_file(name_or_buffer, mode)

  File "/home/yuhaibao94/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 215, in __init__
    super(_open_file, self).__init__(open(name, mode))

FileNotFoundError: [Errno 2] No such file or directory: './data/resnet50_uniform8/featuremaps.pth.tar'
haibao-yu commented 3 years ago

> I was using a checkpoint created by local training. After downloading the checkpoint from the model zoo, it seems to work now.
>
> However, there is still a problem in inference:
>
>   File "test_resnet_inference.py", line 23, in <module>
>     import hawq_utils
> ModuleNotFoundError: No module named 'hawq_utils'
>
> I haven't found this module anywhere in the repo. Does it come from another repository?

I think we should change "import hawq_utils" to "import hawq_utils_resnet50 as hawq_utils".
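
The one-line alias is the simplest fix. Where a script has to run against revisions that use either module name, a fallback import is another option. This is only a sketch. The helper name is made up, and the demo call uses a standard-library module as a stand-in so that it runs outside the repo.

```python
import importlib


def import_first_available(*module_names):
    """Import and return the first module in the list that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate name
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(module_names)
    )


# Inside tvm_benchmark this would be called as:
#   hawq_utils = import_first_available("hawq_utils", "hawq_utils_resnet50")
# Stand-in demo: the first name does not exist, so the second one is used.
utils = import_first_available("hawq_utils_missing_example", "json")
print(utils.__name__)
```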

zachzzc commented 3 years ago

> > Are you using the TVM under the HAWQ repo? You need to use that one. The checkpoint doesn't contain an input image now. If you want to use test_resnet_inference.py, you can create your own image and save it as input_image_batch_1.npy. I will check in an image as a demo.
>
> Hi, how should the input image be processed? I see that quant_train.py normalizes the input image as a preprocessing step.

Right, the input image needs to be preprocessed and saved as input_image_batch_1.npy.
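
As a rough illustration of that step, the sketch below normalizes an image and writes it out with NumPy. Several things here are assumptions rather than facts from the repo: the ImageNet mean and standard deviation, the 224 x 224 size, the 1 x C x H x W float32 layout, and the output directory. Check quant_train.py and test_resnet_inference.py for the values and layout they actually use.

```python
import os
import tempfile

import numpy as np

# Assumed ImageNet statistics; confirm against quant_train.py.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(image_hwc_uint8):
    """Scale to [0, 1], normalize per channel, return a 1 x C x H x W batch."""
    scaled = image_hwc_uint8.astype(np.float32) / 255.0
    normalized = (scaled - MEAN) / STD         # broadcasts over the channel axis
    chw = np.transpose(normalized, (2, 0, 1))  # HWC -> CHW (assumed layout)
    return chw[np.newaxis, ...]


# Random pixels stand in for a real, already resized 224 x 224 RGB image.
rng = np.random.RandomState(0)
image = rng.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)

batch = preprocess(image)
# A temporary directory is used here; the scripts above look under ./data/.
out_path = os.path.join(tempfile.mkdtemp(), "input_image_batch_1.npy")
np.save(out_path, batch)
print(batch.shape, batch.dtype, out_path)
```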

zachzzc commented 3 years ago

> Hi, there is an error when running "python hawq_utils_resnet50.py --model-dir ./data/resnet50_uniform8 --with-featuremap":
>
> dict_keys(['convbn_scaling_factor', 'fc_scaling_factor', 'weight_integer', 'bias_integer', 'act_scaling_factor'])
> (886, 1604, 3)
> Traceback (most recent call last):
>
>   File "hawq_utils_resnet50.py", line 494, in <module>
>     feature_map = torch.load(featuremap_name)['featuremap']
>
>   File "/home/yuhaibao94/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 584, in load
>     with _open_file_like(f, 'rb') as opened_file:
>
>   File "/home/yuhaibao94/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 234, in _open_file_like
>     return _open_file(name_or_buffer, mode)
>
>   File "/home/yuhaibao94/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 215, in __init__
>     super(_open_file, self).__init__(open(name, mode))
>
> FileNotFoundError: [Errno 2] No such file or directory: './data/resnet50_uniform8/featuremaps.pth.tar'

We didn't save intermediate feature maps in this checkpoint, which is why featuremaps.pth.tar is missing.