rmccorm4 / tensorrt-utils

⚡ Useful scripts when using TensorRT
Apache License 2.0
237 stars, 55 forks

Generating int8 Calibration File for ResNet-18 Caffe Model #18

Open Hassan313 opened 4 years ago

Hassan313 commented 4 years ago

Hi Ryan,

I am trying to create a calibration file for the ResNet-18 Caffe model. You mentioned the following in another issue:

I have created a reference for INT8 calibration on Imagenet-like data. Hopefully you can use this as a starting point.

However, I do not know how to continue, since this is different from the sample.py and calibrator.py in the TensorRT 7.0 repository (tensorrt/samples/python/int8_caffe_mnist/).

Note: I am working on the NVDLA accelerator, and unfortunately its compiler only accepts Caffe models. The developers have stated that they will add ONNX support in a future release, so I have no choice but to work with Caffe models until then.

Thank you very much.

rmccorm4 commented 4 years ago

Hi @Hassan313 ,

If you change the ONNX parser code to use the Caffe parser instead, I believe you should be able to use the INT8 code as is. I'm assuming ResNet-18 has the same input/output shapes as other ImageNet models.

If you have any specific problems or errors, please share them.

Hassan313 commented 4 years ago

Hi @rmccorm4 ,

Thank you for your response.

I am now trying part 3, as below:

[screenshot: the command being run]

When I try to run the command, I get the below error:

ERROR: Failed to parse the ONNX file: resnet50/model.onnx
In node -1 (importModel): INVALID_VALUE: Assertion failed: !_importer_ctx.network()->hasImplicitBatchDimension() && "This version of the ONNX parser only supports TensorRT INetworkDefinitions with an explicit batch dimension. Please ensure the network was created using the EXPLICIT_BATCH NetworkDefinitionCreationFlag."

What should I do to eliminate the problem?

Thank you very much.

rmccorm4 commented 4 years ago

Hi @Hassan313 ,

Per the error, it looks like you're still using the ONNX parser. Did you change the code to use the Caffe parser as suggested above?

Hassan313 commented 4 years ago

Hi @rmccorm4 ,

I am first trying to see whether I can run the original code; I have not yet changed the ONNX parser to the Caffe parser.

Can you kindly help with this error first? Or do you suggest I go straight to changing the parser?

Thank you very much.

rmccorm4 commented 4 years ago

If you're actually using an ONNX model and are using TensorRT >= 7.0, you'll need to add the --explicit-batch flag when running the script.

The README instructions are a little outdated (they are based on the 19.10 container, which shipped TensorRT 6 and did not have this restriction).

Hassan313 commented 4 years ago

@rmccorm4 That solved the problem. I can successfully run the below:

[screenshot: successful run of the script]

rmccorm4 commented 4 years ago

FYI, that infer_tensorrt script is definitely out of date for TensorRT 7 ONNX models. Since the EXPLICIT_BATCH flag is used, the batch size dimension of the original ONNX model will be used, and the script will likely produce garbage data for batch sizes different from the one baked into the ONNX model.

However, if you move forward with a Caffe model (which doesn't support explicit batch, and instead uses implicit batch), then that infer_tensorrt script will likely work as intended for various batch sizes.

Hassan313 commented 4 years ago

@rmccorm4 Hi Ryan,

I have changed the ONNX parser to the Caffe parser. Below is the part of the code that I changed:

# Building engine
with trt.Builder(TRT_LOGGER) as builder, \
     builder.create_network(network_flags) as network, \
     builder.create_builder_config() as config, \
     trt.CaffeParser() as parser:

    config.max_workspace_size = 2**30 # 1GiB

    model_tensors = parser.parse(
        deploy="/home/hassan/tensorrt-utils/classification/imagenet/resnet18/deploy.txt",
        model="/home/hassan/tensorrt-utils/classification/imagenet/resnet18/resnet-18.caffemodel",
        network=network,
        dtype=trt.float32)
    network.mark_output(model_tensors.find("prob")
    # Set Builder Config Flags
    for flag in builder_flag_map:
        if getattr(args, flag):
            logger.info("Setting {}".format(builder_flag_map[flag]))
            config.set_flag(builder_flag_map[flag])

I am getting the below error:

File "onnx_to_tensorrt.py", line 169
    for flag in builder_flag_map:
      ^
SyntaxError: invalid syntax

Can you kindly help?

Thank you very much.

rmccorm4 commented 4 years ago

This is just a Python syntax error, not specific to TensorRT.

Looks like you're missing a closing parenthesis at the end of network.mark_output(model_tensors.find("prob"))

Hassan313 commented 4 years ago

@rmccorm4 Thank you for your reply. Sorry, it was my syntax mistake.

Now I am able to run the code, however I am getting the below logs:

[screenshot: build logs from the run]

And the calibration file is not getting created.

Can you kindly help?

Thank you very much.

rmccorm4 commented 4 years ago

I believe the Caffe parser uses a different syntax than the ONNX parser used here: https://www.github.com/rmccorm4/tensorrt-utils/tree/8dcd18c5c88f35bdb04e42e46b46862d81c36230/classification%2Fimagenet%2Fonnx_to_tensorrt.py

Although I notice you have some Caffe parser code in your snippet above, so perhaps you're calling the parser a second time incorrectly in the code block I linked.

Please refer to a Caffe python sample for how to use the Caffe parser.

Hassan313 commented 4 years ago

@rmccorm4 Thank you for your reply.

I am now able to run the code with the Caffe model of ResNet-18. However, there are many "unnamed layers" in the file. How can I fix this problem?

Below is what I am getting:

TRT-7000-EntropyCalibration2 data: 3caa54fc (Unnamed Layer 0) [Convolution]_output: 3d6969df (Unnamed Layer 1) [Scale]_output: 3ac9c5e4 (Unnamed Layer 2) [Scale]_output: 3bb23f5c conv1: 3bb2132a pool1: 3bb2132a (Unnamed Layer 5) [Convolution]_output: 3b0f3403 (Unnamed Layer 6) [Scale]_output: 3c19f4a8 res2a_branch1: 3b39bf9e (Unnamed Layer 8) [Convolution]_output: 3b64b387 (Unnamed Layer 9) [Scale]_output: 3c34ac95 (Unnamed Layer 10) [Scale]_output: 3b309597 res2a_branch2a: 3b21cb45 (Unnamed Layer 12) [Convolution]_output: 3acc6b2e (Unnamed Layer 13) [Scale]_output: 3cb1a784 res2a_branch2b: 3b5eba4f (Unnamed Layer 15) [ElementWise]_output: 3ba75a96 res2a: 3b73bec7 (Unnamed Layer 17) [Convolution]_output: 3b8658ae (Unnamed Layer 18) [Scale]_output: 3cce329f (Unnamed Layer 19) [Scale]_output: 3b9b34cf res2b_branch2a: 3b6795f7 (Unnamed Layer 21) [Convolution]_output: 3b1fb3bc (Unnamed Layer 22) [Scale]_output: 3d1fdc85 res2b_branch2b: 3c0d6bd4 (Unnamed Layer 24) [ElementWise]_output: 3ba9e0a1 res2b: 3ba9e0a1 (Unnamed Layer 26) [Convolution]_output: 3b16170a (Unnamed Layer 27) [Scale]_output: 3cf6451c res3a_branch1: 3b263b66 (Unnamed Layer 29) [Convolution]_output: 3ba9a8f5 (Unnamed Layer 30) [Scale]_output: 3cc67d8a (Unnamed Layer 31) [Scale]_output: 3ba7f0eb res3a_branch2a: 3b86b13f (Unnamed Layer 33) [Convolution]_output: 3ae5e952 (Unnamed Layer 34) [Scale]_output: 3cb6efb3 res3a_branch2b: 3ba33981 (Unnamed Layer 36) [ElementWise]_output: 3bc24e1f res3a: 3ba9509c (Unnamed Layer 38) [Convolution]_output: 3b66809e (Unnamed Layer 39) [Scale]_output: 3ce33610 (Unnamed Layer 40) [Scale]_output: 3b777039 res3b_branch2a: 3b75afc6 (Unnamed Layer 42) [Convolution]_output: 3afb4c74 (Unnamed Layer 43) [Scale]_output: 3d124362 res3b_branch2b: 3baee2bd (Unnamed Layer 45) [ElementWise]_output: 3bda3b92 res3b: 3c232aa1 (Unnamed Layer 47) [Convolution]_output: 3acb6310 (Unnamed Layer 48) [Scale]_output: 3cd0b5d2 res4a_branch1: 3b0e18b5 (Unnamed Layer 50) [Convolution]_output: 
3b8f205a (Unnamed Layer 51) [Scale]_output: 3ceb05ce (Unnamed Layer 52) [Scale]_output: 3b92b025 res4a_branch2a: 3b7460fa (Unnamed Layer 54) [Convolution]_output: 3b08f7e1 (Unnamed Layer 55) [Scale]_output: 3cc4e16c res4a_branch2b: 3b7876f9 (Unnamed Layer 57) [ElementWise]_output: 3ba01f39 res4a: 3b9b7d97 (Unnamed Layer 59) [Convolution]_output: 3b621e55 (Unnamed Layer 60) [Scale]_output: 3d01f964 (Unnamed Layer 61) [Scale]_output: 3b8113fa res4b_branch2a: 3b75dd91 (Unnamed Layer 63) [Convolution]_output: 3ad95424 (Unnamed Layer 64) [Scale]_output: 3ce10797 res4b_branch2b: 3b9ab9fb (Unnamed Layer 66) [ElementWise]_output: 3b9f849e res4b: 3b9e4632 (Unnamed Layer 68) [Convolution]_output: 3a41c535 (Unnamed Layer 69) [Scale]_output: 3ccc7fd8 res5a_branch1: 3b3127c4 (Unnamed Layer 71) [Convolution]_output: 3b236e54 (Unnamed Layer 72) [Scale]_output: 3cd32a30 (Unnamed Layer 73) [Scale]_output: 3b91769c res5a_branch2a: 3b365c95 (Unnamed Layer 75) [Convolution]_output: 3b00c450 (Unnamed Layer 76) [Scale]_output: 3c90df16 res5a_branch2b: 3bae948a (Unnamed Layer 78) [ElementWise]_output: 3bd0697d res5a: 3ba310bb (Unnamed Layer 80) [Convolution]_output: 3b0316c2 (Unnamed Layer 81) [Scale]_output: 3cbd74e6 (Unnamed Layer 82) [Scale]_output: 3b425e32 res5b_branch2a: 3b04391a (Unnamed Layer 84) [Convolution]_output: 3a8278a9 (Unnamed Layer 85) [Scale]_output: 3d572bce res5b_branch2b: 3d8a603c (Unnamed Layer* 87) [ElementWise]_output: 3d80bed8 res5b: 3da2229e pool5: 3da2229e fc1000: 3d3e8d71 prob: 392964a1

Moreover, here is the JSON version of the calibration file:

{ "data": { "scale": 0.02079247683286667, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 0) [Convolution]_output": { "scale": 0.05698573216795921, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 1) [Scale]_output": { "scale": 0.0015394059009850025, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 2) [Scale]_output": { "scale": 0.0054396819323301315, "min": 0, "max": 0, "offset": 0 }, "conv1": { "scale": 0.005434413440525532, "min": 0, "max": 0, "offset": 0 }, "pool1": { "scale": 0.005434413440525532, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 5) [Convolution]_output": { "scale": 0.0021851069759577513, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 6) [Scale]_output": { "scale": 0.009396709501743317, "min": 0, "max": 0, "offset": 0 }, "res2a_branch1": { "scale": 0.002834297250956297, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 8) [Convolution]_output": { "scale": 0.003489704569801688, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 9) [Scale]_output": { "scale": 0.011027474887669086, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 10) [Scale]_output": { "scale": 0.002694463124498725, "min": 0, "max": 0, "offset": 0 }, "res2a_branch2a": { "scale": 0.0024687808472663164, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 12) [Convolution]_output": { "scale": 0.0015595906879752874, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 13) [Scale]_output": { "scale": 0.02168632298707962, "min": 0, "max": 0, "offset": 0 }, "res2a_branch2b": { "scale": 0.0033985560294240713, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 15) [ElementWise]_output": { "scale": 0.0051072342321276665, "min": 0, "max": 0, "offset": 0 }, "res2a": { "scale": 0.0037192569579929113, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 17) [Convolution]_output": { "scale": 0.00409992691129446, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 18) [Scale]_output": { "scale": 0.025170622393488884, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 19) 
[Scale]_output": { "scale": 0.004736519884318113, "min": 0, "max": 0, "offset": 0 }, "res2b_branch2a": { "scale": 0.0035337188746780157, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 21) [Convolution]_output": { "scale": 0.002436860464513302, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 22) [Scale]_output": { "scale": 0.03902866318821907, "min": 0, "max": 0, "offset": 0 }, "res2b_branch2b": { "scale": 0.008631665259599686, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 24) [ElementWise]_output": { "scale": 0.0051842485554516315, "min": 0, "max": 0, "offset": 0 }, "res2b": { "scale": 0.0051842485554516315, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 26) [Convolution]_output": { "scale": 0.0022901915945112705, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 27) [Scale]_output": { "scale": 0.030062250792980194, "min": 0, "max": 0, "offset": 0 }, "res3a_branch1": { "scale": 0.0025364994071424007, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 29) [Convolution]_output": { "scale": 0.005177611950784922, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 30) [Scale]_output": { "scale": 0.024229783564805984, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 31) [Scale]_output": { "scale": 0.005125155206769705, "min": 0, "max": 0, "offset": 0 }, "res3a_branch2a": { "scale": 0.004110484849661589, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 33) [Convolution]_output": { "scale": 0.0017540848348289728, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 34) [Scale]_output": { "scale": 0.02233109436929226, "min": 0, "max": 0, "offset": 0 }, "res3a_branch2b": { "scale": 0.004981220234185457, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 36) [ElementWise]_output": { "scale": 0.0059297229163348675, "min": 0, "max": 0, "offset": 0 }, "res3a": { "scale": 0.0051670800894498825, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 38) [Convolution]_output": { "scale": 0.0035171876661479473, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 39) 
[Scale]_output": { "scale": 0.027735739946365356, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 40) [Scale]_output": { "scale": 0.003775609889999032, "min": 0, "max": 0, "offset": 0 }, "res3b_branch2a": { "scale": 0.003748880233615637, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 42) [Convolution]_output": { "scale": 0.0019172565080225468, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 43) [Scale]_output": { "scale": 0.03570879250764847, "min": 0, "max": 0, "offset": 0 }, "res3b_branch2b": { "scale": 0.005337087903171778, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 45) [ElementWise]_output": { "scale": 0.006659933365881443, "min": 0, "max": 0, "offset": 0 }, "res3b": { "scale": 0.009958893992006779, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 47) [Convolution]_output": { "scale": 0.001551719382405281, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 48) [Scale]_output": { "scale": 0.025477323681116104, "min": 0, "max": 0, "offset": 0 }, "res4a_branch1": { "scale": 0.0021682207006961107, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 50) [Convolution]_output": { "scale": 0.004367870278656483, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 51) [Scale]_output": { "scale": 0.028689291328191757, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 52) [Scale]_output": { "scale": 0.004476564470678568, "min": 0, "max": 0, "offset": 0 }, "res4a_branch2a": { "scale": 0.003728924784809351, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 54) [Convolution]_output": { "scale": 0.0020899700466543436, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 55) [Scale]_output": { "scale": 0.02403327077627182, "min": 0, "max": 0, "offset": 0 }, "res4a_branch2b": { "scale": 0.003791271010413766, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 57) [ElementWise]_output": { "scale": 0.004886534530669451, "min": 0, "max": 0, "offset": 0 }, "res4a": { "scale": 0.004745196085423231, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 59) 
[Convolution]_output": { "scale": 0.0034502942580729723, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 60) [Scale]_output": { "scale": 0.031731978058815, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 61) [Scale]_output": { "scale": 0.003939148969948292, "min": 0, "max": 0, "offset": 0 }, "res4b_branch2a": { "scale": 0.003751609707251191, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 63) [Convolution]_output": { "scale": 0.0016580861993134022, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 64) [Scale]_output": { "scale": 0.027469439432024956, "min": 0, "max": 0, "offset": 0 }, "res4b_branch2b": { "scale": 0.004721877630800009, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 66) [ElementWise]_output": { "scale": 0.004868104122579098, "min": 0, "max": 0, "offset": 0 }, "res4b": { "scale": 0.00483014527708292, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 68) [Convolution]_output": { "scale": 0.0007391751860268414, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 69) [Scale]_output": { "scale": 0.02496330440044403, "min": 0, "max": 0, "offset": 0 }, "res5a_branch1": { "scale": 0.0027031758800148964, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 71) [Convolution]_output": { "scale": 0.00249375868588686, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 72) [Scale]_output": { "scale": 0.025776952505111694, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 73) [Scale]_output": { "scale": 0.004439188167452812, "min": 0, "max": 0, "offset": 0 }, "res5a_branch2a": { "scale": 0.0027826179284602404, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 75) [Convolution]_output": { "scale": 0.001964826136827469, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 76) [Scale]_output": { "scale": 0.01768450066447258, "min": 0, "max": 0, "offset": 0 }, "res5a_branch2b": { "scale": 0.005327765829861164, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 78) [ElementWise]_output": { "scale": 0.0063602314330637455, "min": 0, "max": 0, "offset": 0 }, 
"res5a": { "scale": 0.004976359661668539, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 80) [Convolution]_output": { "scale": 0.0020002578385174274, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 81) [Scale]_output": { "scale": 0.023127030581235886, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 82) [Scale]_output": { "scale": 0.0029658195562660694, "min": 0, "max": 0, "offset": 0 }, "res5b_branch2a": { "scale": 0.0020175636745989323, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 84) [Convolution]_output": { "scale": 0.0009954172419384122, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer 85) [Scale]_output": { "scale": 0.05253200978040695, "min": 0, "max": 0, "offset": 0 }, "res5b_branch2b": { "scale": 0.06756636500358582, "min": 0, "max": 0, "offset": 0 }, "(Unnamed Layer* 87) [ElementWise]_output": { "scale": 0.06286400556564331, "min": 0, "max": 0, "offset": 0 }, "res5b": { "scale": 0.07916758954524994, "min": 0, "max": 0, "offset": 0 }, "pool5": { "scale": 0.07916758954524994, "min": 0, "max": 0, "offset": 0 }, "fc1000": { "scale": 0.04652160778641701, "min": 0, "max": 0, "offset": 0 }, "prob": { "scale": 0.0001615458313608542, "min": 0, "max": 0, "offset": 0 } }
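For what it's worth, the JSON above can be reproduced from the raw cache: each hex value in a TensorRT calibration cache is an IEEE-754 float32 scale in big-endian hex. A minimal pure-Python sketch (the two cache entries are copied from the dump above; the `cache_to_dict` helper name is made up for illustration):

```python
import json
import struct

# Two entries copied from the calibration cache dump above.
cache_text = """TRT-7000-EntropyCalibration2
data: 3caa54fc
conv1: 3bb2132a"""

def cache_to_dict(text):
    """Decode each 'tensor_name: hexvalue' line of a TensorRT
    calibration cache into a float scale (big-endian float32)."""
    scales = {}
    for line in text.splitlines()[1:]:  # skip the TRT-... header line
        name, _, hexval = line.rpartition(": ")
        scales[name] = struct.unpack(">f", bytes.fromhex(hexval))[0]
    return scales

scales = cache_to_dict(cache_text)
print(json.dumps(scales, indent=2))
# 'data' decodes to ~0.0207924768 and 'conv1' to ~0.0054344134,
# matching the "scale" fields in the JSON dump above.
```

This also implies the ordering in the JSON is simply the line order of the cache file.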

Can you kindly help?

Thank you very much.

rmccorm4 commented 4 years ago

I'm not sure if there is an automatic way to fix that (i.e., have the parser correctly name all layers), and I'm not sure why it happens. It's probably that there's no 1:1 mapping from a Caffe layer to a TensorRT op, so TensorRT breaks the layer out into several ops, or something like that.

You could try to compare the layers with your Caffe layers (programmatically), and set the names of each TensorRT INetworkDefinition layer manually.

Pseudocode:

for i in range(network.num_layers):
    layer = network.get_layer(i)
    layer.name = ...

Hassan313 commented 4 years ago

@rmccorm4 Hi Ryan,

Thank you very much for your reply. I will try the solution you mentioned. Thank you very much for your prompt replies. I really appreciate your help and support.

Hassan313 commented 4 years ago

You could try to compare the layers with your Caffe layers (programmatically), and set the names of each TensorRT INetworkDefinition layer manually.

The sequence of layers in the calibration file does not match the layer order in the Caffe deploy file. How can I figure out the ordering of the layers in the calibration file?

rmccorm4 commented 4 years ago

Hi @Hassan313 ,

Not too sure. You'll probably have to do a bit of investigation to see how the two files correlate. Maybe try creating a few caches and see whether the ordering is always the same. Try comparing against the layer names in the network object, and verify whether the Caffe layers are actually in order or random, etc.
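One way to start that investigation is to pull the tensor names out of the cache file and the layer names out of the deploy file, then diff the two orderings. A hypothetical pure-Python sketch (the inline cache and prototxt snippets are toy stand-ins; in practice, read them from the cache file and deploy.txt):

```python
import re

# Toy stand-in for the calibration cache contents.
cache_text = """TRT-7000-EntropyCalibration2
data: 3caa54fc
conv1: 3bb2132a
pool1: 3bb2132a"""

# Toy stand-in for a Caffe deploy.prototxt.
deploy_text = """
layer { name: "conv1" type: "Convolution" }
layer { name: "pool1" type: "Pooling" }
"""

# Tensor names, in cache order (skip the TRT-... header line).
cache_order = [line.rpartition(": ")[0] for line in cache_text.splitlines()[1:]]

# Layer names, in deploy.prototxt order.
deploy_order = re.findall(r'name:\s*"([^"]+)"', deploy_text)

print(cache_order)   # names in the order the cache lists them
print(deploy_order)  # names in the order the deploy file lists them

# Restrict the cache ordering to names that also appear in the deploy
# file, then compare -- if this is False, the two files disagree on order.
common = [n for n in cache_order if n in deploy_order]
print(common == deploy_order)
```

The cache also contains per-tensor entries (e.g. the unnamed intermediate outputs) that have no counterpart in the prototxt, which is why the comparison is restricted to common names.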