NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

How to get the dynamic binding dimension of an ITensor when creating an INetworkDefinition. #3426

Open osirisQdt opened 11 months ago

osirisQdt commented 11 months ago

Hi everyone,

I'm trying to build a network from scratch, customized from the Faster-RCNN architecture. The raw input image can have a dynamic shape, so after some preprocessing steps (resize + padding + normalization) the network input also has a dynamic shape. So I add the input with:

nvinfer1::ITensor* input = network->addInput("image", nvinfer1::DataType::kFLOAT, nvinfer1::Dims4(-1, 3, -1, -1));

But my problem is: in some later layers/blocks I need to map the ROIs back to the original image's coordinates (the bounding-box decoding step), and for that I need parameters from the preprocessing step (like scale = (inputNetworkShape + paddingShape) / rawNetworkShape, ...), and these parameters are also dynamic. But while building the INetworkDefinition I cannot call getBindingDimensions, because no IExecutionContext exists yet. So I would like to ask for advice: how can I solve this problem?

Thanks for reading my post. I'd be glad to hear from you.

zerollzeng commented 11 months ago

Use IShapeLayer?

zerollzeng commented 11 months ago

TRT's workflow is:

  1. Create the network with dynamic axes.
  2. Specify a dynamic-shapes optimization profile.
  3. Build the engine.

osirisQdt commented 11 months ago

Hi @zerollzeng, thanks for replying and sorry for the late response. Could you explain in more detail with some pseudo code? I'm struggling to create a network with dynamic axes (e.g. Resize, Slice, ...), so it would be great if you could point me to samples that use a dynamic batch size and dynamic input shapes, and samples of dynamic shapes in plugins. Thank you.

osirisQdt commented 11 months ago

Hi @zerollzeng, I tried an example with IShapeLayer as you suggested:

        std::vector<int> c1Vec{1, 1, 3, 2};
        ITensor* input = network->addInput("image", DataType::kFLOAT, Dims4(-1, 1, -1, -1));
        IShapeLayer* shapeLayer = network->addShape(*input);
        Dims dim;
        dim.nbDims = 1;
        dim.d[0] = 4;
        IConstantLayer* c1 = network->addConstant(dim, Weights{DataType::kINT32, c1Vec.data(), 4});      
        IElementWiseLayer* e1 = network->addElementWise(*c1->getOutput(0), *shapeLayer->getOutput(0), ElementWiseOperation::kDIV);
        e1->setOutputType(0, DataType::kFLOAT);

Here I'm trying to compute the scale (stored in e1) by adding an IConstantLayer followed by an IElementWiseLayer with kDIV. But e1 comes out as kINT32, because the division is between two kINT32 tensors, so is there any way to obtain a float result? My pipeline is: (inputH, inputW) -> scale = min(inputH / A, inputW / B) -> resize(round(scale * A), round(scale * B)) -> ... -> DecodePlugin (needs the scale param).

osirisQdt commented 11 months ago

I managed to build a simple resize, but got a strange error. I have a resize function:

    ITensor* resize(INetworkDefinition* network, ITensor* tensor, DimsHW targetSize, int interpolation, bool channelFirst){
        ITensor* shape = network->addShape(*tensor)->getOutput(0);
        // Target shape = shape * alpha + beta: keep the dims where alpha == 1,
        // overwrite the dims where alpha == 0 with the values in beta.
        std::vector<int> alphaCoeff = channelFirst ? std::vector<int>{1, 1, 0, 0} : std::vector<int>{1, 0, 0, 1};
        std::vector<int> betaCoeff  = channelFirst ? std::vector<int>{0, 0, targetSize.h(), targetSize.w()} : std::vector<int>{0, targetSize.h(), targetSize.w(), 0};
        // NOTE: Weights hold a raw pointer; TensorRT does not copy the data, so these
        // vectors must stay alive until the engine has been built.
        IConstantLayer* alpha = network->addConstant(Dims{1, {static_cast<int>(alphaCoeff.size())}}, Weights{DataType::kINT32, alphaCoeff.data(), static_cast<int64_t>(alphaCoeff.size())});
        IConstantLayer* beta  = network->addConstant(Dims{1, {static_cast<int>(betaCoeff.size())}}, Weights{DataType::kINT32, betaCoeff.data(), static_cast<int64_t>(betaCoeff.size())});
        shape = network->addElementWise(*shape, *alpha->getOutput(0), ElementWiseOperation::kPROD)->getOutput(0);
        shape = network->addElementWise(*shape, *beta->getOutput(0), ElementWiseOperation::kSUM)->getOutput(0);

        IResizeLayer* rsz = network->addResize(*tensor);
        rsz->setResizeMode(static_cast<ResizeMode>(interpolation));
        rsz->setInput(1, *shape);
        return rsz->getOutput(0);
    }

And a buildModel function that calls it:

        ITensor* input = network->addInput("image", DataType::kFLOAT, Dims4(-1, 3, -1, -1));
        ITensor* rszLayer = this->resize(network, input, DimsHW{1, 2}, 0, true);
        network->markOutput(*rszLayer);

But when I build this network, the error occurs: Error Code 9: Internal Error ((Unnamed Layer* 5) [Resize]_output: dimension 0 never exceeds -776658320). However, if I put the body of resize directly into buildModel, no error occurs and it works fine. What am I missing?

osirisQdt commented 11 months ago

@ttyio @zerollzeng could you help me solve this error, please? I want to wrap some dynamic-shape operators in functions because I use them a lot in my custom network.

ttyio commented 11 months ago

@osirisQdt, it seems your complex symbolic expression triggered a bug in TensorRT. Could you try using ISliceLayer to get the channel dim, and IConcatenationLayer to combine the channel with the other constant values -1, 1 and 2? This is how I usually see PyTorch export ONNX with shape tensors. Thanks!

osirisQdt2810 commented 10 months ago

> @osirisQdt, it seems your complex symbolic expression triggered a bug in TensorRT. Could you try using ISliceLayer to get the channel dim, and IConcatenationLayer to combine the channel with the other constant values -1, 1 and 2? This is how I usually see PyTorch export ONNX with shape tensors. Thanks!

Hi @ttyio, sorry for the late response; I lost access to my email, so I had to create a new account. But I don't quite understand what you mean. My goal is the resize function, so I'm not clear on how ISliceLayer and IConcatenationLayer fit here. Could you explain in more detail? Thanks.

osirisQdt2810 commented 10 months ago

> @osirisQdt, it seems your complex symbolic expression triggered a bug in TensorRT. Could you try using ISliceLayer to get the channel dim, and IConcatenationLayer to combine the channel with the other constant values -1, 1 and 2? This is how I usually see PyTorch export ONNX with shape tensors. Thanks!

Also, I have only just started learning to create networks from scratch, so I don't understand your comment "This is how I usually see PyTorch export ONNX with shape tensors". I thought building with the TensorRT API is different from ONNX? And how can looking at ONNX exports tell you which expressions/operations to use? Thanks.

ttyio commented 10 months ago

@osirisQdt2810 pseudo code like this:

  channel_dim = network.add_slice(input_shape, [1,], [1,], [1,])
  batch_dim = network.add_constant((1,), trt.Weights(np.ascontiguousarray([-1], dtype=np.int32)))
  spatial_dim = network.add_constant((2,), trt.Weights(np.ascontiguousarray([1, 2], dtype=np.int32)))
  full_dim = network.add_concat([batch_dim.get_output(0), channel_dim.get_output(0), spatial_dim.get_output(0)])
  resize_layer.set_input(1, full_dim.get_output(0))

Thanks!

osirisQdt commented 10 months ago

Thank you so much @ttyio. I'll try it that way.

osirisQdt2810 commented 10 months ago

> @osirisQdt2810 pseudo code like this:
>
>     channel_dim = network.add_slice(input_shape, [1,], [1,], [1,])
>     batch_dim = network.add_constant((1,), trt.Weights(np.ascontiguousarray([-1], dtype=np.int32)))
>     spatial_dim = network.add_constant((2,), trt.Weights(np.ascontiguousarray([1, 2], dtype=np.int32)))
>     full_dim = network.add_concat([batch_dim.get_output(0), channel_dim.get_output(0), spatial_dim.get_output(0)])
>     resize_layer.set_input(1, full_dim.get_output(0))
>
> Thanks!

This doesn't seem to work, @ttyio. I still get the error Error Code 9: Internal Error ((Unnamed Layer* 5) [Resize]_output: dimension 0 never exceeds .... But I don't think the problem is a complex expression. When I write a very simple slice function:

    ITensor* DynamicInputShapeSample::slice(INetworkDefinition* network, ITensor* tensor){
        auto slice = network->addSlice(*tensor, Dims4{1, 1, 1, 1}, Dims4{1, 1, 1, 1}, Dims4{1, 1, 1, 1});
        auto shape = network->addShape(*tensor)->getOutput(0);

        // NOTE: Weights store a raw pointer; startCoeff0 must outlive the engine build.
        std::vector<int32_t> startCoeff0 {0, 0, 0, 0};
        auto start = network->addConstant(Dims{1, 4}, Weights{DataType::kINT32, startCoeff0.data(), static_cast<int64_t>(startCoeff0.size())})->getOutput(0);
        auto startShape = network->addElementWise(*shape, *start, ElementWiseOperation::kPROD)->getOutput(0);  // shape * 0 = {0, 0, 0, 0}
        slice->setInput(1, *startShape);
        return slice->getOutput(0);
    }

Calling this function as ITensor* out = slice(network, input), an error occurs: Internal Error ((Unnamed Layer* 0) [Slice]: ISliceLayer has out of bounds access on axis 1). But again, if I put the whole body of this function into the main network-building code, it works fine with no problem. I find this really weird, and very annoying, because these simple dynamic-shape helpers get called many times. And I don't understand why the Python API works fine when wrapped in a def, while the C++ version does not.

osirisQdt2810 commented 10 months ago

Could you review my complete engine-building pipeline, @ttyio? Is there a mistake somewhere?

        IBuilder* builder = createInferBuilder(mLogger);
        IBuilderConfig* config = builder->createBuilderConfig();
        INetworkDefinition* network = builder->createNetworkV2(1U);
        // ----------------------------- 2. Basic info for inputs, model structure and outputs -----------------------------
        ITensor* input = network->addInput("image", DataType::kFLOAT, Dims4(-1, 3, -1, -1));
        ITensor* output = slice(network, input);
        network->markOutput(*output);
        config->setMaxWorkspaceSize(1 << 28);
        auto profile = builder->createOptimizationProfile();
        profile->setDimensions(input->getName(), OptProfileSelector::kMIN, Dims4(1, 3, 4, 1));
        profile->setDimensions(input->getName(), OptProfileSelector::kOPT, Dims4(1, 3, 5, 1));
        profile->setDimensions(input->getName(), OptProfileSelector::kMAX, Dims4(10, 3, 20, 20));
        config->addOptimizationProfile(profile);
        config->setFlag(BuilderFlag::kFP16);

        ILOG_EVENT_DEBUG("********************************************************************************************************")
        ILOG_EVENT_DEBUG("******************************************** BUILD ENGINE ***********************************************")
        ILOG_EVENT_DEBUG("********************************************************************************************************")
        ICudaEngine* mEngine = builder->buildEngineWithConfig(*network, *config);
        if(mEngine == nullptr){
            printf("Build engine failed.\n");
            return false;
        }