cyrusbehr / tensorrt-cpp-api

TensorRT C++ API Tutorial
MIT License

Segmentation fault (core dumped) #5

Closed: lyzKF closed this issue 1 year ago

lyzKF commented 2 years ago

@cyrusbehr Hi, I followed your code and ran the demo, but "Segmentation fault (core dumped)" occurs intermittently. Here is the output from several consecutive runs:

root@f5119fd980c7:/shopeeMT/build# CUDA_VISIBLE_DEVICES=1 ./driver
Searching for engine file with name: trt.engine.fp16
Engine found, not regenerating...
Success! Average time per inference: 0.58 ms, for batch size of: 4
root@f5119fd980c7:/shopeeMT/build# CUDA_VISIBLE_DEVICES=1 ./driver
Searching for engine file with name: trt.engine.fp16
Engine found, not regenerating...
Success! Average time per inference: 0.5475 ms, for batch size of: 4
root@f5119fd980c7:/shopeeMT/build# CUDA_VISIBLE_DEVICES=1 ./driver
Searching for engine file with name: trt.engine.fp16
Engine found, not regenerating...
Segmentation fault (core dumped)
root@f5119fd980c7:/shopeeMT/build# CUDA_VISIBLE_DEVICES=1 ./driver
Searching for engine file with name: trt.engine.fp16
Engine found, not regenerating...
Success! Average time per inference: 0.55 ms, for batch size of: 4
root@f5119fd980c7:/shopeeMT/build# CUDA_VISIBLE_DEVICES=1 ./driver
Searching for engine file with name: trt.engine.fp16
Engine found, not regenerating...
Success! Average time per inference: 0.5525 ms, for batch size of: 4

I debugged the program, and the following code in "engine.cpp" appears to cause the crash, but I don't know why:

    for (size_t batch = 0; batch < inputFaceChips.size(); ++batch) {
        auto image = inputFaceChips[batch];

        // Preprocess code
        image.convertTo(image, CV_32FC3, 1.f / 255.f);
        cv::subtract(image, cv::Scalar(0.5f, 0.5f, 0.5f), image, cv::noArray(), -1);
        cv::divide(image, cv::Scalar(0.5f, 0.5f, 0.5f), image, 1, -1);

        // NHWC to NCHW conversion
        // NHWC: For each pixel, its 3 colors are stored together in RGB order.
        // NCHW: For a 3 channel image, say RGB, pixels of the R channel are stored first, then the G channel and finally the B channel.
        int offset = dims.d[1] * dims.d[2] * dims.d[3] * batch;
        int r = 0 , g = 0, b = 0;
        for (int i = 0; i < dims.d[1] * dims.d[2] * dims.d[3]; ++i) {
            if (i % 3 == 0) {
                hostDataBuffer[offset + r++] = *(reinterpret_cast<float*>(image.data) + i);
            } else if (i % 3 == 1) {
                hostDataBuffer[offset + g++ + dims.d[2]*dims.d[3]] = *(reinterpret_cast<float*>(image.data) + i);
            } else {
                hostDataBuffer[offset + b++ + dims.d[2]*dims.d[3]*2] = *(reinterpret_cast<float*>(image.data) + i);
            }
        }
    }

Looking forward to your reply, thank you.

cyrusbehr commented 2 years ago

Can you please provide me with your ONNX model so that I can try to reproduce the above?

lyzKF commented 2 years ago

@cyrusbehr Yes, I exported the ONNX model with this script:

#import onnx
import torch
import torchvision
#import netron
torch.set_default_tensor_type('torch.FloatTensor')
torch.set_default_tensor_type('torch.cuda.FloatTensor')

net = torchvision.models.resnet18(pretrained=True).cuda()
net.eval()

export_onnx_file = "./resnet50.onnx"
x=torch.onnx.export(net,  
                torch.randn(1, 3, 224, 224, device='cuda'), 
                export_onnx_file,  
                verbose=False,      
                input_names=["input"],
                output_names=["output"], 
                opset_version=10,   
                do_constant_folding=True, 
                )

export_onnx_file = "./resnet50_dynamic.onnx"
x=torch.onnx.export(net,  
                torch.randn(1, 3, 224, 224, device='cuda'), 
                export_onnx_file, 
                verbose=False,     
                input_names=["input"],
                output_names=["output"], 
                opset_version=10,
                do_constant_folding=True,
                dynamic_axes={"input":{0: "batch_size"}, "output":{0: "batch_size"},} 
                )

LeeYimingAI commented 1 year ago

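For me, adding a resize call that resizes the image to your model's input shape fixed the problem: cv::resize(image, image, cv::Size(image_size, image_size), 0, 0, cv::INTER_LINEAR); (Note that the interpolation flag is the sixth argument of cv::resize; the fourth and fifth are the scale factors fx and fy.)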
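To make explicit what resizing to the model's input shape does to the buffer, here is a minimal sketch in plain C++ that does not depend on OpenCV. The function `resizeNearest` is a hypothetical stand-in, not part of OpenCV or tensorrt-cpp-api, and it uses nearest-neighbour sampling rather than the bilinear interpolation that `cv::INTER_LINEAR` selects. The point is only that the output buffer ends up holding exactly dstH * dstW * channels floats, which is what the conversion loop expects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::resize (assumption: nearest-neighbour sampling,
// interleaved HWC float layout). Returns a buffer of dstH * dstW * channels floats.
std::vector<float> resizeNearest(const std::vector<float>& src,
                                 std::size_t srcH, std::size_t srcW,
                                 std::size_t dstH, std::size_t dstW,
                                 std::size_t channels) {
    // The source buffer must match the dimensions the caller claims.
    assert(src.size() == srcH * srcW * channels);

    std::vector<float> dst(dstH * dstW * channels);
    for (std::size_t y = 0; y < dstH; ++y) {
        const std::size_t sy = y * srcH / dstH;  // nearest source row
        for (std::size_t x = 0; x < dstW; ++x) {
            const std::size_t sx = x * srcW / dstW;  // nearest source column
            for (std::size_t c = 0; c < channels; ++c) {
                dst[(y * dstW + x) * channels + c] =
                    src[(sy * srcW + sx) * channels + c];
            }
        }
    }
    return dst;
}
```

In a real pipeline you would keep using cv::resize; this sketch only illustrates that the resized image has the element count the model input requires.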

cyrusbehr commented 1 year ago

Hi, sorry it's taken me so long to get back to you. As @LeeYimingAI mentioned, the issue seems to be that you are using a model with an input size of (1, 3, 224, 224), while my code expects a model with an input size of (1, 3, 112, 112). The test images I provide are 112x112, so if you use them with a model expecting a different input size, the program crashes. I have now provided a sample model for you to try out, made the code more robust so that it warns about any size mismatch, and indicated where in the code you would resize your image if required.
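The arithmetic behind the crash, assuming the engine reports an input of 3 x 224 x 224: the conversion loop reads 3 * 224 * 224 = 150528 floats from image.data, but a 112 x 112 three-channel image holds only 3 * 112 * 112 = 37632 floats. The loop therefore reads roughly four times past the end of the image buffer. Reading out of bounds is undefined behavior, which is consistent with the crash appearing only on some runs. The sketch below shows one way to guard the conversion with a size check. The helper `hwcToChw` is hypothetical (it is not the function in engine.cpp) and works on a plain std::vector rather than a cv::Mat.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of tensorrt-cpp-api): copies one interleaved
// HWC float image into a planar CHW destination. It throws instead of reading
// out of bounds when the image size does not match the model input size.
void hwcToChw(const std::vector<float>& hwc, std::size_t channels,
              std::size_t height, std::size_t width, float* chw) {
    assert(chw != nullptr);

    const std::size_t expected = channels * height * width;
    if (hwc.size() != expected) {
        throw std::invalid_argument(
            "image holds " + std::to_string(hwc.size()) +
            " floats, but the model input needs " + std::to_string(expected));
    }

    const std::size_t plane = height * width;  // elements per channel plane
    for (std::size_t pixel = 0; pixel < plane; ++pixel) {
        for (std::size_t c = 0; c < channels; ++c) {
            // Source is pixel-major (HWC); destination is channel-major (CHW).
            chw[c * plane + pixel] = hwc[pixel * channels + c];
        }
    }
}
```

For a batch, the destination pointer would be offset by channels * height * width for each image, mirroring the offset variable in the original loop. The caller is still responsible for making the destination buffer large enough.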