Generate .wts file for TensorRT

yyuzhongpv commented 3 years ago

Hi,

Thanks for your great work. It looks like gen_wts.py does not work with End-to-end.pth. Could you please provide more details on how to generate the .wts file? I would appreciate it very much if you could show the steps to generate the .engine file.

Regards

EXPmaster commented 3 years ago

I have fixed the gen_wts.py issue in the latest commit. You can refer to the Deployment part of the new README.md.

I also modified the code so that you can run main.cpp directly to obtain the .engine file. However, I haven't tested this procedure because I don't have access to a TensorRT device right now. Please contact me if you have any problems.

Thanks for your attention to our project!

yyuzhongpv commented 3 years ago

Thanks for your quick reply. I can generate yolop.wts now; however, I found another issue while running yolop.

Building engine...
Loading weights: yolop.wts
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
Building engine, please wait for a while...
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] Could not compute dimensions for (Unnamed Layer 214) [Convolution]_output, because the network is not valid.
[09/02/2021-14:42:33] [E] [TRT] Network validation failed.
Build engine successfully!
Segmentation fault (core dumped)

Any suggestions?

Regards.

ChenKQ commented 3 years ago

I ran into the same issue. Looking forward to a solution.

ChenKQ commented 3 years ago

I changed the implementation of build_engine, and now it can be used to generate the engine file. The detection results are correct.

ICudaEngine* build_engine(unsigned int maxBatchSize, IBuilder* builder, IBuilderConfig* config, DataType dt, std::string& wts_name) {
    INetworkDefinition* network = builder->createNetworkV2(0U);

    // Create input tensor of shape {3, INPUT_H, INPUT_W} with name INPUT_BLOB_NAME
    ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims3{ 3, INPUT_H, INPUT_W });
    assert(data);
    // auto shuffle = network->addShuffle(*data);
    // shuffle->setReshapeDimensions(Dims3{ 3, INPUT_H, INPUT_W });
    // shuffle->setFirstTranspose(Permutation{ 2, 0, 1 });

    std::map<std::string, Weights> weightMap = loadWeights(wts_name);
    Weights emptywts{ DataType::kFLOAT, nullptr, 0 };

    // yolov5 backbone
    // auto focus0 = focus(network, weightMap, *shuffle->getOutput(0), 3, 32, 3, "model.0");
    auto focus0 = focus(network, weightMap, *data, 3, 32, 3, "model.0");
    auto conv1 = convBlock(network, weightMap, *focus0->getOutput(0), 64, 3, 2, 1, "model.1");
    auto bottleneck_CSP2 = bottleneckCSP(network, weightMap, *conv1->getOutput(0), 64, 64, 1, true, 1, 0.5, "model.2");
    auto conv3 = convBlock(network, weightMap, *bottleneck_CSP2->getOutput(0), 128, 3, 2, 1, "model.3");
    auto bottleneck_csp4 = bottleneckCSP(network, weightMap, *conv3->getOutput(0), 128, 128, 3, true, 1, 0.5, "model.4");
    auto conv5 = convBlock(network, weightMap, *bottleneck_csp4->getOutput(0), 256, 3, 2, 1, "model.5");
    auto bottleneck_csp6 = bottleneckCSP(network, weightMap, *conv5->getOutput(0), 256, 256, 3, true, 1, 0.5, "model.6");
    auto conv7 = convBlock(network, weightMap, *bottleneck_csp6->getOutput(0), 512, 3, 2, 1, "model.7");
    auto spp8 = SPP(network, weightMap, *conv7->getOutput(0), 512, 512, 5, 9, 13, "model.8");

    // yolov5 head
    auto bottleneck_csp9 = bottleneckCSP(network, weightMap, *spp8->getOutput(0), 512, 512, 1, false, 1, 0.5, "model.9");
    auto conv10 = convBlock(network, weightMap, *bottleneck_csp9->getOutput(0), 256, 1, 1, 1, "model.10");

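    // Upsampling is implemented as a depthwise deconvolution: a 2x2 all-ones kernel with
    // stride 2 and one group per channel copies each input value into a 2x2 output block.
    // One buffer, sized for the widest layer (256 channels), is shared by all upsample layers.
    float *deval = reinterpret_cast<float*>(malloc(sizeof(float) * 256 * 2 * 2));
    for (int i = 0; i < 256 * 2 * 2; i++) {
        deval[i] = 1.0;
    }
    Weights deconvwts11{ DataType::kFLOAT, deval, 256 * 2 * 2 };
    IDeconvolutionLayer* deconv11 = network->addDeconvolutionNd(*conv10->getOutput(0), 256, DimsHW{ 2, 2 }, deconvwts11, emptywts);
    deconv11->setStrideNd(DimsHW{ 2, 2 });
    deconv11->setNbGroups(256);
    // Register the shared buffer in weightMap so it is freed exactly once at the end.
    weightMap["deconv11"] = deconvwts11;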

    ITensor* inputTensors12[] = { deconv11->getOutput(0), bottleneck_csp6->getOutput(0) };
    auto cat12 = network->addConcatenation(inputTensors12, 2);
    auto bottleneck_csp13 = bottleneckCSP(network, weightMap, *cat12->getOutput(0), 512, 256, 1, false, 1, 0.5, "model.13");
    auto conv14 = convBlock(network, weightMap, *bottleneck_csp13->getOutput(0), 128, 1, 1, 1, "model.14");

    Weights deconvwts15{ DataType::kFLOAT, deval, 128 * 2 * 2 };
    IDeconvolutionLayer* deconv15 = network->addDeconvolutionNd(*conv14->getOutput(0), 128, DimsHW{ 2, 2 }, deconvwts15, emptywts);
    deconv15->setStrideNd(DimsHW{ 2, 2 });
    deconv15->setNbGroups(128);

    ITensor* inputTensors16[] = { deconv15->getOutput(0), bottleneck_csp4->getOutput(0) };
    auto cat16 = network->addConcatenation(inputTensors16, 2);
    auto bottleneck_csp17 = bottleneckCSP(network, weightMap, *cat16->getOutput(0), 256, 128, 1, false, 1, 0.5, "model.17");
    IConvolutionLayer* det0 = network->addConvolutionNd(*bottleneck_csp17->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.0.weight"], weightMap["model.24.m.0.bias"]);

    auto conv18 = convBlock(network, weightMap, *bottleneck_csp17->getOutput(0), 128, 3, 2, 1, "model.18");
    ITensor* inputTensors19[] = { conv18->getOutput(0), conv14->getOutput(0) };
    auto cat19 = network->addConcatenation(inputTensors19, 2);
    auto bottleneck_csp20 = bottleneckCSP(network, weightMap, *cat19->getOutput(0), 256, 256, 1, false, 1, 0.5, "model.20");
    IConvolutionLayer* det1 = network->addConvolutionNd(*bottleneck_csp20->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.1.weight"], weightMap["model.24.m.1.bias"]);

    auto conv21 = convBlock(network, weightMap, *bottleneck_csp20->getOutput(0), 256, 3, 2, 1, "model.21");
    ITensor* inputTensors22[] = { conv21->getOutput(0), conv10->getOutput(0) };
    auto cat22 = network->addConcatenation(inputTensors22, 2);
    auto bottleneck_csp23 = bottleneckCSP(network, weightMap, *cat22->getOutput(0), 512, 512, 1, false, 1, 0.5, "model.23");
    IConvolutionLayer* det2 = network->addConvolutionNd(*bottleneck_csp23->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.2.weight"], weightMap["model.24.m.2.bias"]);

    auto detect24 = addYoLoLayer(network, weightMap, det0, det1, det2);
    detect24->getOutput(0)->setName(OUTPUT_DET_NAME);

    auto conv25 = convBlock(network, weightMap, *cat16->getOutput(0), 128, 3, 1, 1, "model.25");
    // upsample 26
    Weights deconvwts26{ DataType::kFLOAT, deval, 128 * 2 * 2 };
    IDeconvolutionLayer* deconv26 = network->addDeconvolutionNd(*conv25->getOutput(0), 128, DimsHW{ 2, 2 }, deconvwts26, emptywts);
    deconv26->setStrideNd(DimsHW{ 2, 2 });
    deconv26->setNbGroups(128);

    auto bottleneck_csp27 = bottleneckCSP(network, weightMap, *deconv26->getOutput(0), 128, 64, 1, false, 1, 0.5, "model.27");
    auto conv28 = convBlock(network, weightMap, *bottleneck_csp27->getOutput(0), 32, 3, 1, 1, "model.28");
    // upsample 29
    Weights deconvwts29{ DataType::kFLOAT, deval, 32 * 2 * 2 };
    IDeconvolutionLayer* deconv29 = network->addDeconvolutionNd(*conv28->getOutput(0), 32, DimsHW{ 2, 2 }, deconvwts29, emptywts);
    deconv29->setStrideNd(DimsHW{ 2, 2 });
    deconv29->setNbGroups(32);

    auto conv30 = convBlock(network, weightMap, *deconv29->getOutput(0), 16, 3, 1, 1, "model.30");
    auto bottleneck_csp31 = bottleneckCSP(network, weightMap, *conv30->getOutput(0), 16, 8, 1, false, 1, 0.5, "model.31");

    // upsample32
    Weights deconvwts32{ DataType::kFLOAT, deval, 8 * 2 * 2 };
    IDeconvolutionLayer* deconv32 = network->addDeconvolutionNd(*bottleneck_csp31->getOutput(0), 8, DimsHW{ 2, 2 }, deconvwts32, emptywts);
    deconv32->setStrideNd(DimsHW{ 2, 2 });
    deconv32->setNbGroups(8);

    auto conv33 = convBlock(network, weightMap, *deconv32->getOutput(0), 2, 3, 1, 1, "model.33");
    // segmentation output
    ISliceLayer *slicelayer = network->addSlice(*conv33->getOutput(0), Dims3{ 0, (Yolo::INPUT_H - Yolo::IMG_H) / 2, 0 }, Dims3{ 2, Yolo::IMG_H, Yolo::IMG_W }, Dims3{ 1, 1, 1 });
    auto segout = network->addTopK(*slicelayer->getOutput(0), TopKOperation::kMAX, 1, 1);
    segout->getOutput(1)->setName(OUTPUT_SEG_NAME);

    auto conv34 = convBlock(network, weightMap, *cat16->getOutput(0), 128, 3, 1, 1, "model.34");

    // upsample35
    Weights deconvwts35{ DataType::kFLOAT, deval, 128 * 2 * 2 };
    IDeconvolutionLayer* deconv35 = network->addDeconvolutionNd(*conv34->getOutput(0), 128, DimsHW{ 2, 2 }, deconvwts35, emptywts);
    deconv35->setStrideNd(DimsHW{ 2, 2 });
    deconv35->setNbGroups(128);

    auto bottleneck_csp36 = bottleneckCSP(network, weightMap, *deconv35->getOutput(0), 128, 64, 1, false, 1, 0.5, "model.36");
    auto conv37 = convBlock(network, weightMap, *bottleneck_csp36->getOutput(0), 32, 3, 1, 1, "model.37");

    // upsample38
    Weights deconvwts38{ DataType::kFLOAT, deval, 32 * 2 * 2 };
    IDeconvolutionLayer* deconv38 = network->addDeconvolutionNd(*conv37->getOutput(0), 32, DimsHW{ 2, 2 }, deconvwts38, emptywts);
    deconv38->setStrideNd(DimsHW{ 2, 2 });
    deconv38->setNbGroups(32);

    auto conv39 = convBlock(network, weightMap, *deconv38->getOutput(0), 16, 3, 1, 1, "model.39");
    auto bottleneck_csp40 = bottleneckCSP(network, weightMap, *conv39->getOutput(0), 16, 8, 1, false, 1, 0.5, "model.40");

    // upsample41
    Weights deconvwts41{ DataType::kFLOAT, deval, 8 * 2 * 2 };
    IDeconvolutionLayer* deconv41 = network->addDeconvolutionNd(*bottleneck_csp40->getOutput(0), 8, DimsHW{ 2, 2 }, deconvwts41, emptywts);
    deconv41->setStrideNd(DimsHW{ 2, 2 });
    deconv41->setNbGroups(8);

    auto conv42 = convBlock(network, weightMap, *deconv41->getOutput(0), 2, 3, 1, 1, "model.42");
    // lane-det output
    ISliceLayer *laneSlice = network->addSlice(*conv42->getOutput(0), Dims3{ 0, (Yolo::INPUT_H - Yolo::IMG_H) / 2, 0 }, Dims3{ 2, Yolo::IMG_H, Yolo::IMG_W }, Dims3{ 1, 1, 1 });
    auto laneout = network->addTopK(*laneSlice->getOutput(0), TopKOperation::kMAX, 1, 1);
    laneout->getOutput(1)->setName(OUTPUT_LANE_NAME);

    // // std::cout << std::to_string(slicelayer->getOutput(0)->getDimensions().d[0]) << std::endl;
    // // ISliceLayer *tmp1 = network->addSlice(*slicelayer->getOutput(0), Dims3{ 0, 0, 0 }, Dims3{ 1, (Yolo::INPUT_H - 2 * Yolo::PAD_H), Yolo::INPUT_W }, Dims3{ 1, 1, 1 });
    // // ISliceLayer *tmp2 = network->addSlice(*slicelayer->getOutput(0), Dims3{ 1, 0, 0 }, Dims3{ 1, (Yolo::INPUT_H - 2 * Yolo::PAD_H), Yolo::INPUT_W }, Dims3{ 1, 1, 1 });
    // // auto segout = network->addElementWise(*tmp1->getOutput(0), *tmp2->getOutput(0), ElementWiseOperation::kLESS);
    // std::cout << std::to_string(conv44->getOutput(0)->getDimensions().d[0]) << std::endl;
    // std::cout << std::to_string(conv44->getOutput(0)->getDimensions().d[1]) << std::endl;
    // std::cout << std::to_string(conv44->getOutput(0)->getDimensions().d[2]) << std::endl;
    // assert(false);
    // // segout->setOutputType(1, DataType::kFLOAT);
    // segout->getOutput(1)->setName(OUTPUT_SEG_NAME);
    // // std::cout << std::to_string(segout->getOutput(1)->getDimensions().d[0]) << std::endl;

    // detection output
    network->markOutput(*detect24->getOutput(0));
    // segmentation output
    network->markOutput(*segout->getOutput(1));
    // lane output
    network->markOutput(*laneout->getOutput(1));

    // assert(false);

    // Build engine
    builder->setMaxBatchSize(maxBatchSize);
    config->setMaxWorkspaceSize(2L * (1L << 30));  // 2GB
#if defined(USE_FP16)
    config->setFlag(BuilderFlag::kFP16);
// #elif defined(USE_INT8)
//     std::cout << "Your platform support int8: " << (builder->platformHasFastInt8() ? "true" : "false") << std::endl;
//     assert(builder->platformHasFastInt8());
//     config->setFlag(BuilderFlag::kINT8);
//     Int8EntropyCalibrator2* calibrator = new Int8EntropyCalibrator2(1, INPUT_W, INPUT_H, "./coco_calib/", "int8calib.table", INPUT_BLOB_NAME);
//     config->setInt8Calibrator(calibrator);
#endif

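    std::cout << "Building engine, please wait for a while..." << std::endl;
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
    // buildEngineWithConfig returns nullptr when the network fails validation,
    // so only report success when an engine was actually produced.
    if (engine == nullptr) {
        std::cerr << "Failed to build engine" << std::endl;
    } else {
        std::cout << "Build engine successfully!" << std::endl;
    }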

    // Don't need the network any more
    network->destroy();

    // Release host memory
    for (auto& mem : weightMap)
    {
        free((void*)(mem.second.values));
    }

    return engine;
}
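
The upsample layers in the build_engine above all use the same trick: a per-channel deconvolution with a 2x2 kernel of ones and stride 2. The sketch below illustrates, on a single channel and without TensorRT, why that should behave like 2x nearest-neighbour upsampling. The names Image, deconvOnes2x2 and nearestUpsample2x are made up for this illustration and do not appear in the repository.

```cpp
#include <cstddef>
#include <vector>

// Single-channel image stored as rows of floats (hypothetical helper type).
using Image = std::vector<std::vector<float>>;

// Transposed convolution of one channel with a 2x2 all-ones kernel and stride 2.
// Each input pixel (i, j) is scattered into output positions (2i + ki, 2j + kj).
Image deconvOnes2x2(const Image& in) {
    const std::size_t h = in.size();
    const std::size_t w = h ? in[0].size() : 0;
    Image out(2 * h, std::vector<float>(2 * w, 0.0f));
    for (std::size_t i = 0; i < h; ++i)
        for (std::size_t j = 0; j < w; ++j)
            for (std::size_t ki = 0; ki < 2; ++ki)
                for (std::size_t kj = 0; kj < 2; ++kj)
                    out[2 * i + ki][2 * j + kj] += in[i][j] * 1.0f;  // kernel weight is 1
    return out;
}

// Reference 2x nearest-neighbour upsampling: every output pixel copies its source pixel.
Image nearestUpsample2x(const Image& in) {
    const std::size_t h = in.size();
    const std::size_t w = h ? in[0].size() : 0;
    Image out(2 * h, std::vector<float>(2 * w, 0.0f));
    for (std::size_t y = 0; y < 2 * h; ++y)
        for (std::size_t x = 0; x < 2 * w; ++x)
            out[y][x] = in[y / 2][x / 2];
    return out;
}
```

Because the stride equals the kernel size, the 2x2 blocks never overlap, so each output pixel receives exactly one contribution and the two functions return identical images.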

healthy8701 commented 3 years ago

Change CLASS_NUM in yololayer.h:

static constexpr int CLASS_NUM = 1;
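
The numbers in the error log line up with this suggestion. The following sketch redoes the arithmetic, assuming the detection convolutions have 3 * (CLASS_NUM + 5) output channels as in the build_engine posted above; the helper names are hypothetical. With 128 input channels and a 1x1 kernel, 54 output channels (6912 weights) is what CLASS_NUM = 13 would produce, while the 2304 weights found in yolop.wts correspond to CLASS_NUM = 1.

```cpp
// Output channels of one detection conv: 3 * (classes + 5), as in build_engine.
constexpr int detOutputChannels(int classNum) {
    return 3 * (classNum + 5);
}

// Weight count reported by TensorRT in the log: in * kh * kw * out / groups.
constexpr int convWeightCount(int inCh, int kh, int kw, int outCh, int groups) {
    return inCh * kh * kw * outCh / groups;
}

// Class count implied by the size of a 1x1 detection conv weight tensor.
constexpr int impliedClassNum(int weightCount, int inCh) {
    return weightCount / inCh / 3 - 5;
}

// 6912 is what the network definition asked for, 2304 is what the .wts file held.
static_assert(convWeightCount(128, 1, 1, detOutputChannels(13), 1) == 6912, "expected count");
static_assert(convWeightCount(128, 1, 1, detOutputChannels(1), 1) == 2304, "count in file");
static_assert(impliedClassNum(2304, 128) == 1, "weights were exported for one class");
```

The static_assert lines are checked at compile time, so the file compiling at all confirms the arithmetic.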

rsj007 commented 3 years ago

Hello! I also ran into the issue you mentioned. Have you solved it?

SikandAlex commented 3 years ago

In order to get this working I had to perform the following steps:

1. Compile OpenCV from source with CUDA support, including the OpenCV contrib modules.
2. Change "coda" to "cuda" when you encounter the error related to that.
3. Take the int8 calibrator.h from the tensorrtx repository, since it is missing from this repo, and put it in the deploy folder.

Then I was able to build ./yolo

I am running it now, but it has been stuck on building the engine and has not printed any error yet. I am about to kill the process and then change CLASS_NUM as @healthy8701 suggests.

SikandAlex commented 3 years ago

After implementing @healthy8701's suggestion, I encountered this error:

[E] [TRT] Parameter check failed at: ../builder/Network.cpp::addScale::494, condition: shift.count > 0 ? (shift.values != nullptr) : (shift.values == nullptr)

I then replaced the build_engine function with the code that @ChenKQ kindly provided. It is running again now, but I am not sure whether it will successfully build the engine. I will post updates.
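
The thread does not establish what caused the addScale check to fail. One thing that may be worth ruling out: the posted build_engine reads weights with weightMap["..."], and std::map::operator[] silently inserts an empty entry when the key is missing, so a misspelled or absent layer name never produces an error at the lookup. The sketch below shows a lookup that reports a missing key instead. HostWeights is a stand-in so the example compiles without TensorRT, and findWeights is a hypothetical helper, not part of the repository.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Stand-in for a host-side weights record (pointer + element count).
// This is NOT the TensorRT type; it only mirrors the two fields used here.
struct HostWeights {
    const void* values;
    std::int64_t count;
};

// Looks a tensor up without modifying the map.
// Returns nullptr when the name is missing, instead of inserting an empty entry.
const HostWeights* findWeights(const std::map<std::string, HostWeights>& weights,
                               const std::string& name) {
    const auto it = weights.find(name);
    return it == weights.end() ? nullptr : &it->second;
}
```

A caller can then print the missing name and stop before the network definition is handed to the builder.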

SikandAlex commented 3 years ago

@ChenKQ can you please tell me how long it took to build the engine file? Thank you so much.

rsj007 commented 3 years ago

@ChenKQ @SikandAlex Hello, which platform did you use to convert the .wts file to a TensorRT engine: a Jetson device or a PC? I want to deploy the project on a TX2. Can I do everything on the TX2? Thanks!

SikandAlex commented 3 years ago

I am simply trying to first convert the model to a TensorRT engine on my RTX 2080 Ti before I try to build an engine for the AGX Xavier.

Unfortunately I left ./yolop running after successfully compiling it and it did not finish execution after 12 hours:

Building engine...
Loading weights: yolop.wts

In addition, Ctrl-C and keyboard input did not interrupt execution; I had to forcibly kill the process.

The process was using 291 MB of GPU memory and 2% utilization.

Normally, following the directions in the tensorrtx repository https://github.com/wang-xinyu/tensorrtx allows me to build an engine for a normal yolov5 model in only a few minutes.

Something is wrong with one of the files, but I am not sure what to do. I followed @ChenKQ's directions and everything looks good.

SikandAlex commented 3 years ago

This line from @ChenKQ's build_engine is never printed to my screen:

std::cout << "Building engine, please wait for a while..." << std::endl;

I wonder if it cannot load the weights properly. This is very hard to debug because there is no error output.
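
Since the hang produces no output, one way to narrow it down is to check the .wts file on its own before starting a build. The following is a minimal sketch, assuming the file follows the tensorrtx text convention (first line holds the number of tensors, then one line per tensor with its name, its value count, and that many hex-encoded floats). WtsEntry and readWtsIndex are names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Name and declared value count of one tensor in a .wts file (hypothetical type).
struct WtsEntry {
    std::string name;
    std::uint32_t count;
};

// Reads tensor names and counts from a .wts stream and checks that each line
// carries as many hex values as it declares. Throws std::runtime_error on mismatch.
std::vector<WtsEntry> readWtsIndex(std::istream& in) {
    std::string line;
    if (!std::getline(in, line)) throw std::runtime_error("empty .wts stream");
    const std::size_t expected = std::stoul(line);  // tensor count from the first line

    std::vector<WtsEntry> entries;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        std::istringstream fields(line);
        WtsEntry entry;
        if (!(fields >> entry.name >> entry.count))
            throw std::runtime_error("malformed line in .wts stream");
        std::uint32_t seen = 0;
        std::string hexValue;
        while (fields >> hexValue) ++seen;  // count the payload without decoding it
        if (seen != entry.count)
            throw std::runtime_error("value count mismatch for " + entry.name);
        entries.push_back(entry);
    }
    if (entries.size() != expected)
        throw std::runtime_error("tensor count does not match the header");
    return entries;
}
```

Running this on a file opened with std::ifstream and printing the entries shows, for instance, how many values model.24.m.0.weight holds, which can be compared against what build_engine expects without waiting for TensorRT.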

@Riser6 @EXPmaster can you please assist so we can reproduce your FPS on an embedded/mobile GPU? I think this is the most important issue for the repo.

rsj007 commented 3 years ago

Hello, have you solved this problem (the kernel weight count mismatch reported above)? Can the engine be built without errors?

SikandAlex commented 3 years ago

No, I was not able to. I have opted for a different architecture since the authors were not helpful.

ausk commented 2 years ago

Fixed by: https://github.com/hustvl/YOLOP/pull/162 ; https://github.com/ausk/YOLOP

hustvl / YOLOP

Generate .wts file for TensorRT #12