Open yyuzhongpv opened 3 years ago
Hi,
Thanks for your great work. It looks like gen_wts.py does not work with End-to-end.pth. Could you please provide more details on how to generate the .wts file? I would appreciate it very much if you could show the steps to generate the .engine file.
Regards
I have fixed the gen_wts.py issue in the latest commit. You could refer to the Deployment part of the new README.md.
I also modified the code so that you can run main.cpp directly to obtain the .engine file. However, I haven't tested this procedure because I don't have access to a TensorRT device right now. Please contact me if you run into any problems.
Thanks for your attention to our project!
Thanks for your quick reply. I can generate yolop.wts now, but I found another issue while running yolop.
Building engine...
Loading weights: yolop.wts
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
Building engine, please wait for a while...
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: kernel weights has count 2304 but 6912 was expected
[09/02/2021-14:42:33] [E] [TRT] (Unnamed Layer* 214) [Convolution]: count of 2304 weights in kernel, but kernel dimensions (1,1) with 128 input channels, 54 output channels and 1 groups were specified. Expected Weights count is 128 * 1*1 * 54 / 1 = 6912
[09/02/2021-14:42:33] [E] [TRT] Could not compute dimensions for (Unnamed Layer* 214) [Convolution]_output, because the network is not valid.
[09/02/2021-14:42:33] [E] [TRT] Network validation failed.
Build engine successfully!
Segmentation fault (core dumped)
Any suggestions?
Regards.
I ran into the same issue. Looking forward to a solution.
I changed the implementation of build_engine, and now it can be used to generate the engine file. The detection results are correct.
ICudaEngine* build_engine(unsigned int maxBatchSize, IBuilder* builder, IBuilderConfig* config, DataType dt, std::string& wts_name) {
INetworkDefinition* network = builder->createNetworkV2(0U);
// Create input tensor of shape {3, INPUT_H, INPUT_W} with name INPUT_BLOB_NAME
ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims3{ 3, INPUT_H, INPUT_W });
assert(data);
// auto shuffle = network->addShuffle(*data);
// shuffle->setReshapeDimensions(Dims3{ 3, INPUT_H, INPUT_W });
// shuffle->setFirstTranspose(Permutation{ 2, 0, 1 });
std::map<std::string, Weights> weightMap = loadWeights(wts_name);
Weights emptywts{ DataType::kFLOAT, nullptr, 0 };
// yolov5 backbone
// auto focus0 = focus(network, weightMap, *shuffle->getOutput(0), 3, 32, 3, "model.0");
auto focus0 = focus(network, weightMap, *data, 3, 32, 3, "model.0");
auto conv1 = convBlock(network, weightMap, *focus0->getOutput(0), 64, 3, 2, 1, "model.1");
auto bottleneck_CSP2 = bottleneckCSP(network, weightMap, *conv1->getOutput(0), 64, 64, 1, true, 1, 0.5, "model.2");
auto conv3 = convBlock(network, weightMap, *bottleneck_CSP2->getOutput(0), 128, 3, 2, 1, "model.3");
auto bottleneck_csp4 = bottleneckCSP(network, weightMap, *conv3->getOutput(0), 128, 128, 3, true, 1, 0.5, "model.4");
auto conv5 = convBlock(network, weightMap, *bottleneck_csp4->getOutput(0), 256, 3, 2, 1, "model.5");
auto bottleneck_csp6 = bottleneckCSP(network, weightMap, *conv5->getOutput(0), 256, 256, 3, true, 1, 0.5, "model.6");
auto conv7 = convBlock(network, weightMap, *bottleneck_csp6->getOutput(0), 512, 3, 2, 1, "model.7");
auto spp8 = SPP(network, weightMap, *conv7->getOutput(0), 512, 512, 5, 9, 13, "model.8");
// yolov5 head
auto bottleneck_csp9 = bottleneckCSP(network, weightMap, *spp8->getOutput(0), 512, 512, 1, false, 1, 0.5, "model.9");
auto conv10 = convBlock(network, weightMap, *bottleneck_csp9->getOutput(0), 256, 1, 1, 1, "model.10");
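// Nearest-neighbour 2x upsampling is emulated with a depthwise (groups == channels)
// 2x2, stride-2 deconvolution whose weights are all 1.0. One buffer sized for the
// widest layer (256 channels) is shared by every upsample layer below.
float* deval = reinterpret_cast<float*>(malloc(sizeof(float) * 256 * 2 * 2));
for (int i = 0; i < 256 * 2 * 2; i++) {
deval[i] = 1.0f;
}
Weights deconvwts11{ DataType::kFLOAT, deval, 256 * 2 * 2 };
IDeconvolutionLayer* deconv11 = network->addDeconvolutionNd(*conv10->getOutput(0), 256, DimsHW{ 2, 2 }, deconvwts11, emptywts);
deconv11->setStrideNd(DimsHW{ 2, 2 });
deconv11->setNbGroups(256);
// Register the shared buffer once so the cleanup loop at the end frees it exactly once.
weightMap["deconv11"] = deconvwts11;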
ITensor* inputTensors12[] = { deconv11->getOutput(0), bottleneck_csp6->getOutput(0) };
auto cat12 = network->addConcatenation(inputTensors12, 2);
auto bottleneck_csp13 = bottleneckCSP(network, weightMap, *cat12->getOutput(0), 512, 256, 1, false, 1, 0.5, "model.13");
auto conv14 = convBlock(network, weightMap, *bottleneck_csp13->getOutput(0), 128, 1, 1, 1, "model.14");
Weights deconvwts15{ DataType::kFLOAT, deval, 128 * 2 * 2 };
IDeconvolutionLayer* deconv15 = network->addDeconvolutionNd(*conv14->getOutput(0), 128, DimsHW{ 2, 2 }, deconvwts15, emptywts);
deconv15->setStrideNd(DimsHW{ 2, 2 });
deconv15->setNbGroups(128);
ITensor* inputTensors16[] = { deconv15->getOutput(0), bottleneck_csp4->getOutput(0) };
auto cat16 = network->addConcatenation(inputTensors16, 2);
auto bottleneck_csp17 = bottleneckCSP(network, weightMap, *cat16->getOutput(0), 256, 128, 1, false, 1, 0.5, "model.17");
IConvolutionLayer* det0 = network->addConvolutionNd(*bottleneck_csp17->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.0.weight"], weightMap["model.24.m.0.bias"]);
auto conv18 = convBlock(network, weightMap, *bottleneck_csp17->getOutput(0), 128, 3, 2, 1, "model.18");
ITensor* inputTensors19[] = { conv18->getOutput(0), conv14->getOutput(0) };
auto cat19 = network->addConcatenation(inputTensors19, 2);
auto bottleneck_csp20 = bottleneckCSP(network, weightMap, *cat19->getOutput(0), 256, 256, 1, false, 1, 0.5, "model.20");
IConvolutionLayer* det1 = network->addConvolutionNd(*bottleneck_csp20->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.1.weight"], weightMap["model.24.m.1.bias"]);
auto conv21 = convBlock(network, weightMap, *bottleneck_csp20->getOutput(0), 256, 3, 2, 1, "model.21");
ITensor* inputTensors22[] = { conv21->getOutput(0), conv10->getOutput(0) };
auto cat22 = network->addConcatenation(inputTensors22, 2);
auto bottleneck_csp23 = bottleneckCSP(network, weightMap, *cat22->getOutput(0), 512, 512, 1, false, 1, 0.5, "model.23");
IConvolutionLayer* det2 = network->addConvolutionNd(*bottleneck_csp23->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.2.weight"], weightMap["model.24.m.2.bias"]);
auto detect24 = addYoLoLayer(network, weightMap, det0, det1, det2);
detect24->getOutput(0)->setName(OUTPUT_DET_NAME);
auto conv25 = convBlock(network, weightMap, *cat16->getOutput(0), 128, 3, 1, 1, "model.25");
// upsample 26
Weights deconvwts26{ DataType::kFLOAT, deval, 128 * 2 * 2 };
IDeconvolutionLayer* deconv26 = network->addDeconvolutionNd(*conv25->getOutput(0), 128, DimsHW{ 2, 2 }, deconvwts26, emptywts);
deconv26->setStrideNd(DimsHW{ 2, 2 });
deconv26->setNbGroups(128);
auto bottleneck_csp27 = bottleneckCSP(network, weightMap, *deconv26->getOutput(0), 128, 64, 1, false, 1, 0.5, "model.27");
auto conv28 = convBlock(network, weightMap, *bottleneck_csp27->getOutput(0), 32, 3, 1, 1, "model.28");
// upsample 29
Weights deconvwts29{ DataType::kFLOAT, deval, 32 * 2 * 2 };
IDeconvolutionLayer* deconv29 = network->addDeconvolutionNd(*conv28->getOutput(0), 32, DimsHW{ 2, 2 }, deconvwts29, emptywts);
deconv29->setStrideNd(DimsHW{ 2, 2 });
deconv29->setNbGroups(32);
auto conv30 = convBlock(network, weightMap, *deconv29->getOutput(0), 16, 3, 1, 1, "model.30");
auto bottleneck_csp31 = bottleneckCSP(network, weightMap, *conv30->getOutput(0), 16, 8, 1, false, 1, 0.5, "model.31");
// upsample32
Weights deconvwts32{ DataType::kFLOAT, deval, 8 * 2 * 2 };
IDeconvolutionLayer* deconv32 = network->addDeconvolutionNd(*bottleneck_csp31->getOutput(0), 8, DimsHW{ 2, 2 }, deconvwts32, emptywts);
deconv32->setStrideNd(DimsHW{ 2, 2 });
deconv32->setNbGroups(8);
auto conv33 = convBlock(network, weightMap, *deconv32->getOutput(0), 2, 3, 1, 1, "model.33");
// segmentation output
ISliceLayer *slicelayer = network->addSlice(*conv33->getOutput(0), Dims3{ 0, (Yolo::INPUT_H - Yolo::IMG_H) / 2, 0 }, Dims3{ 2, Yolo::IMG_H, Yolo::IMG_W }, Dims3{ 1, 1, 1 });
auto segout = network->addTopK(*slicelayer->getOutput(0), TopKOperation::kMAX, 1, 1);
segout->getOutput(1)->setName(OUTPUT_SEG_NAME);
auto conv34 = convBlock(network, weightMap, *cat16->getOutput(0), 128, 3, 1, 1, "model.34");
// upsample35
Weights deconvwts35{ DataType::kFLOAT, deval, 128 * 2 * 2 };
IDeconvolutionLayer* deconv35 = network->addDeconvolutionNd(*conv34->getOutput(0), 128, DimsHW{ 2, 2 }, deconvwts35, emptywts);
deconv35->setStrideNd(DimsHW{ 2, 2 });
deconv35->setNbGroups(128);
auto bottleneck_csp36 = bottleneckCSP(network, weightMap, *deconv35->getOutput(0), 128, 64, 1, false, 1, 0.5, "model.36");
auto conv37 = convBlock(network, weightMap, *bottleneck_csp36->getOutput(0), 32, 3, 1, 1, "model.37");
// upsample38
Weights deconvwts38{ DataType::kFLOAT, deval, 32 * 2 * 2 };
IDeconvolutionLayer* deconv38 = network->addDeconvolutionNd(*conv37->getOutput(0), 32, DimsHW{ 2, 2 }, deconvwts38, emptywts);
deconv38->setStrideNd(DimsHW{ 2, 2 });
deconv38->setNbGroups(32);
auto conv39 = convBlock(network, weightMap, *deconv38->getOutput(0), 16, 3, 1, 1, "model.39");
auto bottleneck_csp40 = bottleneckCSP(network, weightMap, *conv39->getOutput(0), 16, 8, 1, false, 1, 0.5, "model.40");
// upsample41
Weights deconvwts41{ DataType::kFLOAT, deval, 8 * 2 * 2 };
IDeconvolutionLayer* deconv41 = network->addDeconvolutionNd(*bottleneck_csp40->getOutput(0), 8, DimsHW{ 2, 2 }, deconvwts41, emptywts);
deconv41->setStrideNd(DimsHW{ 2, 2 });
deconv41->setNbGroups(8);
auto conv42 = convBlock(network, weightMap, *deconv41->getOutput(0), 2, 3, 1, 1, "model.42");
// lane-det output
ISliceLayer *laneSlice = network->addSlice(*conv42->getOutput(0), Dims3{ 0, (Yolo::INPUT_H - Yolo::IMG_H) / 2, 0 }, Dims3{ 2, Yolo::IMG_H, Yolo::IMG_W }, Dims3{ 1, 1, 1 });
auto laneout = network->addTopK(*laneSlice->getOutput(0), TopKOperation::kMAX, 1, 1);
laneout->getOutput(1)->setName(OUTPUT_LANE_NAME);
// // std::cout << std::to_string(slicelayer->getOutput(0)->getDimensions().d[0]) << std::endl;
// // ISliceLayer *tmp1 = network->addSlice(*slicelayer->getOutput(0), Dims3{ 0, 0, 0 }, Dims3{ 1, (Yolo::INPUT_H - 2 * Yolo::PAD_H), Yolo::INPUT_W }, Dims3{ 1, 1, 1 });
// // ISliceLayer *tmp2 = network->addSlice(*slicelayer->getOutput(0), Dims3{ 1, 0, 0 }, Dims3{ 1, (Yolo::INPUT_H - 2 * Yolo::PAD_H), Yolo::INPUT_W }, Dims3{ 1, 1, 1 });
// // auto segout = network->addElementWise(*tmp1->getOutput(0), *tmp2->getOutput(0), ElementWiseOperation::kLESS);
// std::cout << std::to_string(conv44->getOutput(0)->getDimensions().d[0]) << std::endl;
// std::cout << std::to_string(conv44->getOutput(0)->getDimensions().d[1]) << std::endl;
// std::cout << std::to_string(conv44->getOutput(0)->getDimensions().d[2]) << std::endl;
// assert(false);
// // segout->setOutputType(1, DataType::kFLOAT);
// segout->getOutput(1)->setName(OUTPUT_SEG_NAME);
// // std::cout << std::to_string(segout->getOutput(1)->getDimensions().d[0]) << std::endl;
// detection output
network->markOutput(*detect24->getOutput(0));
// segmentation output
network->markOutput(*segout->getOutput(1));
// lane output
network->markOutput(*laneout->getOutput(1));
// assert(false);
// Build engine
builder->setMaxBatchSize(maxBatchSize);
config->setMaxWorkspaceSize(2L * (1L << 30)); // 2GB
#if defined(USE_FP16)
config->setFlag(BuilderFlag::kFP16);
// #elif defined(USE_INT8)
// std::cout << "Your platform support int8: " << (builder->platformHasFastInt8() ? "true" : "false") << std::endl;
// assert(builder->platformHasFastInt8());
// config->setFlag(BuilderFlag::kINT8);
// Int8EntropyCalibrator2* calibrator = new Int8EntropyCalibrator2(1, INPUT_W, INPUT_H, "./coco_calib/", "int8calib.table", INPUT_BLOB_NAME);
// config->setInt8Calibrator(calibrator);
#endif
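std::cout << "Building engine, please wait for a while..." << std::endl;
ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
// buildEngineWithConfig returns nullptr when network validation fails, so only report success for a real engine.
if (engine == nullptr) {
std::cerr << "Failed to build engine!" << std::endl;
} else {
std::cout << "Build engine successfully!" << std::endl;
}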
// Don't need the network any more
network->destroy();
// Release host memory
for (auto& mem : weightMap)
{
free((void*)(mem.second.values));
}
return engine;
}
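The upsample layers in build_engine above are written as depthwise 2x2, stride-2 deconvolutions with every weight set to 1.0. The CPU-only sketch below does not use TensorRT, and its function names are made up for illustration. It shows why that construction gives the same result as nearest-neighbour 2x upsampling: with stride equal to the kernel size, each input pixel is copied into its own non-overlapping 2x2 output block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Depthwise transposed convolution on a CHW tensor: 2x2 kernel, stride 2, no padding.
// `kernel` holds 4 weights per channel, indexed as ch * 4 + ky * 2 + kx.
std::vector<float> deconv2x2Stride2Depthwise(const std::vector<float>& in, int c, int h, int w,
                                             const std::vector<float>& kernel) {
    assert(in.size() == static_cast<std::size_t>(c) * h * w);
    assert(kernel.size() == static_cast<std::size_t>(c) * 4);
    const int oh = 2 * h, ow = 2 * w;
    std::vector<float> out(static_cast<std::size_t>(c) * oh * ow, 0.0f);
    for (int ch = 0; ch < c; ++ch)
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x) {
                const float v = in[(ch * h + y) * w + x];
                // Scatter the input pixel into its 2x2 output block.
                for (int ky = 0; ky < 2; ++ky)
                    for (int kx = 0; kx < 2; ++kx)
                        out[(ch * oh + 2 * y + ky) * ow + 2 * x + kx] += v * kernel[ch * 4 + ky * 2 + kx];
            }
    return out;
}

// Reference nearest-neighbour 2x upsampling on a CHW tensor.
std::vector<float> upsampleNearest2x(const std::vector<float>& in, int c, int h, int w) {
    assert(in.size() == static_cast<std::size_t>(c) * h * w);
    const int oh = 2 * h, ow = 2 * w;
    std::vector<float> out(static_cast<std::size_t>(c) * oh * ow);
    for (int ch = 0; ch < c; ++ch)
        for (int y = 0; y < oh; ++y)
            for (int x = 0; x < ow; ++x)
                out[(ch * oh + y) * ow + x] = in[(ch * h + y / 2) * w + x / 2];
    return out;
}
```

Calling both functions on the same input with an all-ones kernel should give identical vectors, which is a quick way to convince yourself the deconvolution trick is a faithful replacement for an upsample layer.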
Change CLASS_NUM in yololayer.h: static constexpr int CLASS_NUM = 1;
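The numbers in the TensorRT error can be traced back to CLASS_NUM. Each detection convolution in build_engine has 3 * (Yolo::CLASS_NUM + 5) output channels, so the 54 output channels in the log correspond to CLASS_NUM = 13, while the 2304 weights actually stored for a 128-channel input correspond to a single class. The sketch below reproduces that arithmetic; the helper names are hypothetical and not part of the repository.

```cpp
#include <cassert>
#include <cstddef>

// Anchors per detection scale, matching the factor 3 used in build_engine.
constexpr int kAnchorsPerScale = 3;

// Output channels of one detection conv: anchors * (classes + x, y, w, h, objectness).
constexpr int detOutputChannels(int classNum) {
    return kAnchorsPerScale * (classNum + 5);
}

// Number of weights expected for a k x k convolution, as in the TensorRT message
// "Expected Weights count is 128 * 1*1 * 54 / 1 = 6912".
constexpr std::size_t convWeightCount(int inCh, int outCh, int k, int groups = 1) {
    return static_cast<std::size_t>(inCh) * k * k * outCh / groups;
}

// Recover the class count a 1x1 detection conv was exported with from its weight count.
int inferClassNum(std::size_t weightCount, int inCh) {
    assert(inCh > 0 && weightCount % static_cast<std::size_t>(inCh) == 0);
    const std::size_t outCh = weightCount / inCh;
    assert(outCh % kAnchorsPerScale == 0);
    return static_cast<int>(outCh / kAnchorsPerScale) - 5;
}

// CLASS_NUM = 13 gives the count TensorRT expected; CLASS_NUM = 1 gives the count in the .wts file.
static_assert(convWeightCount(128, detOutputChannels(13), 1) == 6912, "expected count in the log");
static_assert(convWeightCount(128, detOutputChannels(1), 1) == 2304, "actual count in the log");
```

If a different checkpoint is used, inferClassNum applied to the reported weight count and input channel count indicates which CLASS_NUM the header should carry.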
Hello! I also ran into the issue you mentioned. Have you solved it?
In order to get this working I had to perform the following steps:
1. Compile OpenCV from source with CUDA support and include the OpenCV contrib modules.
2. Change "coda" to "cuda" when you encounter the error related to that.
3. Take the int8 calibrator.h file from the tensorrtx repository, since it is missing from this repo, and put it in the deploy folder.
Then I was able to build ./yolo
It is running now, but it has been stuck on building the engine, and no error has been output yet. I am about to exit the process and then change CLASS_NUM as @healthy8701 suggests.
After implementing @healthy8701's suggestion I encountered the error:
[E] [TRT] Parameter check failed at: ../builder/Network.cpp::addScale::494, condition: shift.count > 0 ? (shift.values != nullptr) : (shift.values == nullptr)
I then replaced the build_engine function code with the code @ChenKQ so kindly provided to us. Now it is running again, but I am not sure whether it will successfully build the engine. I will post updates.
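One possible cause of the addScale parameter check, offered here as an assumption rather than something confirmed in this thread, is a weight name that is absent from the .wts file. Looking it up with weightMap[name] does not fail: std::map::operator[] silently inserts a default entry, and the resulting empty weights are then passed on to TensorRT. A checked lookup makes such a mismatch visible by name. The sketch below uses a stand-in Weights struct so that it compiles without TensorRT, and requireWeights is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for nvinfer1::Weights (assumption: only the fields needed for this sketch).
struct Weights {
    const void* values;
    int64_t count;
};

// Unlike weightMap[name], this never inserts an entry. A missing or empty tensor
// is reported with its name instead of reaching the network builder.
const Weights& requireWeights(const std::map<std::string, Weights>& weightMap,
                              const std::string& name) {
    const auto it = weightMap.find(name);
    if (it == weightMap.end()) {
        throw std::runtime_error("missing weights: " + name);
    }
    if (it->second.count <= 0 || it->second.values == nullptr) {
        throw std::runtime_error("empty weights: " + name);
    }
    return it->second;
}
```

Replacing direct weightMap[...] lookups with a helper of this kind would turn a silent name mismatch into an exception that names the offending tensor.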
@ChenKQ can you please tell me how long it took to build the engine file? Thank you so much.
@ChenKQ @SikandAlex Hello, which platform did you use to convert the .wts file to a TensorRT engine: a Jetson device or a PC? I want to deploy the project on a TX2. Can I do everything on the TX2? Thanks!
I am simply trying to first convert the model to a TensorRT engine on my RTX 2080 Ti before I try to build an engine for the AGX Xavier.
Unfortunately I left ./yolop running after successfully compiling it and it did not finish execution after 12 hours:
Building engine...
Loading weights: yolop.wts
In addition, Ctrl-C and keyboard input did not interrupt execution; I had to forcibly kill the process.
The process was using 291 MB of GPU memory and 2% utilization.
Normally, following the directions in the tensorrtx repository https://github.com/wang-xinyu/tensorrtx allows me to build an engine for a normal yolov5 model in only a few minutes.
There is something wrong here with one of the files. I am not sure what to do. I followed @ChenKQ's directions and everything looks good.
This line from @ChenKQ's build_engine is never printed to my screen.
std::cout << "Building engine, please wait for a while..." << std::endl;
I wonder if it cannot load the weights properly. This is very hard to debug because there is no error output.
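One way to rule out a malformed weights file before starting the engine build is to validate it separately. The sketch below assumes the .wts file follows the tensorrtx text convention: a first line with the tensor count, then one line per tensor containing a name, an element count, and that many hex-encoded values. checkWts is a hypothetical helper, not part of this repository.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Returns an empty string if the stream looks like a well-formed .wts file,
// otherwise a short description of the first problem found.
std::string checkWts(std::istream& in) {
    std::string line;
    if (!std::getline(in, line)) return "empty file";

    std::istringstream header(line);
    long count = 0;
    if (!(header >> count) || count <= 0) return "bad tensor count on line 1";

    for (long i = 0; i < count; ++i) {
        if (!std::getline(in, line)) {
            return "file ends after " + std::to_string(i) + " of " +
                   std::to_string(count) + " tensors";
        }
        std::istringstream entry(line);
        std::string name;
        long size = 0;
        if (!(entry >> name >> size) || size <= 0) {
            return "bad name/size for tensor " + std::to_string(i);
        }
        // Count the remaining whitespace-separated tokens (the hex values).
        long seen = 0;
        std::string hex;
        while (entry >> hex) ++seen;
        if (seen != size) {
            return name + ": expected " + std::to_string(size) + " values, found " +
                   std::to_string(seen);
        }
    }
    return "";
}
```

Passing a std::ifstream opened on yolop.wts and printing the returned message gives a quick pass/fail signal that does not depend on the GPU or on TensorRT.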
@Riser6 @EXPmaster can you please assist so we can reproduce your FPS on an embedded/mobile GPU? I think this is the most important issue for the repo.
Hello, have you solved this problem? Can the engine be built without error?
No, I was not able to. I have opted for a different architecture since the authors were not helpful.