Closed pavelgrigoriev closed 2 weeks ago
Can you confirm that it works as expected if you export the ONNX model with a fixed batch_size of 2?
Model inference now runs, but an error is raised by the following line in engine.cpp:
if (input.size() != 1 || input[0].size() != 1)
The program terminates with the following error message: terminate called after throwing an instance of 'std::logic_error' what(): The feature vector has incorrect dimensions!
This occurs when applying the transformOutput function.
`std::vector
// Simulate running inference on x lines
for (int i = 0; i < 1; ++i) {
// Start measuring time
auto start = std::chrono::high_resolution_clock::now();
auto succ = m_trtEngine->runInference(input, featureVectors);
// Stop measuring time
auto end = std::chrono::high_resolution_clock::now();
// Calculate the duration for this inference
std::chrono::duration<double> duration = end - start;
double inferenceTimeMs = duration.count() * 1000.0;
totalInferenceTimeMs += inferenceTimeMs;
if (!succ) {
throw std::runtime_error("Error: Unable to run inference on line " + std::to_string(i + 1));
}
}
qDebug() << "Total inference time: " << totalInferenceTimeMs << " ms";
auto outputDims = m_trtEngine->getOutputDims();
qDebug() << "Output dimensions: ";
for (const auto& dim : outputDims) {
qDebug() << dim.d[0] << "x" << dim.d[1] << "x" << dim.d[2] << "x" << dim.d[3];
}
if (outputDims.size() != 1 || outputDims[0].d[0] != batchSize || outputDims[0].d[2] != 1 || outputDims[0].d[3] != 1280) {
throw std::runtime_error("Error: Unexpected output dimensions.");
}
std::vector<float> featureVector;
Engine<float>::transformOutput(featureVectors, featureVector);
std::vector<std::vector<int>> all_predicted_classes(batchSize, std::vector<int>(1280));
for (int b = 0; b < batchSize; ++b) {
for (int i = 0; i < 1280; ++i) {
float max_val = featureVectors[b][0][i]; // Start with the first class
int max_idx = 0;
for (int j = 1; j < 6; ++j) { // Iterate over the 6 classes
if (featureVectors[b][0][j * 1280 + i] > max_val) {
max_val = featureVectors[b][0][j * 1280 + i];
max_idx = j;
}
}
all_predicted_classes[b][i] = max_idx;
}
}
std::vector<Object> detectedObjects;
return detectedObjects;
}`
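For reference, the check quoted above can be sketched in isolation. This is a hypothetical stand-in, not the library's code: the name flattenSingleOutput is invented here, and it assumes featureVectors is a nested [batch][output][values] container, as the indexing in the snippet suggests. Under that assumption, any batch size greater than 1 trips the check.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the check quoted from engine.cpp (not the library's code).
// Collapses a [batch][output][values] container to a flat vector, which is only
// possible when there is exactly one batch entry holding exactly one output tensor.
void flattenSingleOutput(const std::vector<std::vector<std::vector<float>>>& input,
                         std::vector<float>& output) {
    if (input.size() != 1 || input[0].size() != 1) {
        throw std::logic_error("The feature vector has incorrect dimensions!");
    }
    output = input[0][0];
}
```

With a 2x1x7680 input the outer size is 2, so the function throws. That would be consistent with the error appearing only for batch sizes above 1.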
It seems that the issue might be related to how the input tensor shapes are set during inference. By modifying the lines around L72 in the following way:
m_context->setInputShape(m_IOTensorNames[i].c_str(),
inputDims);
to
int inputIndex = m_engine->getBindingIndex(m_IOTensorNames[i].c_str());
if (!m_engine->bindingIsInput(inputIndex)) {
spdlog::error("Binding {} is not an input!", inputIndex);
return false;
}
// Set the binding dimensions for the input
if (!m_context->setBindingDimensions(inputIndex, inputDims)) {
spdlog::error("Failed to set binding dimensions for input {}", inputIndex);
return false;
}
I get featureVectors of shape 2x1x7680, which can be reshaped to 2x6x1280. I believe this is the output shape you’re expecting.
I'm not entirely sure yet why this change works differently from the previous implementation, but I’ll look into it further. Could you please try this on your end and see if it resolves the issue?
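As a rough illustration of the reshape mentioned above, a flat buffer of 7680 values can be read as 6 x 1280 without copying. The sketch below assumes a row-major, class-major layout (all 1280 values of class 0 first, then class 1, and so on), which matches the `j * 1280 + i` indexing in the earlier snippet. The helper name argmaxPerPosition is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Interprets `flat` as a row-major [numClasses][width] matrix (assumed layout) and
// returns, for each of the `width` positions, the index of the class with the
// largest value. Ties keep the lowest class index.
std::vector<int> argmaxPerPosition(const std::vector<float>& flat,
                                   std::size_t numClasses, std::size_t width) {
    if (numClasses == 0 || flat.size() != numClasses * width) {
        throw std::invalid_argument("flat buffer does not match numClasses * width");
    }
    std::vector<int> best(width, 0);
    for (std::size_t i = 0; i < width; ++i) {
        float maxVal = flat[i];  // value of class 0 at position i
        for (std::size_t c = 1; c < numClasses; ++c) {
            const float v = flat[c * width + i];
            if (v > maxVal) {
                maxVal = v;
                best[i] = static_cast<int>(c);
            }
        }
    }
    return best;
}
```

Calling `argmaxPerPosition(featureVectors[b][0], 6, 1280)` once per batch entry b would then give one class index per position, assuming the layout above holds for this model.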
I tried it like this:
nvinfer1::Dims4 inputDims = {batchSize, dims.d[0], dims.d[1], dims.d[2]};
m_context->setInputShape(m_IOTensorNames[i].c_str(),
inputDims); // Define the batch size
int inputIndex = m_engine->getBindingIndex(m_IOTensorNames[i].c_str());
if (!m_engine->bindingIsInput(inputIndex)) {
std::cout << "Binding " << inputIndex << " is not an input!" << std::endl;
return false;
}
// Set the binding dimensions for the input
if (!m_context->setBindingDimensions(inputIndex, inputDims)) {
std::cout << "Failed to set binding dimensions for input " << inputIndex << std::endl;
return false;
}
or
nvinfer1::Dims4 inputDims = {batchSize, dims.d[0], dims.d[1], dims.d[2]};
int inputIndex = m_engine->getBindingIndex(m_IOTensorNames[i].c_str());
if (!m_engine->bindingIsInput(inputIndex)) {
std::cout << "Binding " << inputIndex << " is not an input!" << std::endl;
return false;
}
// Set the binding dimensions for the input
if (!m_context->setBindingDimensions(inputIndex, inputDims)) {
std::cout << "Failed to set binding dimensions for input " << inputIndex << std::endl;
return false;
}
m_context->setInputShape(m_IOTensorNames[i].c_str(),
inputDims); // Define the batch size
but neither variant had any effect for me.
The code on the i-80 branch includes these changes, resulting in the following output:
./build/run_inference_benchmark --onnx_model ./models/model.onnx
[2024-08-26 11:39:34.462] [warning] LOG_LEVEL environment variable not set. Using default log level (info).
[2024-08-26 11:39:34.478] [info] Engine name: model.engine.Orin.fp16.2.2
[2024-08-26 11:39:34.478] [info] Searching for engine file with name: ./model.engine.Orin.fp16.2.2
[2024-08-26 11:39:34.478] [info] Engine found, not regenerating...
[2024-08-26 11:39:34.478] [info] Loading TensorRT engine file at path: ./model.engine.Orin.fp16.2.2
[2024-08-26 11:39:34.556] [info] Loaded engine size: 11 MiB
[2024-08-26 11:39:34.583] [info] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +10, now: CPU 0, GPU 10 (MiB)
[2024-08-26 11:39:34.584] [info] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +3, now: CPU 0, GPU 13 (MiB)
[2024-08-26 11:39:34.594] [info] Attempting to reshape the matrix to have 40 channels, 1 height, and 1280 width
[2024-08-26 11:39:34.594] [info] Reshaped desired matrix to have 40 channels, 1 height, and 1280 width
[2024-08-26 11:39:34.594] [info] Warming up the network...
[2024-08-26 11:39:34.870] [info] Feature vectors shape: 2x1x7680
[2024-08-26 11:39:34.870] [info] Running benchmarks (1000 iterations)...
[2024-08-26 11:39:36.693] [info] Benchmarking complete!
[2024-08-26 11:39:36.693] [info] ======================
[2024-08-26 11:39:36.693] [info] Avg time per sample:
[2024-08-26 11:39:36.693] [info] Avg time per sample: 0.9115 ms
[2024-08-26 11:39:36.693] [info] Batch size: 2
[2024-08-26 11:39:36.693] [info] ======================
[2024-08-26 11:39:36.693] [info] Batch 0, output 0
[2024-08-26 11:39:36.693] [info] 4.476562 4.285156 4.230469 4.371094 4.699219 4.828125 4.921875 4.167969 4.406250 4.726562 ...
[2024-08-26 11:39:36.693] [info] Batch 1, output 0
[2024-08-26 11:39:36.693] [info] 4.476562 4.285156 4.230469 4.371094 4.699219 4.828125 4.921875 4.167969 4.406250 4.726562 ...
Does this give the results you're expecting?
No, that is not the result I expected.
Output dimensions: # 40 x 6 x 1 x 1280 terminate called after throwing an instance of 'std::logic_error' what(): The feature vector has incorrect dimensions!
Didn't you mention that the output should be (batch_size, 6, 1, 1280)? On branch i-80 the batch_size is set to 2.
I just tried different batch sizes (2, 20, 40) and converted the ONNX model with batch sizes of 2, 20, and 40.
In the example above I ran it with your original model with dynamic batch size.
Total inference time: 26.1466 ms Output dimensions: -1 x 6 x 1 x 1280 terminate called after throwing an instance of 'std::logic_error' what(): The feature vector has incorrect dimensions!
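A note on the -1 in the output above: TensorRT reports a dynamic axis as -1 at the engine level, so comparing such a dimension directly against the runtime batch size will never match. The sketch below treats -1 as a wildcard. It uses a plain std::array in place of nvinfer1::Dims so that it stays self-contained. The name dimsMatch is hypothetical, and this is not presented as the cause of the logic_error above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: compares reported dimensions against expected ones,
// treating -1 (a dynamic axis) as matching any expected extent.
bool dimsMatch(const std::array<std::int64_t, 4>& reported,
               const std::array<std::int64_t, 4>& expected) {
    for (std::size_t i = 0; i < reported.size(); ++i) {
        if (reported[i] != -1 && reported[i] != expected[i]) {
            return false;
        }
    }
    return true;
}
```

For example, a reported shape of -1 x 6 x 1 x 1280 would match an expected 2 x 6 x 1 x 1280, while 40 x 6 x 1 x 1280 would not.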
My code structure is here: https://github.com/pavelgrigoriev/tensorrt-cpp-api/tree/main
I’m not sure why it’s not working with your structure. I’ll keep the i-80 branch active for now in case you want to use it for debugging. It’s set up with dynamic batch size for the model you provided.
Okay, thank you very much anyway!
Description:
I am working on a custom model using a class I named HypNet, which is based on the original YoloV8 implementation and utilizes TensorRT for inference. The model's input dimensions are reshaped to [40, 1, 1280]. When I attempt to run the model with a batch size of 2 (or any batch size greater than 1), I encounter the following error:
However, if I set both optBatchSize and batchSize to 1, the error does not occur, and I can successfully obtain results.
Initialization with Batch Size 2 (Error Occurs):
Initialization with Batch Size 1 (Works as Expected):
Additional Information:
Expected Output: When using a batch size of 2, I expect the output to be in the format of [2, 6, 1280].
Current Setup:
Repository Version: 5.0 (for the implementation of TensorRT)
Custom Model Input Dimensions: [40, 1, 1280]
I have removed the cv::cuda::split(batchInput[img], input_channels); line from engine.h because my model has 40 channels. By doing this, I simulate having multiple batches:
model.zip