Closed: daavoo closed this issue 4 years ago.
Cool! I will check it later.
Hi @grimoire. Thanks for making and sharing this code.
I encountered some problems getting this to work on a Jetson Xavier NX. I found a temporary solution, but I think you might have a better one.
It is related to the batchedNMSPlugin. It appears that the shape of the num_detections output is not being set properly.
DeepStream is able to correctly parse the exported TensorRT engine:
INFO nvinfer gstnvinfer.cpp:602:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1577> [UID = 1]: deserialized trt engine from :{model}.engine
INFO: [Implicit Engine Info]: layers num: 5
0 INPUT kFLOAT input_0 3x300x300
1 OUTPUT kINT32 num_detections 0
2 OUTPUT kFLOAT boxes 200x4
3 OUTPUT kFLOAT scores 200
4 OUTPUT kFLOAT classes 200
But it then fails when allocating the buffers for the outputs:
ERROR nvinfer gstnvinfer.cpp:596:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::allocateBuffers() <nvdsinfer_context_impl.cpp:1195> [UID = 1]: Failed to allocate cuda output buffer during context initialization
The problem is in this snippet (from deepstream-5.0/sources/libs/nvdsinfer/nvdsinfer_context_impl.cpp:1174):
for (unsigned int jL = 0; jL < m_AllLayerInfo.size(); jL++)
{
    const NvDsInferBatchDimsLayerInfo& layerInfo = m_AllLayerInfo[jL];
    const NvDsInferDims& bindingDims = layerInfo.inferDims;
    assert(bindingDims.numElements > 0);
    size_t size = m_MaxBatchSize *
        bindingDims.numElements *
        getElementSize(layerInfo.dataType);
    if (layerInfo.isInput)
    {
        /* Reuse input binding buffer pointers. */
        batch.m_DeviceBuffers[jL] = m_BindingBuffers[jL];
    }
    else
    {
        /* Allocate device memory for output layers here. */
        auto outputBuf = std::make_unique<CudaDeviceBuffer>(size);
        if (!outputBuf || !outputBuf->ptr())
        {
            printError(
                "Failed to allocate cuda output buffer during context "
                "initialization");
            return NVDSINFER_CUDA_ERROR;
        }
        batch.m_DeviceBuffers[jL] = outputBuf->ptr();
        batch.m_OutputDeviceBuffers.emplace_back(std::move(outputBuf));
    }
bindingDims.numElements is returning 0 for the num_detections layer (and the assert is not triggering).
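Outside of DeepStream, the binding shapes can also be inspected directly with the TensorRT C++ API to see how num_detections is reported. A minimal sketch (the engine path, the plugin library name, and the build command are assumptions for illustration; compile with something like g++ inspect_bindings.cpp -lnvinfer -ldl):

#include <NvInfer.h>
#include <dlfcn.h>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cerr << msg << std::endl;
    }
};

int main() {
    // Load the plugin library first so the custom ops can be deserialized
    // (library name/path is an assumption, adjust to your build output).
    dlopen("libamirstan_plugin.so", RTLD_LAZY);

    // Read the serialized engine from disk (path is an assumption).
    std::ifstream file("model.engine", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());

    Logger logger;
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(blob.data(), blob.size());
    if (!engine) return 1;

    // Print each binding name and its dimensions to see how num_detections
    // is shaped compared with boxes/scores/classes.
    for (int i = 0; i < engine->getNbBindings(); ++i) {
        const nvinfer1::Dims dims = engine->getBindingDimensions(i);
        std::cout << engine->getBindingName(i) << ":";
        for (int d = 0; d < dims.nbDims; ++d) std::cout << " " << dims.d[d];
        std::cout << "\n";
    }
    return 0;
}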
So the temporary fix I made is:
const NvDsInferBatchDimsLayerInfo& layerInfo = m_AllLayerInfo[jL];
const NvDsInferDims& bindingDims = layerInfo.inferDims;
assert(bindingDims.numElements > 0);
int numElements = bindingDims.numElements;
if (jL == 1) {
    numElements = 1;
}
size_t size = m_MaxBatchSize *
    numElements *
    getElementSize(layerInfo.dataType);
I was planning on reviewing the batchedNMSPlugin and sending a P.R., but you might already know where the problem is.
Hi, thanks for the bug report.
I guess I should add an extra dim to num_detections in batchedNMSPlugin.cpp:
nvinfer1::DimsExprs BatchedNMSPlugin::getOutputDimensions(
    int outputIndex, const nvinfer1::DimsExprs *inputs, int nbInputs, nvinfer1::IExprBuilder &exprBuilder)
{
    ASSERT(nbInputs == 2);
    ASSERT(outputIndex >= 0 && outputIndex < this->getNbOutputs());
    ASSERT(inputs[0].nbDims == 4);
    ASSERT(inputs[1].nbDims == 3);
    nvinfer1::DimsExprs ret;
    switch (outputIndex) {
    case 0:
        ret.nbDims = 1;
        break;
    case 1:
        ret.nbDims = 3;
        break;
    case 2:
    case 3:
        ret.nbDims = 2;
        break;
    default:
        break;
    }
    ret.d[0] = inputs[0].d[0];
    if (outputIndex > 0) {
        ret.d[1] = exprBuilder.constant(param.keepTopK);
    }
    if (outputIndex == 1) {
        ret.d[2] = exprBuilder.constant(4);
    }
    return ret;
}
Fixing this method should give you the right shape (set ret.nbDims to 2 when outputIndex == 0, add a constant to ret.d[1], etc.). I will fix it when I have time, or you can send the P.R. if you want to. Any P.R. or bug report is welcome!
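For reference, this is roughly what that change could look like (an untested sketch following the description above; only the outputIndex == 0 handling differs from the current snippet):

nvinfer1::DimsExprs BatchedNMSPlugin::getOutputDimensions(
    int outputIndex, const nvinfer1::DimsExprs *inputs, int nbInputs, nvinfer1::IExprBuilder &exprBuilder)
{
    ASSERT(nbInputs == 2);
    ASSERT(outputIndex >= 0 && outputIndex < this->getNbOutputs());
    ASSERT(inputs[0].nbDims == 4);
    ASSERT(inputs[1].nbDims == 3);
    nvinfer1::DimsExprs ret;
    switch (outputIndex) {
    case 0:
        // num_detections: report [batch, 1] instead of [batch] so that
        // consumers such as DeepStream see a non-zero per-sample element count.
        ret.nbDims = 2;
        break;
    case 1:
        ret.nbDims = 3;
        break;
    case 2:
    case 3:
        ret.nbDims = 2;
        break;
    default:
        break;
    }
    ret.d[0] = inputs[0].d[0];
    if (outputIndex == 0) {
        ret.d[1] = exprBuilder.constant(1);
    }
    if (outputIndex > 0) {
        ret.d[1] = exprBuilder.constant(param.keepTopK);
    }
    if (outputIndex == 1) {
        ret.d[2] = exprBuilder.constant(4);
    }
    return ret;
}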
Thanks for the contribution! It is really cool. Would you please help me update the README.md of amirstan_plugin and mmdetection-to-tensorrt about DeepStream support? I don't know much about DeepStream.
Sure thing. I will update the READMEs.
This P.R. allows using models exported with mmdet2trt inside DeepStream.
It adds a CMake option, WITH_DEEPSTREAM. Enabling this option includes a custom output parser for DeepStream in the shared object library.
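For example, a build invocation along these lines should produce a library with the parser included (a sketch; adjust the remaining CMake variables to your environment):

cd amirstan_plugin
mkdir -p build && cd build
cmake -DWITH_DEEPSTREAM=ON ..   # plus any other options your setup needs (e.g. the TensorRT location)
make -j$(nproc)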
The parser is later referenced in the DeepStream configuration file, as in the following example:
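A minimal sketch of what that reference could look like in the nvinfer element's config file (the library path and the parser function name below are placeholders for illustration, not the actual names from this P.R.):

[property]
# shared object built from amirstan_plugin with WITH_DEEPSTREAM=ON (path is a placeholder)
custom-lib-path=/path/to/libamirstan_plugin.so
# custom bounding-box parser exposed by the library (function name is a placeholder)
parse-bbox-func-name=NvDsInferParseCustomMmdet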