Closed mprabhakarece closed 2 years ago
When I try to run inference by creating a second DNN object, tk::dnn::DetectionNN *detNN1;, and passing it my .rt file without changing any other components, I get a segmentation fault.
Hi @mprabhakarece,
I just gave that a quick try and I don't get any error. This is my (crappy) way of modifying the demo to test it:
tk::dnn::Yolo3Detection yolo2;
tk::dnn::DetectionNN *detNN2;
std::string net2 = "yolo4_fp32.rt";

switch(ntype)
{
    case 'y':
        detNN  = &yolo;
        detNN2 = &yolo2;
        break;
    case 'c':
        detNN = &cnet;
        break;
    case 'm':
        detNN = &mbnet;
        n_classes++;
        break;
    default:
        FatalError("Network type not allowed (3rd parameter)\n");
}

detNN->init(net, n_classes, n_batch, conf_thresh);
detNN2->init(net2, n_classes, n_batch, conf_thresh);
Hi @mive93
I ran two models in different terminals, each with: ./demo yolo4_fp32.rt ../demo/yolo_test.mp4 y
I found that the FPS dropped. Is that expected? Is there anything I can do to keep the FPS from dropping? I assume running two models in one process is the same as running them separately in two processes. Am I right?
Hi @peterlee909, yes, it's completely normal to see an FPS drop: you're doing two inferences instead of one. And yes, it's almost the same as running two separate processes. It depends on your application, of course, but the only way to run more inferences at the same time without doubling the latency is to exploit batching; however, that requires using the same model.
@mive93 Thank you so much! According to information from NVIDIA, it's because only one GPU context can be active at a time. Well, that's a little embarrassing.
Closing for inactivity. Feel free to reopen.
Hi,
I have two weight files and have converted them into .rt files. I can now run inference with each converted .rt file separately, but I want to run both .rt files in a single application. Kindly let me know whether this is possible and how to do it.
Thank you in advance.
Regards, Prabhakar M