ceccocats / tkDNN

Deep neural network library and toolkit to do high performance inference on NVIDIA Jetson platforms
GNU General Public License v2.0

Yolo3Detection never allocates input_d (was working a week ago) #162

Closed: nightduck closed this issue 3 years ago

nightduck commented 3 years ago

My program creates a Yolo3Detection network, and crashes after calling detNN->update(frames).

The crash ultimately occurs in the entry_index function in YoloRT.h, where there's a division by zero.

But after debugging the update function, I found that at this line:

netRT->infer(dim, input_d);

input_d is a null pointer. What could cause this? Anywhere else I can look for clues? This was working a week ago.

nightduck commented 3 years ago

Confirmed: this bug is not present at commit 86478f9384eef13d68a9406ee12fbcb4df6ab892

ceccocats commented 3 years ago

Hi, we updated the structure of YoloRT.h to have more parameters, and these parameters are serialised in the .rt file. So if you haven't rebuilt the RT file, it is outdated. Rebuilding the RT file should be sufficient.
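To illustrate why a stale serialised file can break after a struct gains fields, here is a minimal sketch. It is not tkDNN's actual serialisation code: the structs `ParamsV1` and `ParamsV2`, their field names, and the size-prefix check are all assumptions made for illustration. The idea is that a reader expecting the new layout can refuse an old blob instead of silently reading past its end.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical parameter layouts (NOT tkDNN's real fields):
// V1 is the layout an old file was written with, V2 adds new parameters.
struct ParamsV1 { int classes; int num; };
struct ParamsV2 { int classes; int num; int new_int_param; float new_float_param; };

// Write the payload size first, then the raw bytes of the struct.
template <typename T>
std::vector<char> serialize(const T& p) {
    const std::size_t size = sizeof(T);
    std::vector<char> buf(sizeof(size) + size);
    std::memcpy(buf.data(), &size, sizeof(size));
    std::memcpy(buf.data() + sizeof(size), &p, size);
    return buf;
}

// Read the struct back only if the stored size matches the layout the
// reader expects; a mismatch means the blob was written by another version.
template <typename T>
bool deserialize(const std::vector<char>& buf, T& out) {
    std::size_t size = 0;
    if (buf.size() < sizeof(size)) return false;
    std::memcpy(&size, buf.data(), sizeof(size));
    if (size != sizeof(T) || buf.size() != sizeof(size) + size) return false;
    std::memcpy(&out, buf.data() + sizeof(size), size);
    return true;
}
```

With this kind of check, loading a V1 blob into a V2 reader fails cleanly, which is the cue to regenerate the file. Without such a check, the reader would consume bytes that belong to something else and the extra parameters would hold garbage.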

nightduck commented 3 years ago

Rebuilt the RT file and now it's complaining "this is not yolo3" and throwing a fatal error. In its defense, it's right: I'm feeding it yolo4. But I was previously able to load a yolo4 TensorRT engine into a Yolo3Detection object. Is this no longer supported?

Ignore this, I'm dumb

nightduck commented 3 years ago

Rebuilding the RT file doesn't help. Here are the results of my debugging.

The YoloRT is used during inference. But the "configure" method is never called, so by the time "enqueue" is called during inference, the "h" and "w" fields are still zero, resulting in a division by zero error.

I can't manually call it because the YoloRT instance is hidden inside an abstracted runtime context. Is this still something wrong with the RT file?
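The failure mode described above (indexing code running while "h" and "w" are still zero) can be sketched as follows. This is a hypothetical stand-in, not the code in YoloRT.h: the `GridDims` struct, the `checked_entry_index` name, and the `-1` sentinel are assumptions, and the index formula follows the darknet-style layout for a single batch item. It only shows how a guard turns the division by zero into a detectable error.

```cpp
// Hypothetical stand-in for the layer state; not tkDNN's class.
struct GridDims {
    int w = 0;        // grid width, expected to be set by a configure step
    int h = 0;        // grid height, expected to be set by a configure step
    int classes = 0;  // number of object classes
};

// Darknet-style flat index into the output of one batch item.
// Returns -1 when the dimensions were never set, instead of dividing by zero.
int checked_entry_index(const GridDims& d, int location, int entry) {
    const int cells = d.w * d.h;
    if (cells <= 0) {
        return -1;  // configure step never ran, or the dims are invalid
    }
    const int n = location / cells;    // which anchor box
    const int loc = location % cells;  // which grid cell
    return n * cells * (4 + d.classes + 1) + entry * cells + loc;
}
```

A caller that sees `-1` knows the dimensions were never initialised, which points at the loading or configuration path rather than at the indexing arithmetic itself.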

nightduck commented 3 years ago

I just followed @ceccocats's advice over and over again, starting over each time until eventually it worked. As for why it didn't work to begin with, I'm going to assume it's a classic case of PEBCAK.