Hi @bennaa,
Thanks for the thorough description of your problem! I'll look into it. Could you please share the exported .xml, .bin and .blob files with me?
Best, Jan
Hi @HonzaCuhel,
Sorry for the delay.
I cannot directly share the files here because the format is not supported, so you can find them in this drive folder.
No problem at all. Thank you for sharing the files. I'll look into it and will get back to you as soon as I find something.
Best, Jan
Hi @bennaa,
I'm sorry for not getting back to you sooner. I have looked into it and found several things that I want to share with you:
1. A NeuralNetwork node (instead of YoloDetectionNetwork) must be used in the DAI pipeline. YoloDetectionNetwork expects the raw output of each of the model's channels (without the bboxes already decoded), whereas your model outputs a single tensor. This output tensor must be passed into a NonMaxSuppression operation (a variant of this) to get the final predictions; see the sketch below this list.
2. You mentioned 8 classes, but the exported model predicts 80 classes. Was this just a typo?
I want to run the inference and possibly export the model myself. But for that, could you please share some details about the training setup?
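For illustration, here is a minimal sketch of the NeuralNetwork-plus-NMS approach (untested; it assumes a 640x640 GRAY8 input and the usual single-tensor YOLOv8 output of shape (1, 4 + num_classes, num_anchors), so adjust the blob path, names, and shapes to your model):

```python
# Sketch: NeuralNetwork node + host-side decoding with NonMaxSuppression.
import cv2
import numpy as np
import depthai as dai

NUM_CLASSES = 8
SIZE = 640

pipeline = dai.Pipeline()

xin = pipeline.create(dai.node.XLinkIn)
xin.setStreamName("frame")

nn = pipeline.create(dai.node.NeuralNetwork)
nn.setBlobPath("yolov8n_gray.blob")  # placeholder path
xin.out.link(nn.input)

xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("nn")
nn.out.link(xout.input)

def decode(raw, conf_thr=0.5, iou_thr=0.5):
    """Decode a (4 + C, N) prediction tensor and run NMS on the host."""
    preds = raw.reshape(4 + NUM_CLASSES, -1).T  # rows: cx, cy, w, h, class scores...
    boxes, scores, class_ids = [], [], []
    for cx, cy, w, h, *cls_scores in preds:
        cls_id = int(np.argmax(cls_scores))
        score = float(cls_scores[cls_id])
        if score < conf_thr:
            continue
        boxes.append([int(cx - w / 2), int(cy - h / 2), int(w), int(h)])
        scores.append(score)
        class_ids.append(cls_id)
    if not boxes:
        return []
    keep = cv2.dnn.NMSBoxes(boxes, scores, conf_thr, iou_thr)
    return [(boxes[i], scores[i], class_ids[i]) for i in np.array(keep).flatten()]

with dai.Device(pipeline) as device:
    q_in = device.getInputQueue("frame")
    q_out = device.getOutputQueue("nn", maxSize=4, blocking=False)

    gray = cv2.resize(cv2.imread("test.jpg", cv2.IMREAD_GRAYSCALE), (SIZE, SIZE))
    frame = dai.ImgFrame()
    frame.setType(dai.ImgFrame.Type.GRAY8)
    frame.setWidth(SIZE)
    frame.setHeight(SIZE)
    frame.setData(gray.flatten())
    q_in.send(frame)

    raw = np.array(q_out.get().getFirstLayerFp16())
    print(decode(raw))
```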
Furthermore, could you please also share the model's weights (the .pt file) and the exported .onnx model?
Thank you.
Best, Jan
Hi @HonzaCuhel,
I shared a toy sample based on the Nano model and trained on coco8. I also updated the drive folder with the .pt and .onnx files. Inside the folder, there is also the .yaml file with the dataset specification and classes.
Great, thank you very much!
Hi @bennaa,
I exported the model without the bounding box decoding part to use it with the YoloDetectionNetwork in the DepthAI pipeline. However, when I tested the exported .blob model in an application, it detected nothing. I also ran inference with both ONNX models (your version and the one I created), but they didn't detect anything either (as shown in the attached image).
Have you tried running inference with the trained model? Did it detect any objects? If so, which ones?
You can find the exported models and the application with the DepthAI pipeline here.
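For reference, my ONNX sanity check was essentially this (file names are placeholders):

```python
# Quick ONNX sanity check (file names are placeholders).
import cv2
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("yolov8n_gray.onnx")
inp = sess.get_inputs()[0]  # expected shape: (1, 1, H, W) for the grayscale model
h, w = inp.shape[2], inp.shape[3]

img = cv2.imread("test.jpg", cv2.IMREAD_GRAYSCALE)
x = cv2.resize(img, (w, h)).astype(np.float32)[None, None] / 255.0  # NCHW, normalized

outputs = sess.run(None, {inp.name: x})
print([o.shape for o in outputs], outputs[0].max())  # any confident scores at all?
```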
Best, Jan
Hi @HonzaCuhel,
You are right, the model I shared was somehow broken. I updated all the files in the drive folder with a working model.
Using the .pt file I can detect objects, like in this image.
May I ask how you exported the model without the bounding box decoding part so that it works with the YoloDetectionNetwork in the DepthAI pipeline?
Thanks for your support, Andrea
Hi @bennaa,
Perfect! Now the model detects objects (the attached screenshot shows an inference of the newly exported ONNX model).
I updated the exported model files in the drive folder I've shared before. I also added the Jupyter notebook that I used to convert the model. Basically, I took the code from our tools and made the same changes you did (changed the number of input channels and removed the --reverse_input_channels flag).
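The notebook has the details, but the core idea is roughly the following (a sketch, not the exact tools code; it assumes the ultralytics YOLOv8 Detect head with its cv2/cv3 branches, and the weights file name is a placeholder):

```python
# Sketch: export without the bbox decoding step, with a 1-channel input.
import types
import torch
from ultralytics import YOLO

model = YOLO("yolov8n_gray.pt").model.eval()  # placeholder weights file
detect = model.model[-1]  # the Detect head

def raw_forward(self, x):
    # Return the raw per-scale outputs (box regression + class logits) and
    # skip the decoding that YoloDetectionNetwork performs on-device.
    return [torch.cat((self.cv2[i](x[i]), self.cv3[i](x[i])), 1)
            for i in range(self.nl)]

detect.forward = types.MethodType(raw_forward, detect)

dummy = torch.zeros(1, 1, 640, 640)  # 1 input channel for the grayscale model
torch.onnx.export(model, dummy, "yolov8n_gray_raw.onnx",
                  opset_version=12, input_names=["images"])
```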
If you have any additional questions, please do not hesitate to ask me. I'm more than happy to help.
Best, Jan
Hi @HonzaCuhel,
I was testing the blob and everything works smoothly! Thank you very much for your support!
Best, Andrea
Following the forum thread at https://discuss.luxonis.com/d/3477-yolo8-grayscale-conversion-error and jakaskerl's suggestion, I'm posting this here.
I trained a custom grayscale YOLOv8 model using the ultralytics library, and I want to use it on an OAK device. The only difference in the model is that it was modified to accept 1-channel images instead of 3-channel ones.
When I try to convert it from .pt to .blob using Luxonis Tools, it returns "Error while converting to onnx".
I also tried to export the model manually (see the steps below), but when I run the pipeline it returns this error: [DetectionNetwork(3)] [error] Mask is not defined for output layer with width '3549'. Define at pipeline build time using: 'setAnchorMasks' for 'side3549'. This happens despite YOLOv8 being anchor-free, i.e., having no masks.
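For completeness, this is roughly how I configure the node (a sketch; the blob path and thresholds are placeholders):

```python
# YoloDetectionNetwork setup for an anchor-free YOLOv8 model (sketch).
import depthai as dai

pipeline = dai.Pipeline()
nn = pipeline.create(dai.node.YoloDetectionNetwork)
nn.setBlobPath("yolov8_gray.blob")  # placeholder path
nn.setNumClasses(8)
nn.setCoordinateSize(4)
nn.setAnchors([])      # YOLOv8 is anchor-free,
nn.setAnchorMasks({})  # so no anchors or masks should be needed
nn.setIouThreshold(0.5)
nn.setConfidenceThreshold(0.5)
```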
After some more tests, here is what I found:
I started by analyzing the code of the online tool Luxonis Tools, which expects .pt files with 3-channel inputs. The main changes I had to make are at line 58 of export_yolov8.py, where the dummy export input is hardcoded to 3 channels (im = torch.zeros(1, 3, *self.imgsz[::-1])#.to(device) # image size(1,3,320,192) BCHW iDetection), and at line 67 of exporter.py, where the '--reverse_input_channels' flag is passed to the Model Optimizer.
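Concretely, the two edits look roughly like this (a sketch against the tools' source; line numbers as in the version I used):

```python
# export_yolov8.py, line 58 -- the dummy export input is hardcoded to 3 channels.
# Before:
#   im = torch.zeros(1, 3, *self.imgsz[::-1])  # .to(device)  # (1,3,320,192) BCHW
# After (grayscale, 1 channel):
im = torch.zeros(1, 1, *self.imgsz[::-1])  # .to(device)  # (1,1,320,192) BCHW

# exporter.py, line 67 -- drop the '--reverse_input_channels' Model Optimizer
# flag: swapping R<->B channels is meaningless for a single-channel input.
```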
I then checked the input and output sizes of the exported IR file using Netron. Next, I compiled the XML into a .blob file using the OpenVINO compile_tool.exe. By debugging the compile_tool code, I confirmed that the compiled blob has the correct input and output layers and sizes.
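The invocation was roughly compile_tool.exe -m yolov8n_gray.xml -d MYRIAD -ip U8 -o yolov8n_gray.blob (paths are placeholders; MYRIAD is the plugin for the OAK's VPU).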
Finally, I tested adding the blob to the pipeline with different DepthAI nodes.
I then analyzed the source code in C++, but since it's a runtime error, I believe I need to check the code after the RPC call in DeviceBase.cpp. However, I don't think that code is available publicly, so I've run out of ideas.
I also believe it's not possible to send the neural network a fake 3-channel image that is just a view of the same grayscale channel repeated three times (e.g., using np.broadcast_to(grayscale_image[..., None], (*grayscale_image.shape, 3)), which occupies only the memory of the grayscale image and builds a 3-channel view; note that np.stack((grayscale_image,) * 3, axis=-1) would create an actual copy).
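To illustrate the copy-vs-view difference:

```python
import numpy as np

gray = np.zeros((480, 640), dtype=np.uint8)

stacked = np.stack((gray,) * 3, axis=-1)                # real copy: 3x the memory
view = np.broadcast_to(gray[..., None], (480, 640, 3))  # zero-copy, read-only view

print(stacked.strides)  # (1920, 3, 1): a contiguous copy
print(view.strides)     # (640, 1, 0): stride 0 on the last axis, no new data
```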
The overall changes needed in the blob creation seem to be very small (basically removing the hardcoded parts where 3 channels were expected). Do you think the changes in the device-side (sensor) code could be equally minimal and could be made in one of the next releases?
Thank you for your help!