luxonis / depthai-ml-training

Some Example Neural Models that we've trained along with the training scripts
MIT License

yolov7 custom tiny model: X_LINK_ERROR | side values? | poor detection with OAK-D #28

Open ZucchiniAI opened 1 year ago

ZucchiniAI commented 1 year ago

Hello, I have converted a custom-trained yolov7 tiny model (13 classes, mAP=0.75, 1024x1024) into a blob with http://tools.luxonis.com/. I used the blob following the instructions of the YoloV7_training.ipynb notebook from the [depthai-ml-training] repo, with an OAK-D camera (connected to USB3, with or without an additional power supply). I have 2 issues:

1) After a few seconds I get an error. Why?

Traceback (most recent call last):
  File "main.py", line 51, in <module>
    pv.prepareFrames()
  File "/home/mz/Projects/ObjectDetection/depthai/depthai_sdk/src/depthai_sdk/managers/preview_manager.py", line 148, in prepareFrames
    packet = queue.tryGet()
RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'color' (X_LINK_ERROR)'

2) Before the error I see very few detections, and poor ones, even though the trained model gave very good results on the test set of static images (mAP~0.75). Why? Please see my Obs below: is that the reason?

Obs: The JSON note in https://github.com/luxonis/depthai-experiments/tree/master/gen2-yolo/device-decoding (quoted below) is not clear to me. I have not changed the JSON file of my model, which is 1024x1024, because I do not understand WHERE I have to change the "side" entries: I have no side32 or side16 but many of them, all under the anchor masks. My JSON file from the blob conversion is attached: should I change something? How and where exactly? best.zip

Note: Values must match the values set in the CFG during training. If you use a different input width, you should also change side32 to sideX and side16 to sideY, where X = width/16 and Y = width/32. If you are using a non-tiny model, those values are width/8, width/16, and width/32.

Thank you in advance, Marco
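The arithmetic in the quoted note can be sketched as follows. This is only an illustration of the rule as the note states it (divisors 16 and 32 for a tiny model; 8, 16 and 32 for a non-tiny one); the function name `side_keys` is hypothetical, and the JSON exported by tools.luxonis.com for a given model may list a different set of entries.

```python
def side_keys(width, tiny=True):
    """Return the "sideN" key names for a given input width.

    Assumption (taken from the quoted note): tiny models divide the
    width by 16 and 32, non-tiny models by 8, 16 and 32.
    """
    strides = (16, 32) if tiny else (8, 16, 32)
    for stride in strides:
        if width % stride != 0:
            raise ValueError(f"width {width} is not divisible by {stride}")
    # Each key is "side" followed by the grid size width / stride.
    return [f"side{width // stride}" for stride in strides]


if __name__ == "__main__":
    print(side_keys(512))               # the side32 / side16 pair named in the note
    print(side_keys(1024))              # a 1024-wide tiny model
    print(side_keys(1024, tiny=False))  # a 1024-wide non-tiny model
```

Under the note's rule, a 1024-wide tiny model would use side64 and side32, and a non-tiny one side128, side64 and side32.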

Erol444 commented 1 year ago

Hi @ZucchiniAI, we apologize for the delay. I'll check this issue tomorrow morning. Thanks, Erik

Erol444 commented 1 year ago

Hi @ZucchiniAI, are you using the latest version of depthai? We have tried the same process, and it crashes with an older depthai (2.16), but not with the newest one (2.17.4). Thanks, Erik

ZucchiniAI commented 1 year ago

Hello Erik,

1) The X_LINK_ERROR (Point 1) happened with depthai 2.17.0.0. I have now updated to 2.17.4 and indeed it does not happen at all: thank you!

What about my Point 2? I now see that the objects are detected with good confidence but with many redundant bounding boxes (e.g. a couple for each object, with different sizes, roughly in the region of the ground truth), so it seems to me that the NMS does not work well. Could you please tell me whether the reason is what I suspect (see Obs. in my question above), and how and where exactly I should change the indicated sideX and sideY for my tiny model (1024x1024)?

Thank you in advance! Marco

Erol444 commented 1 year ago

Hi @ZucchiniAI, could you try a different IoU threshold setting?
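To illustrate why the IoU threshold affects duplicate boxes, here is a minimal sketch of greedy, class-agnostic non-maximum suppression. It is not the decoding code that runs on the OAK device; the function names and the example boxes are made up for illustration. Each box is `(x1, y1, x2, y2, score)`.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold):
    """Keep the highest-scoring boxes, dropping any box whose IoU with an
    already kept box exceeds iou_threshold."""
    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    # Two boxes of different size on the same object, plus one far away.
    dets = [
        (100, 100, 200, 200, 0.9),
        (90, 90, 220, 220, 0.8),
        (400, 400, 500, 500, 0.7),
    ]
    print(len(nms(dets, 0.8)))  # high threshold: the overlapping pair survives
    print(len(nms(dets, 0.5)))  # lower threshold: the weaker duplicate is dropped
```

The two overlapping boxes here have an IoU of about 0.59, so a threshold of 0.8 keeps both while 0.5 removes the lower-scoring one. A lower threshold suppresses more aggressively, at the risk of removing genuinely distinct objects that sit close together.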

ZucchiniAI commented 1 year ago

Hello Erik, I tried what you suggested, but unfortunately the results do not fully convince me :-(

a) A couple of times I still got the X_LINK_ERROR, even though the env with depthai 2.17.4 was activated. What is the cause? (I have the USB-C to USB 3 port and the additional power connected.) RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'rgb' (X_LINK_ERROR)'

b) By changing the iou_threshold in best.yaml within the range 0.2 - 0.8 I saw some changes (0.2 being better, but not perfect), but the detections still appear and disappear rapidly and in different positions. Sometimes it is very clear that an object is detected twice, even though the model before conversion detected it only once. Sometimes it is difficult to say whether it is a double bounding box for the same object or a single box flickering between small and large. Is there a way to reduce the inference frequency? (With my tiny yolov7 I get some 11 fps.) My setup is to point the OAK-D camera at the PC screen, which shows the same images that yolov7 detected correctly before the conversion.

c) Sorry to ask again, but your hint in https://github.com/luxonis/depthai-experiments/tree/master/gen2-yolo/device-decoding is still not clear to me. I have not changed the JSON file of my model, which is 1024x1024, because I do not understand WHERE I have to change the "side" entries: I have no side32 or side16 but many of them, all under the anchor masks. My JSON file from the blob conversion is attached above: should I change something? How and where exactly?

Thank you! Marco

Erol444 commented 1 year ago

Hi @ZucchiniAI,

a) That's a very generic error; it basically means "something went wrong". Is it a sporadic error after X hours?

b) Could you share a video? That would help a lot with debugging.

c) Anchor masks are already written in the JSON (that you downloaded from tools.luxonis.com) and will be loaded to the device. I am not sure why you would need to change them?

Thanks, Erik

ZucchiniAI commented 1 year ago

Hi Erik, sorry for the delay:

a) Well, it happens very often and after just a few minutes (I use Ubuntu 20 and your original cables). Is there any way to get a more precise log?

b) I would do this, but I do not know how to save the video from the OAK with the live-detected objects. Should I use the DepthAI SDK managers for it? Do you have an example?

c) Sorry, I thought I had to change them because my resolution is different from the standard one. Reading the note again, I now understand that if I keep the same resolution after training I do not need to change anything.

Thanks! Marco