Open chowkamlee81 opened 5 years ago
hi @chowkamlee81,
Can you debug the contents of box1 and box2 at this position? https://github.com/ghimiredhikura/Complex-YOLO-V3/blob/master/utils/utils.py#L256
I didn't get this error, so I can't reproduce it. Please make sure your dataset is well structured. Did you get this error in the middle of training or at the beginning?
Best, Deepak
I used the KITTI dataset and had no issues with the KITTI ground truth, which follows your conventional hierarchy. The code crashes at the very beginning of training, after only a few seconds. The box has no centroid, area, or boundary, which may be why it crashes. Kindly help. The relevant code is here: https://github.com/ghimiredhikura/Complex-YOLO-V3/blob/master/utils/utils.py#L21
The code is producing NaN values at the location above.
https://github.com/ghimiredhikura/Complex-YOLO-V3/blob/master/models.py#L259.
Because of these issues, NaN values appear and affect performance. Kindly help.
I just set the batch size to 1, and it starts working. With a batch size of 4, however, it crashes. Is it the same for you? Kindly suggest how to resolve this.
I trained with batch sizes 6 and 8, and it worked smoothly. I will train with batch size 4, and if I get the same error message, I will debug it and let you know.
Thanks, will wait for your updates.
With a batch size of 6 or 8, the training code crashes with the message below:
raise ValueError("Null geometry supports no operations")
ValueError: Null geometry supports no operations
This is purely because of NaN values in the CNN for batch_size > 1.
The code also crashes even with batch_size = 1.
Hence I am unable to train using this repo.
hi @chowkamlee81,
I trained both the yolov3 and tiny_yolov3 networks with batch size 4, and both work smoothly. Below is the log of the tiny-yolov3 training.
---- [Epoch 113/300, Batch 608/1441] ----
+------------+--------------+--------------+
| Metrics | YOLO Layer 0 | YOLO Layer 1 |
+------------+--------------+--------------+
| grid_size | 19 | 38 |
| loss | 0.114999 | 0.198011 |
| x | 0.003669 | 0.003421 |
| y | 0.007138 | 0.015607 |
| w | 0.001676 | 0.001374 |
| h | 0.003273 | 0.002913 |
| im | 0.004667 | 0.004586 |
| re | 0.002661 | 0.002963 |
| conf | 0.091880 | 0.166978 |
| cls | 0.000035 | 0.000168 |
| cls_acc | 100.00% | 100.00% |
| recall50 | 0.933333 | 1.000000 |
| recall75 | 0.800000 | 0.933333 |
| precision | 0.466667 | 0.600000 |
| conf_obj | 0.962188 | 0.954067 |
| conf_noobj | 0.000444 | 0.000585 |
+------------+--------------+--------------+
Total loss 0.3130098879337311
---- ETA 0:06:21.502111
I am not able to fix your problem, as I can't reproduce it.
I think the error you are getting comes from a problem in the dataset. There is a script for checking the dataset. Did you run it?
python check_dataset.py
I ran python check_dataset.py. It displayed the following:
Load TRAIN samples from /media/chidanand/BE120667120624CD/KITTI/training
Done: total TRAIN samples 1414
I am still unable to train.
Does it also display the point cloud BEV image with the bounding boxes on it?
Yes, it also displays the bounding boxes on the bird's-eye view.
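A visual check can miss a single degenerate label, so a per-line sanity check may help narrow things down. The following is only a minimal sketch, not part of this repo: it assumes the standard KITTI label layout (15 space-separated fields, with the 3D dimensions h, w, l in fields 9 to 11), and the helper name check_kitti_label_line is hypothetical.

```python
import math


def check_kitti_label_line(line):
    """Return a list of problems found in one KITTI label line (empty if none).

    Assumes the standard KITTI layout: type, truncated, occluded, alpha,
    4 bbox values, 3 dimensions (h, w, l), 3 location values, rotation_y.
    """
    fields = line.split()
    if len(fields) < 15:
        return ["expected at least 15 fields, got %d" % len(fields)]

    try:
        values = [float(v) for v in fields[1:15]]
    except ValueError:
        return ["non-numeric value among the numeric fields"]

    problems = []
    if not all(math.isfinite(v) for v in values):
        problems.append("NaN or inf value")

    # DontCare entries carry placeholder dimensions, so skip the size check.
    if fields[0] != "DontCare":
        h, w, l = values[7], values[8], values[9]  # fields 9, 10, 11
        if h <= 0 or w <= 0 or l <= 0:
            problems.append("non-positive box dimension")
    return problems
```

Running this over every line of every label file and printing the file name for any non-empty result would point at labels that could produce a box with no area.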
Hi @chowkamlee81,
Check this one: #3. He made it work. Can you also check with the latest version of PyTorch?
Best, Deepak
I even tried PyTorch 1.1 and 1.2. The problem still persists.
The error originates at https://github.com/ghimiredhikura/Complex-YOLOv3/blob/master/models.py#L266, where x becomes NaN at i=82. The crash occurs because the tensors are getting corrupted. Kindly mention which version of PyTorch you are using so that I can retry with it.
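To confirm which layer first produces NaN, rather than where it is first noticed, the per-layer outputs can be scanned in order. This is a framework-agnostic sketch on nested Python lists; the names contains_nan and first_nan_layer are hypothetical and not part of this repo. With PyTorch tensors, the equivalent per-layer check would be torch.isnan(x).any().

```python
import math


def contains_nan(values):
    """Recursively check a nested list/tuple of numbers for NaN."""
    if isinstance(values, (list, tuple)):
        return any(contains_nan(v) for v in values)
    return isinstance(values, float) and math.isnan(values)


def first_nan_layer(layer_outputs):
    """Return the index of the first layer output containing NaN, or None."""
    for i, out in enumerate(layer_outputs):
        if contains_nan(out):
            return i
    return None
```

If the first NaN shows up earlier than the layer where the crash is reported, the cause is more likely upstream (for example the input or the loss of a previous step) than in that layer itself.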
Python 3.7.3, PyTorch 1.1.0
  File "/home/Tracking/Complex-YOLO-V3-master/models.py", line 190, in forward
    ignore_thres=self.ignore_thres,
  File "/home/Tracking/Complex-YOLO-V3-master/utils/utils.py", line 383, in build_targets
    rotated_iou_scores = rotated_box_11_iou_polygon(pred_boxes[b, best_n, gj, gi], target_boxes, nG)
  File "/home/Tracking/Complex-YOLO-V3-master/utils/utils.py", line 254, in rotated_box_11_iou_polygon
    iou = rotated_bbox_iou_polygon(bbox1, bbox2).squeeze()
  File "/home/Tracking/Complex-YOLO-V3-master/utils/utils.py", line 278, in rotated_bbox_iou_polygon
    return compute_iou(bbox1[0], bbox2)
  File "/home/Tracking/Complex-YOLO-V3-master/utils/utils.py", line 34, in compute_iou
    ious = box.intersection(b).area / box.union(b).area
  File "/home/.local/lib/python3.7/site-packages/shapely/geometry/base.py", line 542, in intersection
    return geom_factory(self.impl['intersection'](self, other))
  File "/home/.local/lib/python3.7/site-packages/shapely/topology.py", line 64, in __call__
    self._validate(this)
  File "/home/.local/lib/python3.7/site-packages/shapely/topology.py", line 18, in _validate
    raise ValueError("Null geometry supports no operations")
ValueError: Null geometry supports no operations
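The traceback ends inside the polygon intersection, which suggests that at least one box reaching compute_iou could not be turned into a valid geometry, plausibly because its coordinates were NaN. One way to keep training from crashing while the NaN source is tracked down is to validate boxes before the IoU call. This is only a sketch under assumptions: the (x, y, w, l, yaw) box layout and the names box_is_valid and safe_iou are hypothetical, and iou_fn stands in for whatever IoU routine is actually used.

```python
import math


def box_is_valid(box):
    """Check a box given as (x, y, w, l, yaw): all finite, positive size.

    The field layout is an assumption made for this sketch.
    """
    x, y, w, l, yaw = box
    return all(math.isfinite(v) for v in box) and w > 0 and l > 0


def safe_iou(box_a, box_b, iou_fn):
    """Call iou_fn only when both boxes are valid; otherwise return 0.0."""
    if not (box_is_valid(box_a) and box_is_valid(box_b)):
        return 0.0
    return iou_fn(box_a, box_b)
```

A guard like this only hides the symptom. If predicted boxes contain NaN, the network outputs or the loss are already corrupted, so the underlying cause still needs to be found.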