cguindel / eval_kitti

Tools to evaluate object detection results using the KITTI dataset.

stopped with segmentation fault #8

Closed jkstyle2 closed 4 years ago

jkstyle2 commented 5 years ago

First, thanks for your amazing work! While it's really helpful, I really need your help. Here is a summary of what I'm currently seeing:


```
GT STATS
car: 1312   pedestrian: 287   cyclist: 99   van: 139   truck: 35   person_sitting: 4   tram: 11

DET STATS
car: 1186   pedestrian: 200   cyclist: 70   van: 136   truck: 33   person_sitting: 5   tram: 11

done.
Starting 2D evaluation (car) ...
Getting detection scores to compute thresholds
Computing statistics
Getting detection scores to compute thresholds
Computing statistics
Getting detection scores to compute thresholds
Computing statistics
save results/leaderboard/plot/car_detection.txt
sh: 1: gnuplot: not found
sh: 1: gnuplot: not found
Error: /undefinedfilename in (car_detection.eps)
Operand stack:
Execution stack:
   %interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push
   --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push
Dictionary stack:
   --dict:961/1684(ro)(G)-- --dict:0/20(G)-- --dict:78/200(L)--
Current allocation mode is local
Last OS error: No such file or directory
GPL Ghostscript 9.26: Unrecoverable error, exit code 1
sh: 1: pdfcrop: not found
save results/leaderboard/plot/car_orientation.txt
sh: 1: gnuplot: not found
sh: 1: gnuplot: not found
Error: /undefinedfilename in (car_orientation.eps)
Operand stack:
...
Execution stack:
   %interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push
   --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push
Dictionary stack:
   --dict:961/1684(ro)(G)-- --dict:0/20(G)-- --dict:78/200(L)--
Current allocation mode is local
Last OS error: No such file or directory
GPL Ghostscript 9.26: Unrecoverable error, exit code 1
sh: 1: pdfcrop: not found
done.
Starting bird's eye evaluation (pedestrian) ... Getting detection scores to compute thresholds
Segmentation fault (core dumped)
```

When it starts evaluating bird's eye view detection, a critical segmentation fault occurs and everything stops. The same thing has been happening in other similar projects, but I can't figure out what the real issue is. It would be very helpful if you could give me some hints. Thanks in advance.

Apart from this issue, I have one more question: how should the 16th value in the label, the score/confidence, be decided? This time I just put in a random number, but I have no idea how to calculate the value. Does it come from a 2D detector, which usually outputs a confidence probability via softmax classification?

Again, thanks for your work, and I'm looking forward to your reply!

cguindel commented 5 years ago

Hi @jkstyle2,

I am addressing this issue here, together with #3. The problem you have come across is usually caused by a malformed detection file in which some of the fields do not make sense. In this case, it may be related to the score field.

Following the discussion in #3, the 16th value (i.e., the score) must be a confidence value for the detection; you can compute it however you want (it depends on your detection method), but higher values must mean higher confidence in the detection, and the scale should be roughly linear. How you obtain the score is a decision you must make: you can use the classification score from the softmax alone (as is usually done), or you can combine it somehow with the bounding-box regression scores, for instance. A sketch of the expected output format is below.
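For illustration, this is a minimal sketch (not code from this repo) of how one detection line could be written, with the detector's softmax probability as the 16th field; the field layout follows the KITTI label spec, and all inputs are assumed to come from your own pipeline:

```python
# Minimal sketch: write one KITTI-format detection line, using the
# detector's softmax probability as the 16th (score) field.

def kitti_detection_line(cls_name, alpha, box, dims, loc, rot_y, score):
    """box = (left, top, right, bottom) in pixels;
    dims = (height, width, length) in meters;
    loc = (x, y, z) in camera coordinates; alpha, rot_y in radians."""
    left, top, right, bottom = box
    h, w, l = dims
    x, y, z = loc
    # Field order: type, truncated, occluded, alpha, bbox (4), dims (3),
    # location (3), rotation_y, score. Truncation/occlusion may be -1 for
    # detections, since the evaluation ignores them.
    return (f"{cls_name} -1 -1 {alpha:.2f} "
            f"{left:.2f} {top:.2f} {right:.2f} {bottom:.2f} "
            f"{h:.2f} {w:.2f} {l:.2f} {x:.2f} {y:.2f} {z:.2f} "
            f"{rot_y:.2f} {score:.4f}")

# Example: one car detection whose softmax confidence is 0.87
# (the numbers are made up for illustration).
print(kitti_detection_line("Car", -0.20, (712.40, 143.00, 810.73, 307.92),
                           (1.52, 1.65, 4.20), (1.84, 1.47, 8.41),
                           0.01, 0.87))
```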

If you assign random values to the detections, or simply use 1 for every detection, you won't be getting meaningful results from this script since you won't get a proper precision-recall curve and, therefore, it won't make sense to compute the average precision (AP) or the average orientation similarity (AOS), which are the measures used here for evaluation.
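For intuition, here is a simplified sketch of how AP follows from the score ranking; it is my own simplification, not the devkit's exact code, and the number of sampled recall points is an assumption for illustration:

```python
# Simplified sketch: AP as the mean of the best precision attainable at a
# fixed set of sampled recall levels, derived from the score ranking.
import numpy as np

def average_precision(scores, is_tp, n_gt, n_points=41):
    """scores: confidence per detection; is_tp: whether each detection
    matches a ground-truth box; n_gt: number of ground-truth objects."""
    order = np.argsort(-np.asarray(scores, dtype=float))  # descending score
    tp = np.cumsum(np.asarray(is_tp, dtype=bool)[order])
    fp = np.cumsum(~np.asarray(is_tp, dtype=bool)[order])
    recall = tp / n_gt
    precision = tp / (tp + fp)
    # Best precision reachable at each sampled recall level, then averaged.
    return np.mean([precision[recall >= r].max() if (recall >= r).any()
                    else 0.0
                    for r in np.linspace(0.0, 1.0, n_points)])
```

If every detection carries the same score, the ranking above is arbitrary and the curve degenerates to a single operating point, which is why random or constant scores make the AP meaningless.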

Probably, if you fix the score, the segmentation fault error will be solved. Check also if the 3D dimensions/locations in the detection file make sense; that is, dimensions are greater than 0, locations are points in front of the vehicle, and so on. That is usually another source of error.
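If it helps, a quick sanity check along these lines can catch most malformed files; this is a hypothetical helper, with the field constraints taken from the KITTI label spec:

```python
# Hypothetical sanity check for KITTI-style detection files.
def check_detection_file(path):
    with open(path) as f:
        for i, line in enumerate(f, 1):
            fields = line.split()
            if len(fields) != 16:
                print(f"{path}:{i}: expected 16 fields, got {len(fields)}")
                continue
            h, w, l = map(float, fields[8:11])   # 3D dimensions (m)
            x, y, z = map(float, fields[11:14])  # location, camera coords
            if min(h, w, l) <= 0:
                print(f"{path}:{i}: non-positive dimensions {h} {w} {l}")
            if z <= 0:
                print(f"{path}:{i}: location behind the camera (z={z})")
```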

Try to work on the confidence value and let me know if that fixes the error.

jkstyle2 commented 5 years ago

Thanks for the quick and detailed comment! You're very helpful :)

As you suggested, I filled the 16th value with the 2D detector's classification score, but unfortunately it didn't solve the issue. I also tried a fixed value, but the result was the same. Of course, I checked that the rest of the fields are valid, and none of them contains an invalid value. As another attempt, I investigated the original code with the gdb debugger, and I noticed that when I comment out the lines that evaluate the bird's eye view, everything runs without the segmentation fault. Since the bird's-eye-view metric was not my interest, I just skipped it and obtained every other metric from the code.

Now, though, I've got some doubtful results on a couple of points. First, as you can see in the 2D detection result below, the 'easy' precision is higher than moderate/hard, which is not what one would generally expect. The same happens whenever I try other prediction results. I also wonder why the 'easy' curve doesn't drop as recall approaches 1. Moreover, none of the curves starts at precision 1 when recall = 0, and they all look very linear.

*[figure: car_detection precision-recall plot]*

Secondly, I replaced the predicted 2D box, 3D dimensions, and orientation with the ground-truth values to check whether that gives a much better result, but it actually decreased the precision, as shown below. As far as I understand, these precision values should be 1, because there should be no error at all.

*[figure: car_detection precision-recall plot]*

Sorry for the many questions, but I hope you can help me. Please share your opinion and correct me if I'm mistaken about something.

cguindel commented 5 years ago

Sorry for the delay. Let's go step by step:

1) I concede that it is not very usual to get better precision results for the Moderate/Hard levels than for Easy, but I think it is still possible, especially if you are using a small set of frames, which seems to be the case (e.g., only 1312 ground-truth cars). The reasoning is as follows: each difficulty level contains all the samples from the previous level plus some new ones (which are supposed to be more challenging). If your algorithm predicts all the "new" samples correctly, you may get the behavior you are experiencing. As I said, it is not very usual, but it can happen with a small set of samples.

On the other hand, precision doesn't have to reach 1 in all cases (if you have some false positives for every threshold, i.e., false positives with a high score), nor does the curve necessarily have to drop when recall goes to 1 (if your method reaches recall = 1 for some threshold, that is, zero false negatives). To sum up, I think every phenomenon you are observing might be due to the size of the selected validation dataset, which is not representative enough.

2) I don't think I can help you here without more information from your side. If I understood you correctly, you are replacing the bounding-box coordinates of the detections with the real ground-truth values, but I don't know how you are assigning the ground-truth values to the detections (by proximity? by IoU value?). I can assure you that if you use the ground-truth files as a detection result (giving each sample a score), you will get what you were expecting, as the plot and the sketch after it show:

*[figure: car_detection precision-recall plot]*
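For reference, this is what I mean by using the ground-truth files as a detection result; a hypothetical sketch (the file handling and the synthetic score scheme are mine, not part of eval_kitti):

```python
# Hypothetical helper: turn a KITTI ground-truth label file into a
# "perfect" detection file by appending a synthetic, strictly
# decreasing score to each object.
def gt_to_detections(gt_path, det_path):
    with open(gt_path) as f:
        objects = [l.split() for l in f
                   if l.strip() and not l.startswith("DontCare")]
    with open(det_path, "w") as f:
        for i, fields in enumerate(objects):
            # GT lines have 15 fields; the detection format adds a score.
            score = 1.0 - i / max(len(objects), 1)  # synthetic, in (0, 1]
            f.write(" ".join(fields + [f"{score:.4f}"]) + "\n")
```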

cguindel commented 4 years ago

Closing due to inactivity (it also does not look like a code problem).