Can you give me an example of the detection format?

York1996OutLook commented 4 years ago

Found no car detections
Found no pedestrian detections
Found no cyclist detections
Found no van detections
Found no truck detections
Found no person_sitting detections
Found no tram detections

I used the same txt files as the GT, expecting 100%, but got 0.

cguindel commented 4 years ago

Sure, here you go. As you can see, I tend to use logarithmic scores, but that shouldn't matter.

I don't have any problem using the ground truth as detections; the AP is 100% except for Easy Person_sitting (that doesn't have enough samples for recall discretization).
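As a rough sketch of what one detection line looks like, assuming the standard KITTI object label layout (15 label fields followed by a score), the snippet below formats a single line. The `Detection` struct and `formatDetection` are hypothetical names used only for illustration, and the placeholder values for unused fields (-1, -10, -1000) are an assumed convention, not something confirmed in this thread.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical holder for one detection, in KITTI label field order.
struct Detection {
    std::string type;                           // e.g. "Car", "Pedestrian", "Cyclist"
    double truncated = -1;                      // assumed placeholder: unknown
    int occluded = -1;                          // assumed placeholder: unknown
    double alpha = -10;                         // assumed placeholder: unknown
    double x1 = 0, y1 = 0, x2 = 0, y2 = 0;      // 2D box: left, top, right, bottom (pixels)
    double h = -1, w = -1, l = -1;              // 3D size (assumed placeholder)
    double tx = -1000, ty = -1000, tz = -1000;  // 3D location (assumed placeholder)
    double rotation_y = -10;                    // assumed placeholder: unknown
    double score = 0;                           // confidence; higher means more confident
};

// Writes the 16 space-separated fields of one detection line.
std::string formatDetection(const Detection& d) {
    char buf[512];
    const int n = std::snprintf(
        buf, sizeof(buf),
        "%s %.2f %d %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.4f",
        d.type.c_str(), d.truncated, d.occluded, d.alpha,
        d.x1, d.y1, d.x2, d.y2,
        d.h, d.w, d.l,
        d.tx, d.ty, d.tz,
        d.rotation_y, d.score);
    assert(n > 0 && n < static_cast<int>(sizeof(buf)));
    return std::string(buf);
}
```

Each image gets its own txt file with one such line per detected object. Only the ordering of the scores matters, which is why logarithmic scores work as well as probabilities.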

York1996OutLook commented 4 years ago

root@dell-Precision-Tower-7910:/home/dell/YCC/eval_kitti# evaluate_object exp1 lists
Starting evaluation...
Results list: lists/lists.txt
mkdir: cannot create directory "results/exp1/plot": File exists
Getting valid images...
File loaded
Loading detections...

GT STATS

car : 2
pedestrian : 0
cyclist : 1
van : 0
truck : 1
person_sitting : 0
tram : 0

DET STATS

car : 2
pedestrian : 0
cyclist : 0
van : 0
truck : 0
person_sitting : 0
tram : 0
done.
Starting 2D evaluation (car) ...
Getting detection scores to compute thresholds
No GT samples found
car evaluation failed.
Something happened...

Do you know why?

cguindel commented 4 years ago

The evaluation script is not designed to work with only three GT instances, so you'll probably run into a lot of problems. It's also possible that the GT samples you are using do not meet the requirements of the Easy difficulty level (i.e., min. bounding box height: 40 px, max. occlusion level: fully visible, max. truncation: 15%). Have you tried it with a proper validation set?
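To check whether a given GT box would count at the Easy level, the three limits quoted above can be tested directly. This is a minimal sketch, not the script's own code: `GtBox` and `meetsEasy` are hypothetical names, and treating a box of exactly 40 px as valid is an assumption about the boundary.

```cpp
#include <cassert>

// Hypothetical container for the label fields the difficulty test needs.
struct GtBox {
    double y1, y2;      // top and bottom of the 2D box, in pixels
    int occlusion;      // 0 = fully visible, higher = more occluded
    double truncation;  // fraction of the object outside the image, 0.0 to 1.0
};

// Easy-level limits as quoted in the comment above.
const double kEasyMinHeight = 40.0;
const int kEasyMaxOcclusion = 0;
const double kEasyMaxTruncation = 0.15;

// Returns true if the box satisfies all three Easy-level limits.
bool meetsEasy(const GtBox& b) {
    assert(b.y2 >= b.y1);  // a valid box has its bottom below its top
    const double height = b.y2 - b.y1;
    return height >= kEasyMinHeight &&      // boundary handling is an assumption
           b.occlusion <= kEasyMaxOcclusion &&
           b.truncation <= kEasyMaxTruncation;
}
```

Running something like this over the GT labels shows how many instances per class are actually available for the Easy evaluation.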

martinrebane commented 4 years ago

@cguindel Thanks for your work! Two quick questions.

You mentioned that this script requires "enough samples for recall discretization". What do you mean by that, and how do I determine what is enough?

I also looked at the code, and this line, if (thresholds.size()-1 > 100){, will cause an unsigned integer wraparound (and hence the check firing) when thresholds.size() is 0. Is this the desired behaviour? Just curious, because the official script does not have that check and goes on to compute the precision-recall matrix when using exactly the same data (I am indeed using just a handful of examples here for testing).

cguindel commented 4 years ago

By default, 41 recall discretization steps are used in the script, so it would make sense to have at least 41 object instances for each category. I guess it's still possible to compute the curves with fewer samples, but I can't think of a use case where that would be desirable; in general, the number of frames in the validation set should be in the thousands to obtain conclusive results.
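The link between sample count and recall steps can be illustrated with a simplified sketch of threshold selection. It is modeled on the general approach of picking one score threshold per recall step and is not a copy of the script's code; `pickThresholds` is a hypothetical name, and `scores` is assumed to hold the scores of detections matched to GT objects.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

const int N_SAMPLE_PTS = 41;  // number of recall steps, as stated above

// Picks at most one score threshold per recall step.
// scores: scores of matched detections; n_groundtruth: number of valid GT objects.
std::vector<double> pickThresholds(std::vector<double> scores, double n_groundtruth) {
    assert(n_groundtruth > 0);
    std::sort(scores.begin(), scores.end(), std::greater<double>());

    std::vector<double> thresholds;
    double current_recall = 0.0;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        const bool is_last = (i + 1 == scores.size());
        // Recall reached when keeping scores[0..i], and when also keeping the next one.
        const double l_recall = (i + 1) / n_groundtruth;
        const double r_recall = is_last ? l_recall : (i + 2) / n_groundtruth;
        // Skip this score if the next one lands closer to the target recall.
        if (!is_last && (r_recall - current_recall) < (current_recall - l_recall)) {
            continue;
        }
        thresholds.push_back(scores[i]);
        current_recall += 1.0 / (N_SAMPLE_PTS - 1.0);
    }
    return thresholds;
}
```

With three matched detections and three GT objects this yields only three thresholds, so most of the 41 recall steps have nothing to sample; with no matched detections the result is empty.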

On the other hand, the if (thresholds.size()-1 > 100) condition is a rather inelegant assertion that was put there to catch situations where the thresholds were not computed correctly, which could lead to uncontrolled errors later. The value of thresholds.size() shouldn't be 0 in any case, because that means the script couldn't discretize the recall values properly, so that is indeed the desired behavior.

I think it is entirely possible to replace that 100 with a value relative to N_SAMPLE_PTS and to include the condition for negative values explicitly. Still, I would first like to test some extreme cases that could make it fail.
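A minimal sketch of that idea follows, assuming the upper bound should be N_SAMPLE_PTS itself (the exact bound is not settled in this thread). `originalCheckFires` reproduces the quoted condition to show the wraparound, and `thresholdsLookValid` is a hypothetical replacement with the empty case handled explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int N_SAMPLE_PTS = 41;  // number of recall steps

// The quoted condition: size() is unsigned, so for an empty vector
// size() - 1 wraps around to a huge value and the check fires.
bool originalCheckFires(const std::vector<double>& thresholds) {
    return thresholds.size() - 1 > 100;
}

// Hypothetical replacement: reject the empty case explicitly and bound the
// size by N_SAMPLE_PTS (assumed bound, not confirmed by the author).
bool thresholdsLookValid(const std::vector<double>& thresholds) {
    assert(N_SAMPLE_PTS > 0);  // keeps the cast below well defined
    if (thresholds.empty()) {
        return false;
    }
    return thresholds.size() <= static_cast<std::size_t>(N_SAMPLE_PTS);
}
```

The behavior for an empty vector stays the same (evaluation is rejected), but it no longer depends on unsigned arithmetic, and the bound tracks N_SAMPLE_PTS if that constant changes.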

hoangduyloc commented 3 years ago

Found no car detections
Found no pedestrian detections
Found no cyclist detections
Found no van detections
Found no truck detections
Found no person_sitting detections
Found no tram detections

I used the same txt files as the GT, expecting 100%, but got 0.

Hello @York1996OutLook, it seems I have the same problem as you: the GT STATS are correct, but the DET STATS are all "0". Could you show me how you solved the problem? Thank you.

cguindel / eval_kitti