Closed: cdondrup closed this 9 years ago
Good Job! I've always wondered why this is not a thing.
Oh, I thought it was a PR. I should not write on github after all-nighters =)
Also, this is a standard package; I just didn't know it existed or what it does until @marinaKollmitz from the lamor school told me. But thank you for the praise ;)
Now this is funny! Without being aware of your commits, I implemented almost exactly the same mechanism in SPENCER 9 days ago. So you beat us by one day ;)... we are filtering on the detection level, though, as opposed to the measurement level... which allows us, in theory, to e.g. also filter detections from the HOG detector against the static map. But your solution is computationally more efficient. I also didn't know about the existence of the laser_filters package.
In a narrow indoor lab environment with tables, chairs etc. causing a lot of false positives, the filtering step increased our MOTA from -1.25 to +0.6. My impression is that none of the 2D laser detectors generalize well. And if you re-train them on your environment, you essentially perform a background subtraction... for which you could also use your existing static map in the first place.
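The detection-level variant described above can be sketched roughly as follows: drop any detection whose neighborhood in the static occupancy grid contains an obstacle. This is a minimal, dependency-free illustration, assuming a plain 2D grid (1 = occupied) and detections already given in map coordinates; the function name, grid representation, and `radius` parameter are illustrative and not the actual SPENCER code.

```python
import math

def filter_detections(grid, resolution, origin_xy, detections, radius=0.3):
    """Drop detections whose map neighborhood contains a static obstacle.

    grid:       2D list of rows, 1 = occupied, 0 = free
    resolution: cell size in meters
    origin_xy:  map-frame coordinates of cell (row 0, col 0)
    detections: list of (x, y) detection centroids in the map frame
    radius:     neighborhood radius in meters to check around each detection
    """
    ox, oy = origin_xy
    cells = int(math.ceil(radius / resolution))  # neighborhood half-width in cells
    kept = []
    for (dx, dy) in detections:
        col = int((dx - ox) / resolution)
        row = int((dy - oy) / resolution)
        occupied = False
        # Scan the square neighborhood around the detection's cell.
        for r in range(row - cells, row + cells + 1):
            for c in range(col - cells, col + cells + 1):
                if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 1:
                    occupied = True
        if not occupied:
            kept.append((dx, dy))  # keep: not explained by the static map
    return kept
```

One upside of filtering at this level is that it works for any detector that outputs map-frame positions (e.g. a HOG detector), at the cost of running the full detector on measurements the map could have ruled out earlier.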
@tlind we should talk more ;-)
I think one should do this on one's training data before using it for training. Do you know if the standard leg detectors did that?
Otherwise, I agree with your last point, but didn't manage to fully convince @Pandoro :-)
@marc-hanheide @cdondrup Yes indeed! Sadly I didn't make it to the summer school in Lincoln due to scheduling constraints, but maybe we'll see each other at IROS?
@lucasb-eyer The laser detectors are usually trained on manually annotated groundtruth data, where the annotation labels are "foreground" (person) and "background". So there is no need to have an occupancy grid map for training if the annotations are of good quality, as both essentially serve the same purpose.
One problem with a laser at waist height such as in SPENCER, where you often do not perceive two individual leg segments, is that it can be really hard to discriminate between classes if the environment is diverse enough. Even as a human, I recently thought e.g. a trash bin was a person in 2D laser data until it entered the camera's FOV. Due to the sparsity of the data + large pose variation + people walking with luggage etc., the classes are just not fully separable. Especially if you take partial occlusions into account, and you only see half of the person's (or trash bin's) laser echoes.
I see, and the detector-internal "tracking" part helps work around this a bit, but for standing people I guess it'll be a hard problem. Anyways, thanks for the details!
This filters out the laser measurements that correspond to obstacles in the static map and then passes only the remainder to the leg_detector, removing a lot of the static false positives like tables.
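The measurement-level idea can be sketched as follows: project each beam endpoint into the map frame, look up the corresponding occupancy-grid cell, and invalidate the range if the cell is statically occupied. This is a minimal, dependency-free sketch of the concept, assuming a plain 2D grid (1 = occupied) and a known laser pose in the map frame; it mirrors the mechanism, not the actual laser_filters plugin API.

```python
import math

def filter_scan(grid, resolution, origin_xy, laser_pose, scan):
    """Replace ranges whose endpoints fall in occupied map cells with NaN.

    grid:       2D list of rows, 1 = occupied, 0 = free
    resolution: cell size in meters
    origin_xy:  map-frame coordinates of cell (row 0, col 0)
    laser_pose: (x, y, theta) of the laser in the map frame
    scan:       list of (beam_angle, range) measurements
    """
    lx, ly, ltheta = laser_pose
    ox, oy = origin_xy
    filtered = []
    for angle, rng in scan:
        # Project the beam endpoint into map coordinates.
        ex = lx + rng * math.cos(ltheta + angle)
        ey = ly + rng * math.sin(ltheta + angle)
        # Convert to grid indices.
        col = int((ex - ox) / resolution)
        row = int((ey - oy) / resolution)
        inside = 0 <= row < len(grid) and 0 <= col < len(grid[0])
        if inside and grid[row][col] == 1:
            filtered.append((angle, float('nan')))  # explained by the static map
        else:
            filtered.append((angle, rng))  # keep: potential dynamic object
    return filtered
```

Only the surviving measurements would then be handed to the leg_detector, so static structure like tables and walls never reaches the classifier in the first place.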