graphhopper / map-matching

map matching 'quality' test suite #89

Open · kodonnell opened this issue 7 years ago

kodonnell commented 7 years ago

As discussed here, it'd be good to have a way of testing how 'good' our map matching is by comparing it with known results. (I.e. our map matching on a GPX track gives a certain route; how closely does that align with the actual route taken?)

Considerations/comments (to direct a PR):

karussell commented 7 years ago

where do we get data of known routes?

There are many taxi tracks from NYC (http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml), and there are many on OSM itself (http://wiki.openstreetmap.org/wiki/Planet.gpx) or on GPSies.

karussell commented 7 years ago

The performance test suite could also be used for this, e.g. counting the errors and differences of the random routes, or using different randomness, etc.

stefanholder commented 7 years ago

how do we compare with the known route? Compare total distances of each? Or the proportion of the distance on which they share common edges?

We should check to what degree the matched route goes through the same sequence of edges/nodes as the expected route. To this end we could compute the Levenshtein distance as a similarity measure between the expected and the actual node/edge sequence.
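
For illustration, a minimal sketch of a plain Levenshtein distance over integer edge IDs (the class name and the use of bare edge IDs are assumptions for this example, not existing code):

```java
import java.util.List;

/**
 * Sketch: Levenshtein distance between the expected and the matched
 * sequence of edge IDs. Lower is better; 0 means identical sequences.
 */
public class SequenceSimilarity {

    public static int levenshtein(List<Integer> expected, List<Integer> actual) {
        int m = expected.size(), n = actual.size();
        int[][] d = new int[m + 1][n + 1];
        for (int i = 0; i <= m; i++) d[i][0] = i; // delete everything from expected
        for (int j = 0; j <= n; j++) d[0][j] = j; // insert everything from actual
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                int subst = expected.get(i - 1).equals(actual.get(j - 1)) ? 0 : 1;
                d[i][j] = Math.min(Math.min(
                        d[i - 1][j] + 1,          // deletion
                        d[i][j - 1] + 1),         // insertion
                        d[i - 1][j - 1] + subst); // substitution (0 if edges match)
            }
        }
        return d[m][n];
    }
}
```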

stefanholder commented 7 years ago

To this end we could compute the Levenshtein distance

Newson & Krumm suggest a different metric in Chapter 5, which is simpler but probably still sufficient: map matching error = (d_minus + d_plus) / d0

With this definition it is possible to get an error greater than 1, but I don't think that this is a real problem.

Another useful variant would be to use the number of road segments for d_minus, d_plus and d0 instead of their lengths.
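
A sketch of how this metric could be computed once the three quantities are known (all names here are made up for this example; the same formula covers the segment-count variant):

```java
/**
 * Sketch of the Newson & Krumm error metric:
 *   dMinus - length of the correct route erroneously subtracted by the match
 *   dPlus  - length erroneously added by the match
 *   d0     - total length of the correct route
 * Pass segment counts instead of lengths for the count-based variant.
 */
public class MapMatchingError {
    public static double newsonKrumm(double dMinus, double dPlus, double d0) {
        // can exceed 1.0 for very poor matches, as noted above
        return (dMinus + dPlus) / d0;
    }
}
```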

kodonnell commented 7 years ago

Agreed that, assuming 'normal' behaviour, it is probably fine. If it's not 'normal' - e.g. non-sequential - then this could be a very poor metric. A Levenshtein approach may handle these cases better.

That said, we may need to write something similar to, but not quite, Levenshtein. For example, transpositions are allowed (in the Damerau-Levenshtein variant) with the same cost as adding/removing. This makes sense in e.g. spelling correction, but probably not in routes - edges have the additional constraint of connectivity (which words largely don't, I think). So we'd either need to not consider transpositions at all, or treat them differently. Something similar applies to substituting edges. So it's possible that our Levenshtein could end up only really considering additions/subtractions (in the 'normal' case), which is equivalent to @karussell's suggestion.

As an aside - if we do go with Levenshtein, we should weight by edge length, not just count edges.
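
Combining this with the previous point, an insert/delete-only edit distance weighted by edge length can be computed via a max-weight common subsequence DP. A minimal sketch (the Edge class and all names are made up for this example, not GraphHopper types):

```java
import java.util.List;

/**
 * Sketch: Levenshtein variant restricted to additions/removals
 * (no substitutions or transpositions), weighted by edge length.
 */
public class WeightedEditDistance {

    public static class Edge {
        final int id;
        final double lengthMeters;
        public Edge(int id, double lengthMeters) {
            this.id = id;
            this.lengthMeters = lengthMeters;
        }
    }

    /**
     * Minimal total length of edges removed from 'expected' plus edges
     * added to obtain 'actual'. Computed as
     * total(expected) + total(actual) - 2 * (max-weight common subsequence).
     */
    public static double cost(List<Edge> expected, List<Edge> actual) {
        int m = expected.size(), n = actual.size();
        double[][] keep = new double[m + 1][n + 1]; // max length of edges kept in both
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (expected.get(i - 1).id == actual.get(j - 1).id)
                    keep[i][j] = keep[i - 1][j - 1] + expected.get(i - 1).lengthMeters;
                else
                    keep[i][j] = Math.max(keep[i - 1][j], keep[i][j - 1]);
            }
        }
        double totalExpected = expected.stream().mapToDouble(e -> e.lengthMeters).sum();
        double totalActual = actual.stream().mapToDouble(e -> e.lengthMeters).sum();
        return totalExpected + totalActual - 2 * keep[m][n];
    }
}
```

In the 'normal' case the removed and added lengths here play roughly the roles of d_minus and d_plus above, which is the equivalence just noted.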

Another thought - what about the area between the curves? We'd probably want time as a dimension (so the area includes differences in time), meaning that even being at the right point isn't necessarily good unless you're also there at the same time. (Actually, the latter is something we'd need to consider in the above approaches too.) In terms of calculating the area, we can probably come up with a suitable proxy similar to numerical integration - e.g. if both the route and the ground truth had 100 nodes, we'd just sum up the difference (including time) of each node pair.
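
A minimal sketch of that proxy, assuming both tracks have already been resampled to the same number of (lat, lon, time) points and that a time difference can be converted into a distance penalty via a speed factor (both assumptions for this example):

```java
/**
 * Sketch: numerical-integration proxy for the "area between the curves",
 * including time. Each point is {latDeg, lonDeg, timeSeconds}; both
 * arrays must have the same length, e.g. after resampling to 100 points.
 */
public class TrackAreaProxy {

    public static double error(double[][] route, double[][] truth, double metersPerSecond) {
        double sum = 0;
        for (int i = 0; i < route.length; i++) {
            double spatial = haversineMeters(route[i][0], route[i][1], truth[i][0], truth[i][1]);
            // time difference converted to a distance penalty
            double temporal = Math.abs(route[i][2] - truth[i][2]) * metersPerSecond;
            sum += spatial + temporal;
        }
        return sum / route.length; // mean deviation per sampled point pair
    }

    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double r = 6371000; // mean earth radius in meters
        double dLat = Math.toRadians(lat2 - lat1), dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * r * Math.asin(Math.sqrt(a));
    }
}
```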