GIScience / openrouteservice

🌍 The open source route planner api with plenty of features.
https://openrouteservice.org
GNU General Public License v3.0
1.43k stars 390 forks source link

Parse non-integer (speed) values correctly #1233

Open koebi opened 1 year ago

koebi commented 1 year ago

This is related to this thread in the forum.

Here's what I did

According to the forum thread, I changed the maxspeed of the way in question to 10. Then, I ran ors and successfully retrieved a route with a duration according to the changed maxspeed.

I then retried with a maxspeed of 10.1


Here's what I got

I got the default duration according to the default maxspeed on the way in question..


Here's what I was expecting

Correct parsing of the maxspeed


Here's what I think could be improved

The method in question is this one (or the respective method in the 4.0 upgrade branch).

Changing Integer.parseInt() to Double.parseDouble() (or sth) would probably fix this. This will be delayed until the currently ongoing work is done.

This might not be restricted to only speed values, other tags and values might be affected by this.

aoles commented 1 year ago

I'm not sure whether addressing this should be our focus. To me it sounds like a minor data problem. According to OSM wiki:

The maxspeed=* tag is used on ways to define the maximum legal speed limit for general traffic on a particular road.

As such it is rather uncommon for the maxspeed tag to take non-integer numeric values even though it technically can. A query to taginfo seems to confirm this. Globally out of ca. 16m ways tagged with maxspeed roughly 300 of them take non-integer numeric values which is only 0.002%.

library(jsonlite)

## total number of ways tagged with maxspeed
url <- "https://taginfo.openstreetmap.org/api/4/key/values?key=maxspeed&filter=ways&lang=en&sortname=count&sortorder=desc&format=json_pretty"
res <- fromJSON(url)
n <- sum(res$data$count)
n
#> [1] 15860267

## maxspeed values which contain "."
url <- "https://taginfo.openstreetmap.org/api/4/key/values?key=maxspeed&filter=all&lang=en&sortname=count&sortorder=desc&query=.&qtype=&format=json_pretty"
res <- fromJSON(url)
data <- res$data
valid <- grepl("[0-9]\\.[0-9]", data$value)
real_numbers <- sum(data$count[valid])
real_numbers
#> [1] 307

# percentage
signif(real_numbers / n * 100, 1)
#> [1] 0.002

Created on 2023-01-18 with reprex v2.0.2

koebi commented 1 year ago

I don't think that that ever came up with OSM data directly. I can, however, easily see this coming up in custom settings such as the one in the referenced forum post, mainly for mph vs kmh reasons.

I do agree that this is not of high priority, and it might go away with the update?

aoles commented 1 year ago

I don't think that that ever came up with OSM data directly. I can, however, easily see this coming up in custom settings such as the one in the referenced forum post, mainly for mph vs kmh reasons.

While I understand your concerns, I tend to disagree. If this is actually a unit conversion issue then the numeric values should probably be entered in their native units which can be specified in the value as well. As long as the numeric value represents a legal speed limit then this shouldn't be an issue at all. I'm not sure what the example in the forum was trying to achieve, but I suspect this has been more of a hack and possibly even misuse of the purpose of the OSM maxspeed tag. I'm skeptical whether the ORS team should invest their resources and increase maintenance burden to accommodate such solutions.

I do agree that this is not of high priority, and it might go away with the update?

It won't as parseSpeed methods in GH 4.0 still relies on Integer.parseInt.

koebi commented 1 year ago

To be fair, while the issue never came up with OSM data directly, there are places with non-integer speed limits. I cannot post pictures on gh for some reason, but I found three examples via google streetview, all tagged in osm…

…and all in the US :DD

aoles commented 1 year ago

Fair enough. To be honest I didn't really look into individual cases as I assumed them to be OSM data quality-problems rather than true values reflecting actual sign-posted limits. I hope you didn't mean these ones though: 🤣

image