osmlab / osmlint

An open source suite of js validators for OpenStreetMap data, to identify common geometry and metadata problems at scale.
ISC License

Validator for punctuation characters #234

Open daniel-j-h opened 7 years ago

daniel-j-h commented 7 years ago

Can we write a validator for punctuation characters in tags? I don't know how high the false positive rate would be, but how often do punctuation characters such as

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

occur?

If we can't do this for the name tag on all ways, can we at least check a subset of ways (e.g. highways, ramps, etc.) and then check specific tags such as name, ref, and destination: basically all the tags used to generate guidance instructions?

What I'm mostly concerned about is routing engines announcing weird punctuation characters in guidance instructions to the user.
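A minimal sketch of the check proposed above, in plain JavaScript. It assumes each way arrives as a GeoJSON-like feature whose tags sit in `properties`. The function name `findPunctuation` and the tag list are hypothetical, not osmlint's actual API, and treating every character in the list as suspicious is an assumption that will produce false positives (for example the period in "Main St.").

```javascript
// Punctuation characters listed in this issue. Flagging all of them is an
// assumption; some are legitimate in street names.
const PUNCTUATION = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~';

// Tags mentioned in the issue as inputs to guidance instructions.
const GUIDANCE_TAGS = ['name', 'ref', 'destination'];

// Hypothetical helper: returns one entry per guidance tag that contains
// at least one punctuation character. Only features with a highway tag
// are inspected; everything else yields an empty array.
function findPunctuation(feature, tags = GUIDANCE_TAGS, punctuation = PUNCTUATION) {
  const props = feature.properties || {};
  if (!props.highway) return [];

  const flagged = new Set(punctuation);
  const hits = [];
  for (const tag of tags) {
    const value = props[tag];
    if (typeof value !== 'string') continue;
    // Collect each offending character once, in order of first appearance.
    const found = [...new Set([...value].filter((ch) => flagged.has(ch)))];
    if (found.length > 0) hits.push({ tag, value, characters: found });
  }
  return hits;
}
```

Called on a feature with `highway=residential` and `name=Main St!`, this returns a single entry for `name` listing `!`. A feature without a `highway` tag is skipped entirely.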

cc @srividyacb

1ec5 commented 7 years ago

Note that some of these characters are expected in street names:

A reasonably sophisticated text-to-speech engine can handle these punctuation characters well.

If we can't do this for the name tag on all ways, can we at least check a subset of ways (e.g. highways, ramps, etc.) and then check specific tags such as name, ref, and destination: basically all the tags used to generate guidance instructions?

It makes sense to limit this check to street names. Business names can have a wider variety of punctuation (@ $ !).
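One way to account for characters that are expected in street names is an allowlist subtracted from the flagged set. The sketch below is only an illustration: the helper name `unexpectedPunctuation` is hypothetical, and the allowlist is left as a parameter because the specific characters that are expected are not listed in this thread.

```javascript
// Punctuation characters listed in the original issue.
const PUNCTUATION = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~';

// Hypothetical helper: returns the distinct punctuation characters in
// `value` that are NOT in the caller-supplied allowlist.
function unexpectedPunctuation(value, allowed = '') {
  const allowedSet = new Set(allowed);
  const flagged = new Set(PUNCTUATION);
  return [...new Set(
    [...value].filter((ch) => flagged.has(ch) && !allowedSet.has(ch))
  )];
}

// Example allowlist, chosen purely for illustration (an assumption).
const exampleAllowed = "'.";
console.log(unexpectedPunctuation("O'Brien St.", exampleAllowed));
console.log(unexpectedPunctuation('Main St @ 5th!', exampleAllowed));
```

Keeping the allowlist as an argument lets a street-name check and a stricter `ref` check share the same function with different tolerances.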

dannykath commented 7 years ago

@Rub21 https://github.com/osmlab/osmlint/pull/256

Rub21 commented 7 years ago

Thank you @dannykath. I merged the branch into master, and I am now working on releasing the detections to to-fix.

Rub21 commented 7 years ago

to-fix: task https://osmlab.github.io/to-fix/#/task/punctuationcharactersinhighways

Rub21 commented 7 years ago

While working through those issues, I found that some Telenav users are also fixing them on highways, e.g. https://osmlab.github.io/osm-deep-history/#/way/206914585 , https://osmlab.github.io/osm-deep-history/#/way/8941557

@daniel-j-h, I found some issues for which you could suggest fixes.

1ec5 commented 6 years ago

As mappers attempt to map both destination:ref and destination:ref:lanes simultaneously, occasionally destination:ref has wound up with | instead of ; to delimit each destination (example). Can this validator catch those issues?
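A check for this case could look roughly like the following sketch. It assumes a GeoJSON-like feature with tags in `properties`. The function name `findPipeInDestinationRef` is hypothetical, and the suggested replacement is only a candidate for a mapper to review, not an automatic fix.

```javascript
// Hypothetical helper: flags destination:ref values that contain "|",
// the delimiter that belongs in destination:ref:lanes, and proposes a
// ";"-delimited candidate value for manual review.
function findPipeInDestinationRef(feature) {
  const props = feature.properties || {};
  const value = props['destination:ref'];
  if (typeof value !== 'string' || !value.includes('|')) return null;

  const suggestion = value
    .split('|')
    .map((part) => part.trim())
    .filter(Boolean) // drop empty segments such as those from "A||B"
    .join(';');

  return { tag: 'destination:ref', value, suggestion };
}
```

For a value of `I 80|US 50` the helper reports the tag and proposes `I 80;US 50`. Values that already use `;` return `null`.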

dannykath commented 6 years ago

@Rub21