Today we are merging https://github.com/pelias/api/pull/1565, which brings a bunch of `pelias/parser` changes into `pelias/api`.

As part of this process we ran some wider acceptance test checks and diffed them against the current baseline.

One change that was identified is this query (at partial completion `"grolmanstrasse 51, charlottenburg"`), which identifies the Berlin borough `charlottenburg` as a street:

    grolmanstrasse 51, charlottenburg, berlin
    -FFFFFFFFFFFFFFFF0000000000000000000000000
    +FFFFFFFFFFFFFFFF0000000000000000FFFF0FFF0

This was likely introduced in the recent NL work https://github.com/pelias/parser/pull/126.

I would like to see if we can find a better way of handling the ambiguity between German and Dutch for the `-burg` suffix.

Note: the correct solution is also being generated, but both solutions score the same. The scoring is based on matched token length, so a robust fix would need to work equally well when `len(street) < len(borough)`, `len(street) > len(borough)`, and `len(street) == len(borough)`.
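
To illustrate why a length-only score cannot separate the two solutions, here is a minimal sketch. It is not the `pelias/parser` scoring code: `length_score`, `candidates`, and the `"other"` label are hypothetical, the competing labelling is made up for illustration (the issue does not list its exact labels), and the `kantstrasse` and `mitte` inputs are invented to cover the shorter and longer cases.

```python
from typing import List, Tuple

# A solution is a list of (label, matched text) pairs.
Solution = List[Tuple[str, str]]


def length_score(solution: Solution) -> int:
    """Hypothetical length-only score: total characters matched, labels ignored."""
    return sum(len(text) for _label, text in solution)


def candidates(street: str, number: str, borough: str) -> Tuple[Solution, Solution]:
    """Build the intended labelling and a made-up competing one over the same tokens."""
    intended = [("street", street), ("housenumber", number), ("borough", borough)]
    # Assumption: the competing solution claims the borough token as a street.
    competing = [("other", street), ("housenumber", number), ("street", borough)]
    return intended, competing


# One case per length relationship mentioned in the note.
CASES = [
    ("kantstrasse", "51", "charlottenburg"),     # len(street) < len(borough)
    ("grolmanstrasse", "51", "mitte"),           # len(street) > len(borough)
    ("grolmanstrasse", "51", "charlottenburg"),  # len(street) == len(borough)
]


def tied_cases() -> List[bool]:
    """For each case, report whether both labellings get the same score."""
    results = []
    for street, number, borough in CASES:
        intended, competing = candidates(street, number, borough)
        results.append(length_score(intended) == length_score(competing))
    return results


print(tied_cases())  # [True, True, True]
```

Under these assumptions the two labellings tie in all three cases, because the score only sees how many characters were matched and not which label claimed them. Any tie-break would therefore have to use information other than token length, such as the labels themselves or their position in the query.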