Open jfgigand opened 8 years ago
Hmm, here is what it looks like when I run a test and get a failure:
_________________________________________________________________________ Search: 91 Rue du Moulin __________________________________________________________________________
Search failed
# Search was: 91 Rue du Moulin
# Params was: lon: 2.688706 - limit: 1 - lat: 50.800570
# Expected was: name: 91 Rue du Moulin | postcode: 59299
# Results were:
name | osm_key | osm_value | osm_id | housenumber | street | postcode | city | country | lat | lon | distance
---------------- | ------- | --------- | ------ | ----------- | ------------- | -------- | ------- | ------- | --------- | -------- | --------
91 Rue du Moulin | — | — | — | 91 | Rue du Moulin | 59190 | Caëstre | — | 50.759475 | 2.606369 | —
Not sure why you don't have the carriage returns. Maybe a locale issue again.
Question 2 is: how can we enable geocoder to tolerate these different spellings?
"av/avenue" may be fixed with a synonym dict, perhaps to be used when running with --loose-compare.
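A minimal sketch of the synonym idea, assuming a small hand-written table and two hypothetical helpers, `expand_synonyms` and `loose_equal` (nothing in this thread says geocoder-tester has either):

```python
import re

# Hypothetical abbreviation table, limited to the pairs mentioned in this thread.
SYNONYMS = {
    "av": "avenue",
    "r": "rue",
}


def expand_synonyms(text):
    """Lowercase the text and expand known abbreviations word by word."""
    words = re.findall(r"\w+", text.lower())
    return " ".join(SYNONYMS.get(word, word) for word in words)


def loose_equal(expected, actual):
    """Compare two address strings after synonym expansion."""
    return expand_synonyms(expected) == expand_synonyms(actual)
```

Note that a single-letter key such as "r" is ambiguous (it could also be a house number suffix), so a real table would probably need to look at the word's position as well.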
About "19B" vs "19 BIS", I'm not sure this can be fixed: they are not the same thing. The same street may have both 19 B and 19 BIS, referring to two different addresses.
Another option may be to have a post-processing normalization for some standards we want to be able to "support", like the AFNOR address standard. So we may have something like --normalize=afnor.
About the '\n' problem: Funny that Python doesn't trust STDOUT to support "\n" but still prints ANSI color codes...
This problem doesn't occur within a ssh(1) session. It does within a lxc-attach(1) session.
$TERM is not relevant here, as both declare xterm-256color.
Terminal capabilities are present as Python is able to retrieve terminal width.
mc(1) works well, except for the terminal 'resize' event, which it does not receive.
The '\n' does not make sense. I never had output problems with any programs behind lxc-attach. Python or a python library is failing somewhere.
Strange that geocoder-tester [...] | cat will behave the same (broken '\n' behind lxc-attach and working well behind ssh), even though STDOUT is not a terminal in this case.
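One thing that could be checked, as a hypothesis rather than a diagnosis: whether the terminal behind lxc-attach has output post-processing switched off. The `OPOST` and `ONLCR` flags are what make the terminal driver turn "\n" into "\r\n"; since the driver does this and not the writing program, a missing flag would also affect output piped through cat. The helper names below are made up for illustration, and the sketch only works on POSIX systems where the `termios` module is available.

```python
import sys
import termios


def newline_translation_enabled(oflag):
    """Return True if these tty output flags map newline to CR + newline."""
    return bool(oflag & termios.OPOST) and bool(oflag & termios.ONLCR)


def check_stream(stream=None):
    """Return None if the stream is not a tty, else the ONLCR status."""
    if stream is None:
        stream = sys.stdout
    if not stream.isatty():
        return None
    oflag = termios.tcgetattr(stream.fileno())[1]  # index 1 holds the output flags
    return newline_translation_enabled(oflag)
```

When stdout is piped, `check_stream(sys.stderr)` can be used instead, as long as stderr is still attached to the terminal. Comparing the value under ssh and under lxc-attach would show whether the two sessions differ on this point.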
About 19B/19 BIS, I didn't know they would be different and representing 2 different addresses... why?
In the former case (19B Rue des Deux Ponts Paris), the returned postal code was incorrect anyway.
But continuing the tests, I have failures on 7T Rue Servandoni (returned as 7 TER Rue Servandoni). It is the same address, isn't it?
I also have a case on 3BIS Rue Chopin, returned as 3 BIS Rue Chopin.
Is it possible to only check geo coordinates?
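For illustration, a coordinate-only check could compare the great-circle distance between the expected and returned points against a tolerance. This is only a sketch: `haversine_m`, `coordinates_match` and the 100 m default are assumptions, not geocoder-tester options.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def coordinates_match(expected, actual, max_distance_m=100):
    """Return True if two (lat, lon) tuples are within max_distance_m."""
    return haversine_m(*expected, *actual) <= max_distance_m
```

Taking the two points from the failure output above purely as sample values, `coordinates_match((50.800570, 2.688706), (50.759475, 2.606369))` is false, since they are several kilometres apart.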
About 19B/19 BIS, I didn't know they would be different and representing 2 different addresses... why?
B and BIS are two possible ordinals: sometimes it's bis, ter, quater, etc.; sometimes it's A, B, C; sometimes it's something else; and sometimes all together in the same street. ;)
I have failures on 7T Rue Servandoni (returned as 7 TER Rue Servandoni). It is the same address, isn't it?
It may or may not be. Is "7T" from the test case or from the result?
I also have a case on 3BIS Rue Chopin, returned as 3 BIS Rue Chopin.
I guess the test case should be fixed?
Is it possible to only check geo coordinates?
Isn't this the discussion in #14 ? :)
B and BIS are two possible ordinals: sometimes it's bis, ter, quater, etc.; sometimes it's A, B, C; sometimes it's something else; and sometimes all together in the same street. ;) It may or may not be. Is "7T" from the test case or from the result?
From the test case. Should we transform all bis/ter/... to a single letter for comparing? Even if "B" and "BIS" may be different, it is probably very rare to have both on the same street (+ same number!) and the test needs to work...
I guess the test case should be fixed?
Only if we establish that using a space is the norm. If not, we may s/([0-9]+) BIS/\1BIS/i.
More generally, I suggest removing all spaces before comparing, at least on --loose-compare.
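A sketch of what such a loose comparison might do, assuming a hypothetical `normalize_housenumber` helper: collapse the long ordinals to their first letter, then drop all whitespace. The letter mapping is an assumption, and the collapse deliberately ignores the caveat that B and BIS can be distinct addresses.

```python
import re

# Long-form ordinals mapped to the single letter used for comparison (assumed mapping).
ORDINALS = {"bis": "b", "ter": "t", "quater": "q"}

# A number, optional spaces, then a long-form ordinal ending at a word boundary.
ORDINAL_RE = re.compile(r"(\d+)\s*(bis|ter|quater)\b", re.IGNORECASE)


def normalize_housenumber(text):
    """Return a lowercase, space-free form with ordinals collapsed."""
    def collapse(match):
        return match.group(1) + ORDINALS[match.group(2).lower()]

    collapsed = ORDINAL_RE.sub(collapse, text)
    return re.sub(r"\s+", "", collapsed).lower()
```

With this, "7T Rue Servandoni" and "7 TER Rue Servandoni" normalize to the same string, as do "3BIS Rue Chopin" and "3 BIS Rue Chopin".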
Isn't this the discussion in #14 ? :)
Yes it is :)
Hi,
It looks like geocoder-tester does not specify which property of a result is incorrect. It prints all expected/result properties as pipe-separated values.
Thus, question 1 is: how can we improve report readability?
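One possible direction, sketched here with hypothetical helpers (`mismatched_properties`, `format_mismatches`) and plain dicts rather than geocoder-tester's real data structures: report only the expected properties whose values differ from the result.

```python
def mismatched_properties(expected, result):
    """Return {key: (expected_value, actual_value)} for differing keys."""
    return {
        key: (value, result.get(key))
        for key, value in expected.items()
        if result.get(key) != value
    }


def format_mismatches(mismatches):
    """Render one line per mismatching property."""
    return "\n".join(
        f"{key}: expected {want!r}, got {got!r}"
        for key, (want, got) in mismatches.items()
    )
```

For an expected `name` of "91 Rue du Moulin" with `postcode` "59299" and a result carrying the same name but `postcode` "59190", only the postcode line would be printed.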
From the example below, there is a conflict between 19B and 19 BIS. The former is the preferred way for speaking to the end user, while the latter is standard-compliant. The same difference occurs for R/RUE, AV/AVENUE, etc.
Question 2 is: how can we enable geocoder to tolerate these different spellings?
Thank you!