sopittha / pointz

Automatically exported from code.google.com/p/pointz

Improve geocoder #17

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Continue to tune and improve geocode performance. 

Better recognition of datelines and non-bias-city explicit addresses.

Original issue reported on code.google.com by michaels...@gmail.com on 8 Apr 2009 at 2:35

GoogleCodeExporter commented 9 years ago
Work was done on this last night.

Original comment by nkoz...@gmail.com on 10 Apr 2009 at 12:20

GoogleCodeExporter commented 9 years ago
More work done on this today to include some new patterns for dateline detection.

Original comment by nkoz...@gmail.com on 10 Apr 2009 at 2:10

GoogleCodeExporter commented 9 years ago
I ran an import (v.84) and spot-checked a couple of results. In this one:
http://www.boston.com/news/local/massachusetts/articles/2009/04/09/mass_liquor_store_offering_home_delivery/?rss_id=Boston.com+--+Local+news

The dateline is Beverly, MA, but it was tagged Boston. This was in the Boston.com news RSS import.

This one:
http://www.boston.com/news/local/massachusetts/articles/2009/04/09/mass_daily/?rss_id=Boston.com+--+Local+news

Dateline is Braintree but tag is Boston.

This one:

http://www.boston.com/news/local/massachusetts/articles/2009/04/09/partners_healthcare_passes_new_limits_on_gifts?rss_id=Boston.com+--+Local+news

has Boston dateline but no tag.

I know it is an ongoing thing; I just wanted to give you some examples for tuning.

Original comment by michaels...@gmail.com on 10 Apr 2009 at 2:54
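
For tuning against cases like the three above, here is a minimal sketch of a dateline check that takes priority over the bias city. It is not the project's actual detector. The dateline shape (an all-caps place name at the start of the article body, an optional state abbreviation, then a dash) is an assumption, and `extract_dateline` and `choose_city` are hypothetical names.

```python
import re

# ASSUMPTION: a dateline is an all-caps place name at the very start of the
# article body, optionally followed by a state abbreviation, then a dash.
DATELINE_RE = re.compile(
    r"^\s*(?P<city>[A-Z][A-Z.' ]+?)"        # city in capitals, e.g. BEVERLY
    r"(?:,\s*(?P<state>[A-Z][A-Za-z.]+))?"  # optional state, e.g. Mass.
    r"\s*(?:--|\u2014|\u2013|-)\s+"         # dash between dateline and text
)


def extract_dateline(body):
    """Return the dateline city in title case, or None if no dateline is found."""
    m = DATELINE_RE.match(body)
    if not m:
        return None
    return m.group("city").strip().title()


def choose_city(body, bias_city):
    """Prefer an explicit dateline; fall back to the feed's bias city."""
    return extract_dateline(body) or bias_city
```

With this ordering, an article that opens with a Beverly or Braintree dateline would be tagged with that city, and an article with no recognizable dateline would still fall back to the bias city instead of going untagged.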

GoogleCodeExporter commented 9 years ago
Another use case you might want to tune for...

I set up a Yelp feed in Boston, which has explicit addresses for businesses at the top of the page. I tried it with Strict settings and got almost no matches. With Loose settings, I got either Boston or things mentioned in reviews as opposed to the business listing address.

Here's an example of one that was tagged as Boston:

http://www.yelp.com/biz/pho-hoa-ii-boston#hrid:qftwP4dLDsvUCnYHcP5P_g

Original comment by michaels...@gmail.com on 10 Apr 2009 at 1:23

GoogleCodeExporter commented 9 years ago
When you geo-bias it to a city, does it make itself aware of nearby city names for pattern matching? That is what we did in Teragram: by saying that this was Gainesville content, we looked for the couple dozen city names in the immediate vicinity. If the PL algorithm is doing that, it doesn't seem to be doing it well, as it rarely seems to find other cities beyond the one it has been biased for.

If it does what I've described, then after recognizing the proximate city it would ideally look for an address or place reference within a word or two of the city and use the proximate city as the bias for recognizing the address.

Original comment by michaels...@gmail.com on 17 Apr 2009 at 11:46
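
The idea described above could be sketched roughly as follows. This is an illustration only, not how the PL algorithm or Teragram work. The `NEARBY` gazetteer, the simplified `ADDRESS_RE` pattern, and the function names are all hypothetical, and a character window stands in for "within a word or two".

```python
import re

# HYPOTHETICAL gazetteer: bias city -> nearby city names to also look for.
NEARBY = {
    "Boston": ["Beverly", "Braintree", "Cambridge", "Quincy"],
}

# Deliberately simple street-address pattern: number, name words, suffix.
ADDRESS_RE = re.compile(
    r"\b\d+\s+(?:[A-Z][a-z]+\s+){1,3}(?:Street|St|Avenue|Ave|Road|Rd|Drive|Dr)\b"
)


def find_city_mentions(text, bias_city):
    """Find mentions of the bias city and its neighbours as (name, start, end)."""
    cities = [bias_city] + NEARBY.get(bias_city, [])
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, cities)) + r")\b")
    return [(m.group(1), m.start(), m.end()) for m in pattern.finditer(text)]


def addresses_with_city(text, bias_city, window=20):
    """Pair each address with a city mentioned within `window` characters
    of it; if none is that close, fall back to the bias city."""
    mentions = find_city_mentions(text, bias_city)
    results = []
    for a in ADDRESS_RE.finditer(text):
        city = bias_city
        for name, start, end in mentions:
            # Distance from the address to the mention, on either side.
            gap = start - a.end() if start >= a.end() else a.start() - end
            if 0 <= gap <= window:
                city = name
                break
        results.append((a.group(0), city))
    return results
```

An address next to a recognized nearby city would then be geocoded against that city, and only addresses with no city close by would use the feed-level bias.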

GoogleCodeExporter commented 9 years ago
Disallow patterns for "Is Street" (as in "is street violence ruining...") and "Toy Drive".

Original comment by michaels...@gmail.com on 17 Apr 2009 at 11:47
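
One possible shape for such a disallow list, as a sketch only. The `STREET_RE` pattern is a simplified stand-in for whatever the geocoder actually uses, and `DISALLOWED` and `street_candidates` are hypothetical names. Only the two phrases named above are listed.

```python
import re

# Simplified stand-in for a street-name pattern: capitalized word + suffix.
STREET_RE = re.compile(r"\b([A-Z][a-z]+)\s+(Street|Drive|Avenue|Road)\b")

# Phrases that look like street names but are not, stored in lowercase.
DISALLOWED = {"is street", "toy drive"}


def street_candidates(text):
    """Return street-like phrases, skipping any on the disallow list."""
    out = []
    for m in STREET_RE.finditer(text):
        phrase = m.group(0)
        # Normalize case and whitespace before checking the disallow list.
        if " ".join(phrase.lower().split()) in DISALLOWED:
            continue
        out.append(phrase)
    return out
```

Keeping the list as data rather than baking it into the pattern would make it easy to add more false positives as they turn up in imports.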

GoogleCodeExporter commented 9 years ago
The geocoder seems to tag a single address multiple ways: as "100 Main Street" and also as "Main Street".

Original comment by michaels...@gmail.com on 17 Apr 2009 at 11:49
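
A minimal sketch of one way to collapse these, assuming each match is available as a `(text, start, end)` tuple with character offsets into the article. That tuple shape and the `drop_contained` name are assumptions for illustration, not the geocoder's real data model.

```python
def drop_contained(matches):
    """Keep only matches whose span is not inside another kept match's span.

    matches: list of (text, start, end) tuples, end exclusive.
    """
    kept = []
    # Longest first, so a containing span is always kept before its parts.
    for m in sorted(matches, key=lambda t: t[2] - t[1], reverse=True):
        _, start, end = m
        if any(ks <= start and end <= ke for _, ks, ke in kept):
            continue  # fully covered by a longer match, e.g. "Main Street"
        kept.append(m)
    # Return survivors in document order.
    return sorted(kept, key=lambda t: t[1])
```

With this, "Main Street" found inside "100 Main Street" would be dropped, while a separate "Main Street" elsewhere in the article would survive because its span is different.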

GoogleCodeExporter commented 9 years ago

Original comment by michaels...@gmail.com on 29 Dec 2009 at 7:13