Open arya6000 opened 2 months ago
Did you try the Senzing provided model?
Did you try the Senzing provided model?
I was not aware of Senzing. Are you referring to https://github.com/Senzing/libpostal-data?
Yes. If you search the libpostal docs for alternative data models you should see how to enable it.
Yes. If you search the libpostal docs for alternative data models you should see how to enable it.
I just tried with the Senzing model and it solved the issue. Thanks.
Hi!
I was checking out libpostal, and saw something that could be improved.
My country is
US
Here's how I'm using libpostal
Parsing a list of addresses in my city to store in a normalized relational database.
Here's what I did
Parsed the following address "1141 Kendall Town Blvd #3202, Jacksonville, FL 32225"
Here's what I got
house_number: 1141
road: kendall town blvd #3202
city: jacksonville
state: fl
postcode: 32225
Here's what I was expecting
house_number: 1141
unit: #3202
road: kendall town blvd
city: jacksonville
state: fl
postcode: 32225
Does the input address exist in OpenStreetMap? No
Do all the toponyms exist in OSM (city, state, region names, etc.)? City and state are in OSM
If the address uses a rare/uncommon format, does changing the order of the fields yield the correct result? "1141 #3202 Kendall Town Blvd, Jacksonville, FL 32225" results in the following format
house_number: 1141 #3202
road: kendall town blvd
city: jacksonville
state: fl
postcode: 32225
But "#3202" should be listed under "unit" and not under house number. However, "1141 apt 3202 Kendall Town Blvd, Jacksonville, FL 32225" outputs the correct format
house_number: 1141
unit: apt 3202
road: kendall town blvd
city: jacksonville
state: fl
postcode: 32225
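If the address contains apartment/floor/sub-building information or uncommon formatting, does removing that help? Is there any minimum form of the address that gets the right parse?
Yes, the following results in correct output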
"1141 apt 3202 Kendall Town Blvd, Jacksonville, FL 32225"
Here's what I think could be improved
If "#" followed by numbers appears before the city, it should be treated as a unit number.
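For anyone who cannot switch models, the rule suggested above could also be approximated as a post-processing step on the parser output. The following is only a rough sketch, not something libpostal provides: `split_hash_unit` and the regex are hypothetical names made up for illustration, and the input is assumed to be a list of (value, label) tuples like the ones the Python bindings return from `parse_address`. It has only been reasoned through against the two parses quoted in this issue.

```python
import re

# Hypothetical pattern (an assumption, not libpostal behaviour):
# a trailing "#<digits>" token at the end of a component value.
_HASH_UNIT = re.compile(r"^(?P<rest>.*?)\s*(?P<unit>#\s*\d+)$")


def split_hash_unit(components):
    """Move a trailing '#<digits>' out of road/house_number into a 'unit' label.

    `components` is a list of (value, label) tuples. The input is left
    untouched if the parser already produced a 'unit' component.
    """
    has_unit = any(label == "unit" for _, label in components)
    result = []
    for value, label in components:
        match = None
        if not has_unit and label in ("road", "house_number"):
            match = _HASH_UNIT.match(value)
        if match and match.group("rest"):
            # Keep the remaining text under its original label,
            # then emit the "#<digits>" part as a separate unit.
            result.append((match.group("rest"), label))
            result.append((match.group("unit"), "unit"))
            has_unit = True
        else:
            result.append((value, label))
    return result


# The parse reported in this issue for
# "1141 Kendall Town Blvd #3202, Jacksonville, FL 32225"
parsed = [
    ("1141", "house_number"),
    ("kendall town blvd #3202", "road"),
    ("jacksonville", "city"),
    ("fl", "state"),
    ("32225", "postcode"),
]
fixed = split_hash_unit(parsed)

# The reordered variant, where "#3202" ended up in house_number
reordered = [
    ("1141 #3202", "house_number"),
    ("kendall town blvd", "road"),
    ("jacksonville", "city"),
    ("fl", "state"),
    ("32225", "postcode"),
]
fixed_reordered = split_hash_unit(reordered)
```

This only patches the labels after the fact and does not change how the model tokenizes the address, so it is a workaround rather than the improvement requested here. Using the Senzing model, as suggested in the comments, resolved the original case without it.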