openvenues / libpostal

A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.
MIT License

road not detected for street terms in short form (in Russia) #498

Open Aknilam opened 4 years ago

Aknilam commented 4 years ago

Hi!

I was checking out libpostal, and saw something that could be improved.


My country is

RU


Here's how I'm using libpostal

REST API


Here's what I did - case 1

query=б-р Победы, д. 10


Here's what I got - case 1

[
  {
    "label": "house",
    "value": "б-р"
  },
  {
    "label": "road",
    "value": "победы"
  },
  {
    "label": "house_number",
    "value": "д. 10"
  }
]

Here's what I was expecting - case 1

[
  {
    "label": "road",
    "value": "б-р победы"
  },
  {
    "label": "house_number",
    "value": "д. 10"
  }
]

Here's what I did - case 2

query=б-р Солнечный 2


Here's what I got - case 2

[
  {
    "label": "house",
    "value": "б-р солнечный"
  },
  {
    "label": "house_number",
    "value": "2"
  }
]

Here's what I was expecting - case 2

[
  {
    "label": "road",
    "value": "б-р солнечный"
  },
  {
    "label": "house_number",
    "value": "2"
  }
]

Here's what I did - case 3

query=Савелкинский пр-д, д. 4


Here's what I got - case 3

[
  {
    "label": "house",
    "value": "савелкинский"
  },
  {
    "label": "road",
    "value": "пр-д д."
  },
  {
    "label": "house_number",
    "value": "4"
  }
]

Here's what I was expecting - case 3

[
  {
    "label": "road",
    "value": "савелкинский пр-д"
  },
  {
    "label": "house_number",
    "value": "д. 4"
  }
]

For parsing issues, please answer "yes" or "no" to all that apply.

Using the expanded street-type words (бульвар, проезд) helped and gave the correct results in the expected form; the abbreviated forms did not.


Here's what I think could be improved

- add б-р as an alias/synonym to бульвар
- add пр-д as an alias/synonym to проезд
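Until the aliases exist upstream, one possible workaround is to expand the abbreviations on the client side before sending the query, since the expanded forms are reported above to parse correctly. The following is only a minimal sketch: the mapping contains just the two abbreviations from this issue, and `ABBREVIATIONS` and `expand_street_abbreviations` are hypothetical names, not part of libpostal or its REST wrapper.

```python
import re

# Hypothetical mapping, limited to the two abbreviations reported in this issue.
ABBREVIATIONS = {
    "б-р": "бульвар",
    "пр-д": "проезд",
}

# Match an abbreviation only when it is a standalone token: preceded by
# start-of-string or whitespace, followed by whitespace, a comma, or end-of-string.
_PATTERN = re.compile(
    r"(?<!\S)(" + "|".join(re.escape(a) for a in ABBREVIATIONS) + r")(?=[\s,]|$)",
    re.IGNORECASE,
)


def expand_street_abbreviations(query: str) -> str:
    """Replace known short street-type forms with their expanded words."""
    return _PATTERN.sub(lambda m: ABBREVIATIONS[m.group(1).lower()], query)


if __name__ == "__main__":
    for q in ["б-р Победы, д. 10", "б-р Солнечный 2", "Савелкинский пр-д, д. 4"]:
        print(expand_street_abbreviations(q))
```

The expanded string would then be passed as the `query` value in place of the original. This only sidesteps the problem for known abbreviations; it does not change how the parser itself labels tokens.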

xTRiM commented 3 years ago

Could probably be solved by changing the entries here: https://github.com/openvenues/libpostal/blob/master/resources/dictionaries/ru/street_types.txt#L3-L4 to

бульвар|бул|б-р
bulvar|bul|b-r

Same for "пр-д".
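To make the intent of the proposed dictionary change concrete, here is a rough sketch of how such pipe-delimited lines could be read into a lookup table. It assumes that the first field on each line is the canonical form and the remaining fields are its variants; that reading of the file format is an assumption, and `build_variant_index` is a hypothetical helper, not a libpostal function.

```python
def build_variant_index(lines):
    """Map every variant (and the canonical form itself) to the canonical form.

    Assumes each line is 'canonical|variant1|variant2|...'.
    """
    index = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        canonical, *variants = line.split("|")
        index[canonical] = canonical
        for variant in variants:
            index[variant] = canonical
    return index


# The two lines proposed in the comment above.
PROPOSED_LINES = [
    "бульвар|бул|б-р",
    "bulvar|bul|b-r",
]

index = build_variant_index(PROPOSED_LINES)

if __name__ == "__main__":
    print(index["б-р"])
    print(index["b-r"])
```

Under that assumption, "б-р" would resolve to the same canonical entry as "бул", and the analogous edit for "пр-д" would add it to the проезд line.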