pelias / csv-importer

Import arbitrary data in CSV format to Pelias
MIT License

Search by Postcode #90

Closed bdmapman closed 3 years ago

bdmapman commented 3 years ago

Use-cases

I am trying to import a CSV with postcode data. It would be great if I could search by postcode as a parameter.

Attempted Solutions

I made changes in the Pelias config file and added postcode to the coarse layer, but in search, I am unable to search with the value.

Proposal

The request can be something like this

missinglink commented 3 years ago

Hi @bdmapman, re "but in search, I am unable to search with the value", this seems to work for me:

You might be getting the parameters confused: the structured endpoint (/v1/search/structured) has a parameter named postalcode.
- https://github.com/pelias/documentation/blob/master/structured-geocoding.md#structured-geocoding-parameters
- https://github.com/pelias/documentation/blob/master/structured-geocoding.md#postalcode

There are a variety of filtering and scoring parameters on the fulltext endpoints /v1/search and /v1/autocomplete which you might be able to use, such as ?layers=postalcode, but there is no parameter which allows you to specify the postcode separately from the rest of the ?text= param. That is what the /v1/search/structured endpoint is designed for.

see: https://github.com/pelias/documentation
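To make the difference between the two endpoints concrete, here is a minimal sketch that only builds the request URLs and does not call any server. The base URL and the helper names are assumptions for illustration; the endpoint paths and the postalcode, text and layers parameters are the ones mentioned above.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:4000"  # assumption: address of a local Pelias API


def structured_search_url(**fields):
    """Build a /v1/search/structured URL from named address components."""
    # Drop components that were not supplied.
    params = {key: value for key, value in fields.items() if value}
    return f"{BASE_URL}/v1/search/structured?{urlencode(params)}"


def fulltext_search_url(text, layers=None):
    """Build a /v1/search URL; here the postcode can only travel inside `text`."""
    params = {"text": text}
    if layers:
        params["layers"] = ",".join(layers)
    # Keep commas readable instead of percent-encoding them.
    return f"{BASE_URL}/v1/search?{urlencode(params, safe=',')}"


print(structured_search_url(postalcode="12015"))
print(fulltext_search_url("12015", layers=["postalcode"]))
```

The structured helper sends the postcode as its own field, while the fulltext helper can only pass it as free text and narrow the results by layer.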

Regarding /v1/reverse, what is the use-case for making a request when you already know the lat/lon and the postalcode? The reverse endpoint is designed to help you discover what's near a target lat/lon.

bdmapman commented 3 years ago

@missinglink Thanks for your prompt insight, and sorry for not describing the use-case properly. My only data source is the CSV importer, and I am using a long geocode (9-12 digits) as the postcode in my CSV so that, besides names, I can find any location by its geocode. The 12015 was only an example. Here is a sample of my CSV structure. Is there any way to search based on postalcode? I tried a structured search, but it also returns nothing.

[Image: screenshot of the sample CSV, CleanShot 2021-08-02 at 16 31 45@2x]

Do not worry about the source and layers. I have solved custom sources and layers manually.

missinglink commented 3 years ago

Ugh so it seems like you're trying to use the postalcode field for something it's not intended for? Each of the fields has semantic meaning which affects how the queries are generated.

I had a look at the code and the issue here is that the ?postalcode=xxx param for /v1/search/structured is being ignored because you're not also supplying ?address=. This may or may not be ideal; I've opened an issue to discuss it with the team.

But regardless of that, the category field is a much better fit for what I think you're trying to do; it's already set up to allow you to add arbitrary data and then filter on it.

Try changing the column name to category and then querying with ?categories=6022...., this should work for the /v1/search endpoint.
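As a rough illustration of the suggested column rename, the sketch below writes a one-row CSV with a category column. Only the category column name comes from the advice above. The name/lat/lon columns are an assumed layout, the coordinates are made up, and the name and geocode are the example values used later in this thread.

```python
import csv
import io

# Assumed column set: only `category` comes from the suggestion above.
# The coordinates are invented placeholders, not real data.
rows = [
    {"name": "Arifa", "lat": "23.7", "lon": "90.4", "category": "6022645523"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "lat", "lon", "category"])
writer.writeheader()
writer.writerows(rows)

# The long geocode now lives in `category` instead of `postalcode`.
print(buffer.getvalue())
```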

bdmapman commented 3 years ago

Unfortunately no luck. What I have done -

But there is no luck. It is also returning

        "errors": [
            "invalid param 'text': text length, must be >0",
            "invalid param 'text': text length, must be >0"
        ],

Is the category field newly launched? My Pelias is 1.5 years old. Do I need to add any config in the pelias.json file?

missinglink commented 3 years ago

The ?text param is mandatory for /v1/search.

missinglink commented 3 years ago

Try /v1/search?text=Arifa&categories=6022645523
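The same request can be assembled programmatically. This is a small sketch under assumptions: the base URL and the function name are hypothetical, and the non-empty check simply mirrors the "text length, must be >0" error shown earlier rather than reproducing the server's own validation.

```python
from urllib.parse import urlencode


def search_url(text, categories=(), base_url="http://localhost:4000"):
    """Build a /v1/search URL with a mandatory text and optional categories."""
    # /v1/search rejects requests without text, so fail early on the client.
    if not text or not text.strip():
        raise ValueError("text must be non-empty for /v1/search")
    params = {"text": text}
    if categories:
        params["categories"] = ",".join(categories)
    # Keep commas readable instead of percent-encoding them.
    return f"{base_url}/v1/search?{urlencode(params, safe=',')}"


print(search_url("Arifa", categories=["6022645523"]))
```

Checking the text on the client side gives a clearer failure than the duplicated error message returned by the API.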

bdmapman commented 3 years ago

That's great... that's awesome! Just out of curiosity, is the category field indexed, or will its response time be higher than for general searches? One more thing: if I want to include more than one category, can I separate them with commas? Is there any category documentation?

missinglink commented 3 years ago

Categories are unfortunately not well documented. You can use a comma to delimit multiple values; have a look at the geocoding.query.categories block in the response json to see how your input was interpreted.

is the category field indexed, or will its response time be higher than for general searches

It is indexed and should have a negligible impact on performance; it will likely be faster because it's excluding a bunch of results it doesn't have to score.

please keep in mind Pelias is not a relational database and it's not backed by one, so it's not like adding a column in SQL ;)
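To check how the categories input was interpreted, the relevant block can be pulled out of the response body. The sketch below assumes the block sits at geocoding.query.categories as described above. The sample response is hypothetical and heavily trimmed, the second category value is invented, and the real block may not be a plain list.

```python
import json


def interpreted_categories(response_body):
    """Return the geocoding.query.categories block, or None if it is absent."""
    data = json.loads(response_body)
    return data.get("geocoding", {}).get("query", {}).get("categories")


# Hypothetical, trimmed response body; the real shape may differ.
sample = json.dumps({
    "geocoding": {
        "query": {"text": "Arifa", "categories": ["6022645523", "6022645524"]}
    },
    "features": [],
})

print(interpreted_categories(sample))
```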

bdmapman commented 3 years ago

@missinglink I am really sorry for re-opening the issue. There is just one thing I wanted to know: is the categories parameter usable in the reverse geocoding API?

Another thing, which I also asked on Gitter: if I update source and layer data directly using an Elasticsearch _update query, does that affect the Pelias search API? After updating, that specific entry vanished from my search API. Is this a bug or a restriction?

bdmapman commented 3 years ago

From this Discussion and this documentation I got to know that categories are currently available for search and autocomplete endpoints.

I am still searching for the second answer. I have some interesting findings that you probably understand better than I do. I am using the Elasticsearch _update query for a single entry:

curl -X POST "localhost:9200/pelias/_update/custompoi:restaurant:45632?refresh=true&pretty=true" -H 'Content-Type: application/json' -d'
{
  "script" : "ctx._source.layer = \u0027shop\u0027"
}
'

It is changing the data. I can see the change by running curl -X GET "localhost:9200/pelias/_doc/custompoi:restaurant:45632?pretty" in a terminal, which returns the layer value as shop. The _gid field still has the value custompoi:restaurant:45632, whereas if I call the reverse or autocomplete API through Postman it returns "gid": "custompoi:shop:45632". It seems interesting that the _gid in Elasticsearch and the "gid" returned by the API are not the same. And yes, that specific entry is still missing from the search API.
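The mismatch described above can be illustrated with a short sketch. It builds an equivalent _update body that passes the new layer as a script parameter (which avoids the \u0027 escaping) and compares the stored document id with a gid rebuilt from the fields. The source:layer:id layout and the idea that the API rebuilds the gid from the document's fields are assumptions that match the observation here; they are not confirmed in this thread, and the function names are hypothetical.

```python
import json


def layer_update_body(new_layer):
    """Build an _update body that passes the layer as a script parameter."""
    return json.dumps({
        "script": {
            "lang": "painless",
            "source": "ctx._source.layer = params.layer",
            "params": {"layer": new_layer},
        }
    })


def gid_from_fields(source, layer, source_id):
    """Assumed gid layout: source:layer:id, rebuilt from the document fields."""
    return f"{source}:{layer}:{source_id}"


doc_id = "custompoi:restaurant:45632"  # the id used in the _update URL above
print(layer_update_body("shop"))
print(gid_from_fields("custompoi", "shop", "45632"), "vs", doc_id)
```

Under these assumptions, editing the layer field in place leaves the stored id untouched, so the two identifiers drift apart after the update.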