reitti / reittiopas

Ihan hyvä reittiopas
http://ihanhyvareittiopas.fi/
MIT License

Integrated Helsinki's palvelukartta data to search index. #160

Open JaniL opened 11 years ago

JaniL commented 11 years ago

https://github.com/reitti/reittiopas/issues/158

Hi,

I decided to do something about this. The following pull request adds 10,844 places to the search index. The implementation uses my pk-extraction script, which is powered by my wrapper module for Palvelukartta's REST API.

pk-extraction needs to be installed globally in order to work with the current implementation.

npm install -g git://github.com/JaniL/pk-extraction.git

teropa commented 11 years ago

Cool! Will get back to this in a week or so, unless @sluukkonen has a chance to get into it before that.

sluukkonen commented 11 years ago

Nice work, and it's time to update the search index in any case.

Thanks a lot!

sluukkonen commented 11 years ago

One thing this needs to do is separate the results by city, as the search index currently extracts the city name from the filename (e.g. helsinki.txt -> Helsinki). So if you could group the results by city and add command-line parameters similar to those in kalkati-extraction, we could merge this right away.
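As a rough illustration of the grouping described here, and not code from the pull request, the sketch below buckets places by city and derives the display name from a filename the way the helsinki.txt -> Helsinki convention suggests. The record shape `{ name, city }`, the function names `groupByCity` and `cityFromFilename`, and the "Example" entries are all assumptions made for the example; the actual pk-extraction output format is not shown in this thread.

```javascript
// Hypothetical record shape: { name: string, city: string }.
// The real pk-extraction output may look different.
function groupByCity(places) {
  const groups = new Map();
  for (const place of places) {
    // Lower-case key so it can double as a filename stem (e.g. "helsinki").
    const key = place.city.trim().toLowerCase();
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(place.name);
  }
  return groups;
}

// Mirrors the convention mentioned in the thread: helsinki.txt -> Helsinki.
function cityFromFilename(filename) {
  const stem = filename.replace(/\.txt$/, "");
  return stem.charAt(0).toUpperCase() + stem.slice(1);
}

// Made-up sample data, for illustration only.
const sample = [
  { name: "Brahenpuiston koulu", city: "Helsinki" },
  { name: "Example library", city: "Espoo" },
  { name: "Example pool", city: "helsinki" },
];

const grouped = groupByCity(sample);
for (const [city, names] of grouped) {
  const file = `${city}.txt`;
  console.log(`${file} -> ${cityFromFilename(file)}: ${names.length} place(s)`);
}
```

Each map entry could then be written to its own file, one per city, so the existing filename-based lookup keeps working unchanged.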

The whole update process could use some rethinking, but my time and motivation to work on it are limited, so this is probably the least painful way to integrate the results.

JaniL commented 11 years ago

Done. The tool is updated now too, so fetch the newest version.

Changelog of the extraction tool:

sluukkonen commented 11 years ago

Thanks! Going to merge and deploy this later today.

sluukkonen commented 11 years ago

Ok, I took a look at this.

The data import works well, but I think we should do some additional filtering on the Palvelukartta data. For example, you get a lot of results like "Brahenpuiston koulu 2013-2014", "Iltapäivätoiminta / Brahenpuiston koulu, Opetustoimi" and "Brahenpuiston koulu, kouluterveydenhuolto" when we already have "Brahenpuiston koulu" indexed from the OpenStreetMap data.

I'm not really sure what the biggest pitfalls of our OSM data are. Are there specific categories where the Palvelukartta data would really help?
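One possible form of such filtering, sketched here purely as an assumption and not something proposed in the thread, is to drop any Palvelukartta name that contains a name already present in the index. The function name `filterAgainstIndex` and the "Hypothetical new place" entry are invented for the example, and both inputs are assumed to be plain arrays of names.

```javascript
// Assumption: candidates and indexedNames are plain arrays of place names.
function filterAgainstIndex(candidates, indexedNames) {
  const indexed = indexedNames.map((name) => name.toLowerCase());
  return candidates.filter((candidate) => {
    const lower = candidate.toLowerCase();
    // Keep the candidate only if no already-indexed name occurs inside it.
    return !indexed.some((name) => lower.includes(name));
  });
}

// Names quoted in the thread, plus one made-up entry that should survive.
const osmNames = ["Brahenpuiston koulu"];
const pkNames = [
  "Brahenpuiston koulu 2013-2014",
  "Brahenpuiston koulu, kouluterveydenhuolto",
  "Hypothetical new place",
];

const kept = filterAgainstIndex(pkNames, osmNames);
console.log(kept);
```

Substring matching is deliberately crude: it would also discard a genuinely distinct place whose name happens to contain an indexed one, so a real filter would likely need category information or coordinates as well.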

JaniL commented 11 years ago

That's hard to say, as I'm not a frequent user of OSM or of Palvelukartta.

If no one has any suggestions, then this should be closed, as the data can't be imported without heavy filtering.

sluukkonen commented 11 years ago

Integrating Palvelukartta data was originally Tero's idea - let's see if he has some ideas about how we should use it.