Open missinglink opened 10 years ago
http://www.openstreetmap.org/way/10525375
[edit] wontfix re: adding highway tags for the same reason as https://github.com/pelias/openstreetmap/issues/28 plus this is not a POI.
consider blacklisting public_transport:stop_area
http://www.openstreetmap.org/node/2478086894
[edit] it seems best to completely remove public_transport because of duplicates, see https://gist.github.com/missinglink/b7757d98d034f1441b74
http://www.openstreetmap.org/node/2907810083
[edit] same as above, there are a lot of duplicates for railway:* see https://gist.github.com/missinglink/e7f1de91670e7173792c
http://www.openstreetmap.org/node/2478086894
[edit] same as public_transport:stop_area above
Hi.
A lot of streets in suburban areas in OSM generally have the tags highway+name. I understand why you are blacklisting them, but I think this is really restrictive, and a lot of existing streets in OSM are not imported because of it. Generally speaking, it seems that the blacklisting is there only to avoid duplicates. Isn't there another way to avoid duplicates while still importing the data with the blacklisted tags?
hi @razafinr crazy timing, I was just looking into this issue earlier today. I've been cutting some data from a New York City extract to have a look at what we might be missing and what we might remove.
that repo is here https://github.com/pelias/osm-featurelist-evaluation with a link to some cuts I made of the data today.
this extract is just the highway+name records for NYC: https://raw.githubusercontent.com/pelias/osm-featurelist-evaluation/master/cuts/highway.text
that file shows what new names would be searchable in Pelias if we added that to the featurelist. I see there is a bunch of undesirable content such as:
node 2708075964 Lamp Post
node 2708075965 Lamp Post
node 2708075966 Lamp Post
node 2708075967 Lamp Post
node 2708075970 Lamp Post
node 2708075974 Lamp Post
node 2708075976 Lamp Post
node 2708075977 Lamp Post
node 2708075979 Lamp Post
node 2708075981 Lamp Post
node 2708075982 Lamp Post
node 2708075983 Lamp Post
node 2708075984 Lamp Post
node 2708075988 Lamp Post
...but there is also a load of good stuff in there, what are your thoughts after seeing this?
@missinglink Thanks for the repository. I ran it locally and there are definitely a lot more places that could be searchable if we imported them (as I explained above, a lot of streets in poor-coverage areas have the specific tags highway+name). I think these data should be imported without duplicates, although the duplicates could be used to get the best approximation of the latitude and longitude of the imported data. I know it could be really hard to handle the ways type in a pbf file, but it would be a great improvement to the application if these data could be imported.
@razafinr I've made some changes to pbf2json which will allow us to be more specific when targeting tags from the OSM extracts, eg. you can now say amenity~toilets,amenity~kindergarten or whatever instead of extracting all records with an amenity:* tag.
I think this will help significantly in reducing some of the rubbish which is getting ingested (ie. 'Lamp Post') but doesn't deal with the issue of duplication.
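To make the tag targeting concrete, here is a minimal Python sketch of how such filter expressions could be evaluated against a record's tags. It is not pbf2json's code. The syntax is an assumption based on the examples in this thread: `,` separates alternatives, `+` requires all parts to match, `key~value` requires an exact value, and a bare `key` only requires the key to be present. The function names and the lamp post tags are hypothetical.

```python
def parse_tag_filter(expression):
    """Parse e.g. 'amenity~toilets,highway+name' into a list of AND-groups.

    Assumed syntax (not taken from pbf2json's source): ',' is OR, '+' is AND,
    'key~value' needs an exact value, a bare 'key' only needs to be present.
    """
    groups = []
    for alternative in expression.split(","):
        conditions = []
        for part in alternative.split("+"):
            key, _, value = part.strip().partition("~")
            conditions.append((key, value or None))
        groups.append(conditions)
    return groups


def matches(tags, groups):
    """Return True if the tag dict satisfies at least one AND-group."""
    return any(
        all(key in tags and (value is None or tags[key] == value)
            for key, value in conditions)
        for conditions in groups
    )


# Hypothetical tags for one of the 'Lamp Post' nodes listed above.
lamp_post = {"highway": "street_lamp", "name": "Lamp Post"}

broad = parse_tag_filter("highway+name")
narrow = parse_tag_filter("highway~residential+name")
```

With these definitions, `matches(lamp_post, broad)` is true while `matches(lamp_post, narrow)` is false, which is the kind of noise reduction described above. It does nothing for duplicates.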
You mentioned ways, these are currently being imported. The issue I'm having with duplication is that each road segment is modelled as a separate way and may be linked by a relation (which we currently don't import), an example is:
In this case both the road segments share the same name and also represent the same "thing", so in an ideal world we could mark them as duplicates. The issue is that in some cases the tags are better on one record than the other, so they need to be merged, and in order to find a centroid we would need to do some serious math, especially for polygonal geometries like triangles and circles where the centroid does not lie on the border itself.
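As a rough illustration of the merge step, the Python sketch below unions the tags of several segments and computes a length-weighted centroid over their geometries. It is only a sketch under assumptions: coordinates are treated as planar (reasonable only over short distances), the first segment wins on conflicting tags, and the function names are hypothetical. The resulting point is not guaranteed to lie on any of the segments.

```python
import math


def merge_tags(*tag_dicts):
    """Union the tags of several segments; earlier segments win on conflict."""
    merged = {}
    for tags in tag_dicts:
        for key, value in tags.items():
            merged.setdefault(key, value)
    return merged


def line_centroid(polylines):
    """Length-weighted centroid of several polylines of (lon, lat) tuples.

    Planar approximation: each straight piece contributes its midpoint,
    weighted by its length.
    """
    total = sum_x = sum_y = 0.0
    for line in polylines:
        for (x1, y1), (x2, y2) in zip(line, line[1:]):
            length = math.hypot(x2 - x1, y2 - y1)
            total += length
            sum_x += length * (x1 + x2) / 2
            sum_y += length * (y1 + y2) / 2
    if total == 0:
        raise ValueError("no piece with non-zero length")
    return (sum_x / total, sum_y / total)
```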
I would definitely like to find a de-duplication strategy for highway segments which works. Generally a 'good', yet naive, de-duplicator checks the similarity in name with other entities which are geographically 'near' and then scores the 'closeness' with a scoring algorithm. This strategy is always going to be flawed, so I think using the relation definitions where available will produce far more accurate results. I suspect that only a small number of way records with duplicate names are linked by a relation record and so, yea, it's tricky. I will discuss with our routing team and see if they have some input on the matter.
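The naive name-plus-distance approach could be sketched in Python roughly as follows. The record shape (`name`, `centroid` as `(lat, lon)`), the 500 m cutoff and the linear distance penalty are all assumptions made for illustration, not anything Pelias implements.

```python
import math
from difflib import SequenceMatcher


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))


def name_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.casefold().strip(), b.casefold().strip()).ratio()


def duplicate_score(rec_a, rec_b, max_distance_m=500.0):
    """Name similarity scaled down linearly with distance; 0 beyond the cutoff."""
    distance = haversine_m(rec_a["centroid"], rec_b["centroid"])
    if distance >= max_distance_m:
        return 0.0
    similarity = name_similarity(rec_a["name"], rec_b["name"])
    return similarity * (1 - distance / max_distance_m)
```

The flaw mentioned above shows up directly: two distinct streets with the same name a few hundred metres apart score as likely duplicates, while two segments of one long road whose centroids are far apart score zero.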
here is an example of a relation which contains all the ways for the M1 in the UK: https://www.openstreetmap.org/relation/2332838
@razafinr it's also worth noting that street addresses are currently being extracted from addr:street and addr:housenumber tags, so if the query logic is correct then we should be able to return all 1 foo street, 200 foo street etc. for the input text foo street.
eg: the POI "Happy Valley Cycles" [1] has valid address info, so a second document [2] is created with the name "8 Church Street", a query such as [3] should return all POI records on that street.
[1] http://pelias.mapzen.com/doc?id=osmnode:1030336099 [2] http://pelias.mapzen.com/doc?id=osmaddress:poi-address-osmnode-1030336099 [3] http://pelias.mapzen.com/search?input=church%20street&lat=-40.95&lon=175.66
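A minimal Python sketch of that second-document idea, for illustration only. The id format mirrors the one visible in link [2]; the function names and the simple "strip the house number, compare the street" lookup are assumptions and not the actual importer or query logic.

```python
def address_document(osm_type, osm_id, tags):
    """Build a separate address record from addr:* tags, or None if incomplete."""
    number = tags.get("addr:housenumber")
    street = tags.get("addr:street")
    if not number or not street:
        return None
    return {
        "id": f"poi-address-osm{osm_type}-{osm_id}",
        "name": f"{number} {street}",
    }


def on_street(documents, street):
    """Return address documents whose name is '<housenumber> <street>'."""
    wanted = street.casefold().strip()
    return [
        doc for doc in documents
        if doc["name"].partition(" ")[2].casefold() == wanted
    ]


poi_tags = {
    "name": "Happy Valley Cycles",
    "addr:housenumber": "8",
    "addr:street": "Church Street",
}
doc = address_document("node", 1030336099, poi_tags)
```

Here `doc["name"]` is "8 Church Street", and `on_street([doc], "church street")` returns that document.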
One particular tag we might want to import is highway:residential. It appears there are about 31 million such roads, so we can't take the addition of it lightly, but like @razafinr said, there is a ton of good data.
Here's one example of a road that we can't find using pelias, but works in Nominatim: https://mapzen.com/pelias#loc=10,48.4861,16.7569&q=am%20wirtshausberg&t=fine&gb=bbox http://nominatim.openstreetmap.org/search.php?q=am+wirtshausberg&viewbox=16.61%2C48.52%2C16.63%2C48.5
Hello.
I think it would be great if the tags could be set in the importer configuration, with the default tags loaded when none are set.
For example,
{
  "imports": {
    "openstreetmap": {
      "datapath": "/data/pelias/openstreetmap",
      "leveldbpath": "/tmp",
      "import": [{
        "filename": "somefile.osm.pbf"
      }],
      "tags": [
        "place:city",
        "place:town"
      ]
    }
  }
}
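The fallback behaviour proposed here could look roughly like the Python sketch below. The `tags` key is the proposal from this comment and was not an existing option at the time; `DEFAULT_TAGS` and `import_tags` are hypothetical names, and the default list is a placeholder rather than the importer's real feature list.

```python
import json

# Placeholder defaults; the real list lives in the importer's feature config.
DEFAULT_TAGS = ["amenity", "building", "shop"]


def import_tags(config):
    """Return the configured tags, or the defaults when none are set."""
    osm = config.get("imports", {}).get("openstreetmap", {})
    tags = osm.get("tags")
    return list(tags) if tags else list(DEFAULT_TAGS)


config = json.loads("""
{
  "imports": {
    "openstreetmap": {
      "datapath": "/data/pelias/openstreetmap",
      "import": [{"filename": "somefile.osm.pbf"}],
      "tags": ["place:city", "place:town"]
    }
  }
}
""")

selected = import_tags(config)   # the two configured tags
fallback = import_tags({})       # a copy of DEFAULT_TAGS
```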
Also, not all relations such as cities are enabled by default. For example, Moscow (http://openstreetmap.org/relation/2555133) has the place=city tag, which is not included in the current version of the importer.
Of course, users can set them on their own by downloading this repo, but I believe that this would be a good feature.
Hi @CatInCosmicSpace, that sounds like a nice addition. The core team are currently working on other features, but I'd be happy to review and merge a PR to add this functionality to pelias/config and pelias/openstreetmap.
For the most part, it could be implemented here https://github.com/pelias/openstreetmap/blob/master/stream/pbf.js
Regarding Moscow, I added support for relations recently, so please check you're using current versions of all our tools/docker images. If you're using a current version which includes other relations then the missing relation is likely due to the clipping model used to create the PBF extracts.
PBF extracts can be generated using a variety of different tools, and as such can be slightly different, even when cropping to the same bounding-box. If the Moscow polygon is missing any of the vertices or rings in the extract then the polygon will be deemed invalid. No attempts are made to try and recover the broken geometry, so it will not appear in the output; you should see a relevant warning in the log.
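To illustrate why a clipped extract breaks a relation, here is a small Python sketch of a completeness check. It is not the validation the Pelias tools perform. The data shapes (`ways` as a mapping of way id to node id list, `nodes` as the set of node ids present in the extract) and the function name are assumptions.

```python
from collections import Counter


def relation_is_complete(member_way_ids, ways, nodes):
    """Return (ok, reason) for a polygon relation built from member ways."""
    endpoints = Counter()
    for way_id in member_way_ids:
        refs = ways.get(way_id)
        if not refs:
            return False, f"way {way_id} missing from extract"
        missing = [ref for ref in refs if ref not in nodes]
        if missing:
            return False, f"way {way_id} references missing node {missing[0]}"
        if refs[0] != refs[-1]:  # a way that closes on itself is already a ring
            endpoints[refs[0]] += 1
            endpoints[refs[-1]] += 1
    # In a set of closed rings every shared endpoint is used an even number of times.
    open_ends = [node for node, count in endpoints.items() if count % 2]
    if open_ends:
        return False, f"ring not closed at node {open_ends[0]}"
    return True, "ok"


ways = {1: [10, 11, 12], 2: [12, 13, 10]}
nodes = {10, 11, 12, 13}
```

With the full data, `relation_is_complete([1, 2], ways, nodes)` returns `(True, "ok")`. Dropping way 2 or node 13, as a tight clip might, makes the check fail.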
An easy solution to this problem is to use a larger extract which you are sure will include all the nodes in the relation members.
Another solution is to simply modify https://github.com/pelias/openstreetmap/blob/master/config/features.js to suit your needs.
If you are using Docker, then you can achieve this without modifying the docker image using a feature called 'bind mounts' which is available in Docker and docker-compose.
Here's an example of how to use bind-mounts in docker-compose to override the file stored inside the container with one from your local machine:
https://github.com/pelias/docker/blob/master/projects/portland-metro/docker-compose.yml#L17
@missinglink
Thank you! Maybe I will make a pull request later. :)
We currently have this features whitelist which can be a little too inclusive sometimes.
eg. Taxiway 'U' at JFK, I doubt anyone will ever want to search this: http://www.openstreetmap.org/way/5784731#map=17/40.65536/-73.78819
[edit] probably best to disable some of the aeroway tags but not all of them https://gist.github.com/missinglink/c74dc7a2ba34bfc4bdcb
This ticket is to revisit the whitelist from the 'old pelias' and give it a second look/refresh.