mapzen / metroextractor-cities

JSON cities bounding boxes for chef-metroextractor
https://mapzen.com/data/metro-extracts
GNU General Public License v3.0

extract for Portland, Oregon contains data outside of bounding box #438

Closed dobratzp closed 8 years ago

dobratzp commented 8 years ago

I'm looking at the extract for Portland, Oregon and it looks like there is some data which doesn't belong.

portland_oregon.osm, line 19849425:

<way id="404118908" version="1" timestamp="2016-03-16T18:44:38Z" changeset="37879771" uid="2607707" user="Marcos Medeiros">
<nd ref="4063882877"/>
<tag k="name" v="Auto Elétrica Ituiutaba"/>
<tag k="shop" v="car_repair"/>
<tag k="phone" v="+55 34 3268-1246"/>
<tag k="building" v="yes"/>
</way>

The Portland extract has only one Node referenced from this Way, and that Node 4063882877 is not in the extract file.

OSM Way 404118908 is actually a closed Way object with 5 nodes denoting a car repair shop in Brazil. Here's what it looks like from the OSM API (still at version 1):

<way id="404118908" visible="true" version="1" changeset="37879771" timestamp="2016-03-16T18:44:38Z" user="Marcos Medeiros" uid="2607707">
<nd ref="4063882880"/>
<nd ref="4063882878"/>
<nd ref="4063882874"/>
<nd ref="4063882871"/>
<nd ref="4063882877"/>
<nd ref="4063882880"/>
<tag k="building" v="yes"/>
<tag k="name" v="Auto Elétrica Ituiutaba"/>
<tag k="phone" v="+55 34 3268-1246"/>
<tag k="shop" v="car_repair"/>
</way>

Neither this Way nor any of its referenced Nodes should appear in the Portland, Oregon extract, since they are clearly outside of the bounding box.
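For anyone who wants to look for the same symptom in another extract, here is a minimal sketch that lists ways referencing nodes missing from the file. It is not part of the extract tooling; the function name `find_dangling_ways` and the inline sample are hypothetical, and it assumes that nodes appear before ways in the `.osm` XML, which is the usual ordering but is not checked.

```python
import io
import xml.etree.ElementTree as ET


def find_dangling_ways(osm_source):
    """Return {way_id: [missing node refs]} for ways whose nodes are absent.

    Hypothetical helper. Assumes every <node> precedes the <way> elements
    that reference it, as is usual in .osm XML files.
    """
    node_ids = set()
    dangling = {}
    # iterparse avoids building the whole tree before we start checking.
    for _event, elem in ET.iterparse(osm_source, events=("end",)):
        if elem.tag == "node":
            node_ids.add(elem.get("id"))
            elem.clear()  # drop attributes and children we no longer need
        elif elem.tag == "way":
            missing = [
                nd.get("ref")
                for nd in elem.findall("nd")
                if nd.get("ref") not in node_ids
            ]
            if missing:
                dangling[elem.get("id")] = missing
            elem.clear()
    return dangling


# Tiny made-up sample: one valid way and one way shaped like the leaked one.
SAMPLE = b"""<osm version="0.6">
  <node id="1" lat="45.5" lon="-122.6"/>
  <way id="10">
    <nd ref="1"/>
  </way>
  <way id="404118908">
    <nd ref="4063882877"/>
    <tag k="building" v="yes"/>
  </way>
</osm>"""

if __name__ == "__main__":
    for way_id, refs in find_dangling_ways(io.BytesIO(SAMPLE)).items():
        print(f"way {way_id} references missing nodes: {', '.join(refs)}")
```

To run it against a real extract, pass the file path (for example `portland_oregon.osm`) instead of the `io.BytesIO` sample. Note that the set of node IDs is held in memory, so a large metro extract will need a correspondingly large amount of RAM.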

buma commented 8 years ago

There are also some nodes and ways from Germany in the extract for DC Baltimore:

 $ grep 'id="149694"' dc-baltimore_maryland.osm_01.osm 
    <way id="149694" version="4" timestamp="2013-01-24T21:57:29Z" changeset="14775106" uid="5042" user="huskytreiber">

Way 149694. This in itself is strange, but the bigger problem is that there are a lot of ways whose nodes don't exist in the extract, and that breaks my loading scripts.

Edit: statistics. This is from a working extract, which seems to be from May of last year:

 osmconvert dc-baltimore_maryland.osm.pbf.OLD --out-statistics
timestamp min: 2005-11-30T04:26:09Z
timestamp max: 2015-05-30T01:53:55Z
lon min: -77.5989999
lon max: -76.0580010
lat min: 38.5390000
lat max: 39.6310000
nodes: 11715588
ways: 1314878
relations: 6459
node id min: 234661
node id max: 3557260796
way id min: 4415891
way id max: 349940888
relation id min: 218
relation id max: 5214556
keyval pairs max: 275
keyval pairs max object: relation 148838
noderefs max: 1970
noderefs max object: way 231269487
relrefs max: 741
relrefs max object: relation 4799101

And the new one, which doesn't work:

osmconvert dc-baltimore_maryland.osm.pbf --out-statistics
timestamp min: 2005-11-30T04:26:09Z
timestamp max: 2016-04-02T00:20:17Z
lon min: -77.5989999
lon max: -76.0580005
lat min: 38.5390000
lat max: 39.6310000
nodes: 12563456
ways: 1698343
relations: 22801
node id min: 234661
node id max: 4091227792
way id min: 531
way id max: 407085579
relation id min: 218
relation id max: 6096987
keyval pairs max: 321
keyval pairs max object: relation 52411
noderefs max: 1983
noderefs max object: way 356133005
relrefs max: 767
relrefs max object: relation 4799101

The strangest thing is that the minimum way ID is smaller in the new extract, which should be more or less impossible for the same bounding box. And the way with the minimum ID, 531, is in England, UK.
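The comparison of the two statistics dumps can also be done mechanically. The sketch below parses the `key: value` lines printed by `osmconvert --out-statistics` and reports any "id min" value that dropped between the old and new extract. The function names `parse_stats` and `lowered_minimums` are hypothetical, the inputs are excerpts of the two dumps above, and a lower minimum ID is only a heuristic hint of leaked elements, not proof.

```python
def parse_stats(text):
    """Parse 'key: value' lines into a dict (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        # Split on the first ': ' only, since timestamps contain colons.
        key, sep, value = line.partition(": ")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def lowered_minimums(old, new):
    """Return {key: (old_value, new_value)} for 'id min' keys that decreased."""
    suspicious = {}
    for key, old_value in old.items():
        if key.endswith("id min") and key in new:
            if int(new[key]) < int(old_value):
                suspicious[key] = (int(old_value), int(new[key]))
    return suspicious


# Excerpts copied from the two statistics dumps in this comment.
OLD_STATS = """timestamp max: 2015-05-30T01:53:55Z
node id min: 234661
way id min: 4415891
relation id min: 218
"""

NEW_STATS = """timestamp max: 2016-04-02T00:20:17Z
node id min: 234661
way id min: 531
relation id min: 218
"""

if __name__ == "__main__":
    old = parse_stats(OLD_STATS)
    new = parse_stats(NEW_STATS)
    for key, (before, after) in lowered_minimums(old, new).items():
        print(f"{key} dropped from {before} to {after}")
```

With the full output of both runs saved to files, the same functions can be fed the file contents instead of the inline excerpts.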

IndyHurt commented 8 years ago

@dobratzp @buma this should be fixed by a "prevent element leaks" pull request. Check out the latest metro extracts available today (4/12/2016) and please let us know if the problem persists.

buma commented 8 years ago

This is definitely fixed for me in the DC Baltimore extract. Thanks for the fix.