omniscale / imposm-parser

Deprecated: Python parser for OpenStreetMap data
http://imposm.org/docs/imposm.parser/latest/
Apache License 2.0

Incorrect parsing when no <node> elements present #19

Closed snakeye closed 7 years ago

snakeye commented 8 years ago

Hello,

I recently ran into an issue with the parser. When the input file has no <node> elements before the first <way>, the parse result is incorrect.

Sample code:

#!/usr/bin/env python

from imposm.parser import OSMParser

class ParseCallback(object):
    def nodes(self, nodes):
        print ' nodes: %d' % len(nodes)

    def ways(self, ways):
        print ' ways: %d' % len(ways)

    def relations(self, relations):
        print ' relations: %d' % len(relations)

    def coords(self, coords):
        print ' coords: %d' % len(coords)

handler = ParseCallback()

parser = OSMParser(concurrency=1,
                   ways_callback=handler.ways,
                   relations_callback=handler.relations,
                   nodes_callback=handler.nodes,
                   coords_callback=handler.coords)

for file in ['Malacca.theme-parks.osm', 'Malacca.theme-parks.2.osm']:
    print file
    parser.parse(file)

Sample file 1 (Malacca.theme-parks.osm):

<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2016-08-24T13:26:03Z" areas="2016-08-23T01:33:02Z"/>

  <way id="170830704">
    <nd ref="1819670323"/>
    <nd ref="1819670381"/>
    <nd ref="1819670388"/>
    <nd ref="1819670409"/>
    <nd ref="1819670318"/>
    <nd ref="1819670351"/>
    <nd ref="1819670347"/>
    <nd ref="1819670376"/>
    <nd ref="1819670323"/>
    <tag k="tourism" v="theme_park"/>
  </way>
  <node id="1819670323" lat="2.2009914" lon="102.2490477"/>
  <node id="1819670376" lat="2.2008949" lon="102.2491738"/>
  <node id="1819670318" lat="2.2009512" lon="102.2495493"/>
  <node id="1819670347" lat="2.2008600" lon="102.2492811"/>
  <node id="1819670351" lat="2.2008846" lon="102.2494478"/>
  <node id="1819670381" lat="2.2012250" lon="102.2495819"/>
  <node id="1819670388" lat="2.2011070" lon="102.2496409"/>
  <node id="1819670409" lat="2.2010235" lon="102.2496217"/>

</osm>

Sample file 2 (Malacca.theme-parks.2.osm; note the dummy node added before the way):

<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2016-08-24T13:26:03Z" areas="2016-08-23T01:33:02Z"/>

  <node id="0" lat="0" lon="0"/>
  <way id="170830704">
    <nd ref="1819670323"/>
    <nd ref="1819670381"/>
    <nd ref="1819670388"/>
    <nd ref="1819670409"/>
    <nd ref="1819670318"/>
    <nd ref="1819670351"/>
    <nd ref="1819670347"/>
    <nd ref="1819670376"/>
    <nd ref="1819670323"/>
    <tag k="tourism" v="theme_park"/>
  </way>
  <node id="1819670323" lat="2.2009914" lon="102.2490477"/>
  <node id="1819670376" lat="2.2008949" lon="102.2491738"/>
  <node id="1819670318" lat="2.2009512" lon="102.2495493"/>
  <node id="1819670347" lat="2.2008600" lon="102.2492811"/>
  <node id="1819670351" lat="2.2008846" lon="102.2494478"/>
  <node id="1819670381" lat="2.2012250" lon="102.2495819"/>
  <node id="1819670388" lat="2.2011070" lon="102.2496409"/>
  <node id="1819670409" lat="2.2010235" lon="102.2496217"/>

</osm>

Results:

Malacca.theme-parks.osm
 coords: 7
 coords: 1
 ways: 0
 nodes: 0
 relations: 0
Malacca.theme-parks.2.osm
 coords: 8
 coords: 1
 ways: 1
 nodes: 0
 relations: 0

As you can see, the way from the first file is not parsed.
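To compare the two sample files beyond the dummy node, it can help to list the order in which the top-level elements appear. The following is only a diagnostic sketch using the Python 3 standard library, not part of imposm.parser; the helper name element_order and the shortened sample string are made up for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import groupby


def element_order(xml_text):
    """Return runs of top-level OSM element kinds in document order.

    Hypothetical helper for illustration; it only looks at the direct
    children of <osm> and ignores everything except node/way/relation.
    """
    root = ET.fromstring(xml_text)
    kinds = [el.tag for el in root if el.tag in ("node", "way", "relation")]
    # Collapse consecutive elements of the same kind into (kind, count) pairs.
    return [(kind, len(list(run))) for kind, run in groupby(kinds)]


# Shortened stand-in for sample file 1: the way comes before its nodes.
sample = """<osm version="0.6">
  <way id="170830704">
    <nd ref="1819670323"/>
    <nd ref="1819670381"/>
    <tag k="tourism" v="theme_park"/>
  </way>
  <node id="1819670323" lat="2.2009914" lon="102.2490477"/>
  <node id="1819670381" lat="2.2012250" lon="102.2495819"/>
</osm>"""

print(element_order(sample))
```

For a file laid out like sample file 1, the first run is a way and the nodes only come after it, whereas sample file 2 starts with a node run.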

mmd-osm commented 7 years ago

That's probably an issue with the Overpass query: the OSM XML format requires all nodes to come first, followed by ways. The typical overpass turbo style (out body; >; out skel;) breaks this sequence. Try (._;>;);out meta; instead.
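If re-running the Overpass query is not an option, an already downloaded file could be reordered before parsing. This is a minimal sketch assuming Python 3 and a file small enough to load into memory. The function name reorder_osm and the ORDER table are hypothetical, and whether the reordered file then parses correctly with imposm.parser has not been verified here.

```python
import xml.etree.ElementTree as ET

# Rank of each element kind in the expected sequence: nodes, ways, relations.
ORDER = {"node": 0, "way": 1, "relation": 2}


def reorder_osm(xml_text):
    """Return the OSM XML with nodes first, then ways, then relations.

    Hypothetical helper for illustration. Other top-level elements
    (such as <note> and <meta>) get rank -1 and therefore stay in front.
    """
    root = ET.fromstring(xml_text)
    children = list(root)
    # list.sort() is stable, so elements of the same kind keep their order.
    children.sort(key=lambda el: ORDER.get(el.tag, -1))
    root[:] = children
    return ET.tostring(root, encoding="unicode")


# Shortened stand-in for sample file 1: the way comes before its nodes.
sample = """<osm version="0.6">
  <way id="170830704">
    <nd ref="1819670323"/>
    <nd ref="1819670381"/>
    <tag k="tourism" v="theme_park"/>
  </way>
  <node id="1819670323" lat="2.2009914" lon="102.2490477"/>
  <node id="1819670381" lat="2.2012250" lon="102.2495819"/>
</osm>"""

reordered = reorder_osm(sample)
print([el.tag for el in ET.fromstring(reordered)])
```

The reordered string can then be written to a new file and passed to parser.parse() in place of the original.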

snakeye commented 7 years ago

Yes, makes sense. Thank you!