osm2pgsql-dev / osm2pgsql

OpenStreetMap data to PostgreSQL converter
https://osm2pgsql.org
GNU General Public License v2.0
1.5k stars 474 forks source link

Incorrect multi backend linestring geoms #274

Closed pnorman closed 9 years ago

pnorman commented 9 years ago

When importing data to a linestring table with the multi backend the linestrings go crazy. crazy linestrings This issue is specific to the multi-backend and specific to linestrings.

There is also a strong performance decrease when adding a linestring table that may be related.

pnorman commented 9 years ago

Just a thought as I'm reducing to a testcase - is this related to polygon/line in the return values of the ways function?

pnorman commented 9 years ago

A shorter testcase

[
  {
    "name": "test_line",
    "type": "line",
    "tagtransform": "multi_line.lua",
    "tagtransform-node-function": "drop_all",
    "tagtransform-way-function": "test_ways",
    "tagtransform-relation-function": "drop_all",
    "tagtransform-relation-member-function": "drop_all",
    "tags": [
      {"name": "foo", "type": "text"}
    ]
  }
]

multi_line.lua

function drop_all (...)
  return 1, {}
end

-- A generic way to process ways, given a function which determines if tags are interesting
-- Takes an optional function to process tags. Always says it's a polygon if there's matching tags
function test_ways (kv, num_keys)
  tags = {}
  return 0, tags, 1, 0
end

Edit: Testcase had a bug and was not a reduction of the bug.

pnorman commented 9 years ago

So, this is a strange one. Using the above style which creates a linestring for any way, I started with the relation/full for the BC 91 and started deleting ways to get to a smaller test case. Being a result from cgimap I made sure to use --slim.

Eventually with an XML file with all the nodes and two ways (35354811, 35354763) I had 35354763 import with the wrong geometry. I then deleted all other nodes, renumbered them, and got the following XML, which reproduces it.

<osm version="0.6">
 <node id="1" lat="49.0919077" lon="-122.8938177"/>
 <node id="2" lat="49.0925635" lon="-122.8941386"/>
 <node id="3" lat="49.1694335" lon="-122.9990449"/>
 <node id="4" lat="49.1694438" lon="-122.9983216"/>
 <node id="5" lat="49.0922320" lon="-122.8939654"/>
 <way id="1">
  <nd ref="1"/>
  <nd ref="5"/>
  <nd ref="2"/>
 </way>
 <way id="2">
  <nd ref="3"/>
  <nd ref="4"/>
 </way>
</osm>

Some other observations

pnorman commented 9 years ago
gis=# select osm_id, st_astext(way) from test_line;
 osm_id |                                      st_astext
--------+-------------------------------------------------------------------------------------
      1 | LINESTRING(-122.8938176 49.0919077,-122.8939653 49.092232,-122.8941385 49.0925635)
      2 | LINESTRING(-122.9990448 49.1694335,-122.9983215 49.1694438,-122.8941385 49.0925635)
(2 rows)
lonvia commented 9 years ago

Thanks, that made it easy to find the bug in way_helper::set. As the size of the node_cache is only changed here when the ways is longer than the previously requested way, the return here is obviously too large if the way is smaller. Unfortunately, node_cache.size() is used in output-multi.cpp as well.

The correct fix would be to make nodes_get_list use std::vector and set the size correctly. Undecided right now what the second best fix is. Probably a second resize() just before the return using the result of nodes_get_list.

pnorman commented 9 years ago

the linestrings are the positions of nodes 1, 5, 2 for way 1 and nodes 3, 4, 2 for way 2.