pgRouting / osm2pgrouting

Import tool for OpenStreetMap data to pgRouting database
https://pgrouting.org
GNU General Public License v2.0
286 stars 113 forks source link

Way not processed when containing 	 character #259

Open PanierAvide opened 4 years ago

PanierAvide commented 4 years ago

Hello,

First, thanks for all the work done around osm2pgrouting, which is an amazing tool. I'm having some errors when a tag value contains the 	 character (seems to be tabulation). When a way contain such tag, it creates an error on PostgreSQL side (extra data after last expected column), and way isn't inserted in database. I suspect that this character is interpreted as a raw tabulation when SQL command is sent to PostgreSQL, making it like the column separator.

Best regards.

Bug context

Example OSM file

<?xml version='1.0' encoding='UTF-8'?>
<osm version="0.6" generator="osmium/1.11.0">
  <bounds minlat="41.257249" minlon="-6.3" maxlat="51.32937" maxlon="9.9
9206"/>
  <node id="1" lat="49.1159663" lon="2.5549495"/>
  <node id="2" lat="49.1102941" lon="2.5521725"/>
  <way id="3">
    <nd ref="1" />
    <nd ref="2" />
    <tag k="highway" v="residential" />
    <tag k="name" v="Rond-point du&#x9;Croissant" />
  </way>
</osm>

Error

Export Ways ...
    Processing 1 ways:
ERROR:  extra data after last expected column
CONTEXT:  COPY __ways3735, line 1: " 112    3   50  50  0   UNKNOWN 2.5 0.0063155032926911527   2.5549495   49.11596632.5521725 49.1102941  1   2   srid=..."

Full log

[adrien@pc-portable Technique]$ psql -U postgres -c "create database pgroutingdebug"
CREATE DATABASE
[adrien@pc-portable Technique]$ psql -U postgres -d pgroutingdebug -c "create extension postgis; create extension pgrouting; create extension hstore;"
CREATE EXTENSION
[adrien@pc-portable Technique]$ osm2pgrouting -f osm2pgrouting_bug.osm -c /usr/local/share/osm2pgrouting/mapconfig_for_cars.xml --tags -d pgroutingdebug -U postgres
Execution starts at: Mon Nov 18 09:11:23 2019

***************************************************
           COMMAND LINE CONFIGURATION             *
***************************************************
Filename = osm2pgrouting_bug.osm
Configuration file = /usr/local/share/osm2pgrouting/mapconfig_for_cars.xml
host = localhost
port = 5432
dbname = pgroutingdebug
username = postgres
schema= 
prefix = 
suffix = 
Don't drop tables
Don't create indexes
Don't add OSM nodes
***************************************************
Testing database connection: pgroutingdebug
database connection successful: pgroutingdebug
Connecting to the database
connection success

Creating tables...
TABLE: ways_vertices_pgr created ... OK.
TABLE: ways created ... OK.
TABLE: pointsofinterest created ... OK.
TABLE: configuration created ... OK.
Opening configuration file: /usr/local/share/osm2pgrouting/mapconfig_for_cars.xml
    Parsing configuration

Exporting configuration ...
  - Done 
Counting lines ...
  - Done 
Opening data file: osm2pgrouting_bug.osm    total lines: 13
    Parsing data

End Of file

    Finish Parsing data

Adding auxiliary tables to database...

Export Ways ...
    Processing 1 ways:
ERROR:  extra data after last expected column
CONTEXT:  COPY __ways3735, line 1: " 112    3   50  50  0   UNKNOWN 2.5 0.0063155032926911527   2.5549495   49.11596632.5521725 49.1102941  1   2   srid=..."
[**************************************************|] (100%) Total processed: 1  Vertices inserted: 0   Split ways inserted 0

Creating indexes ...

Processing Points of Interest ...
#########################
size of streets: 1
Execution started at: Mon Nov 18 09:11:23 2019
Execution ended at:   Mon Nov 18 09:11:23 2019
Elapsed time: 0.255 Seconds.
User CPU time: -> 0.019356 seconds
#########################
[adrien@pc-portable Technique]$ psql -d pgroutingdebug -U postgres -c "select count(*) from ways;"
 count 
-------
     0
(1 ligne)
ildar15 commented 3 years ago

Yes, I experienced that too. And many more similar issues while using large osm file. It fails because is encoding for TAB. And code is actually using TAB as a field separator in SQL for inserts. So, this extra TAB should either be escaped properly or just cleaned as a pre-processing step, for example

grep --in-place -e 's/&#9;/ /g'  your_file.osm 

to replace it with SPACE.
That process, of finding small issues like that and trying to fix them and rerun again, would be easier to use if instead of executing whole process in one run I could restart from the point where I was left. For example, if I have already ingested nodes, I would prefer next time start from osm_ways. Even better, if I could get SQL statements generated for inserting data, instead of loading data right away.

hm-lijojohn commented 2 years ago

How to fix this issue in Mac and Windows environment. I am getting this issue for most of the data now a days..

marcusyoung commented 2 years ago

Bump