osm2pgsql-dev / osm2pgsql

OpenStreetMap data to PostgreSQL converter
https://osm2pgsql.org
GNU General Public License v2.0
1.5k stars 474 forks source link

bad result during COPY #159

Closed nahamada closed 3 years ago

nahamada commented 10 years ago

Hello,

I cloned Osm2pgsql from github and tried to import some data with it. It worked fine with a for Belgium and Luxembourg extract, so i decided to increase the size to the Netherlands (extract from http://download.geofabrik.de/ ).

It when bad from here: it crash after processing 90M nodes , but no ways or relations

./osm2pgsql -l -x -v  --number-processes 4  --cache-strategy sparse -p carto_ne  --style default.style -d carto -s -H <host>  -U <User> download/netherlands-latest.osm.pbf                                                                                                  
osm2pgsql SVN version 0.85.0 (64bit id space)                                                                                                                          

Using projection SRS 4326 (Latlong)
Setting up table: carto_ne_point   
NOTICE:  table "carto_ne_point" does not exist, skipping
NOTICE:  table "carto_ne_point_tmp" does not exist, skipping
Setting up table: carto_ne_line                             
NOTICE:  table "carto_ne_line" does not exist, skipping     
NOTICE:  table "carto_ne_line_tmp" does not exist, skipping 
Setting up table: carto_ne_polygon                          
NOTICE:  table "carto_ne_polygon" does not exist, skipping  
NOTICE:  table "carto_ne_polygon_tmp" does not exist, skipping
Setting up table: carto_ne_roads                              
NOTICE:  table "carto_ne_roads" does not exist, skipping      
NOTICE:  table "carto_ne_roads_tmp" does not exist, skipping  
Using built-in tag processing pipeline                        
Allocating memory for sparse node cache                       
Node-cache: cache=800MB, maxblocks=102400*8192, allocation method=9
Mid: pgsql, scale=10000000 cache=800                               
Setting up table: carto_ne_nodes                                   
NOTICE:  table "carto_ne_nodes" does not exist, skipping           
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "carto_ne_nodes_pkey" for table "carto_ne_nodes"
Setting up table: carto_ne_ways                                                                                
NOTICE:  table "carto_ne_ways" does not exist, skipping                                                        
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "carto_ne_ways_pkey" for table "carto_ne_ways"  
Setting up table: carto_ne_rels                                                                                
NOTICE:  table "carto_ne_rels" does not exist, skipping                                                        
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "carto_ne_rels_pkey" for table "carto_ne_rels"  

Reading in file: download/netherlands-latest.osm.pbf
Processing: Node(94050k 26.1k/s) Way(0k 0.00k/s) Relation(0 0.00/s) 
carto_ne_line - bad result during COPY, data 4009112    \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N     \N       \N      \N      \N      \N      \N      \N      \N      yes     \N      \N      footway \N      \N      \N      \N      \N      \N      \N      \N      \N     \N       \N      Lijnpadstraat   \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N      \N     \N       \N      \N      \N      \N      \N      \N      \N      \N      \N      0       \N      \N      \N      \N      \N      3       \N      SRID=4326;LINESTRING (5.1221357000000003 52.0801506999999972, 5.1218190000000003 52.0802778999999987)                                                                                           

Error occurred, cleaning up

I checked the MD5, ok. When I tried again, it is now stuck on NOTICE: table "carto_ne_line_tmp" does not exist, skipping

After posting to the mailing list, they advice me to drop the db and retry without cache-strategy (I add flat-nodes for space trouble), but got the same issue.

I'm running on CentOS 6.4, and PostGres 9.0.6 for the database. The PostGIS_full_version() is "POSTGIS="1.5.3" GEOS="3.3.0-CAPI-1.7.0" PROJ="Rel. 4.7.1, 23 September 2009" LIBXML="2.6.26" USE_STATS"

Hope this will help.

Nourdine.

nahamada commented 10 years ago

Actually, when I add a PQErrorMessage() where it crash, It say it's a "timeout connection to the database". When I give it a little more RAM, the processing go faster, I can process the nodes, ways, etc.... but I timeout again during a commit. As I said before, when I tried with small extract, it went smoothly, because they wasn't enough data to timeout.

I tried to modified pgsql.c and middle-pgsql.c to add PQstatus() and PQreset() where if failed, but without success.

Do you have any idea why I timeout on Postgres : osm2pgsql connexion, postgres configuration and how to solve it ?

pnorman commented 10 years ago

Anything in the postgres error logs?

nahamada commented 10 years ago

I've got a lot different, but it's basically a lost connection on different stage. The most interesting seems to be:

ERROR:  unexpected message type 0x58 during COPY from stdin
CONTEXT:  COPY carto_ne_polygon, line 1
STATEMENT:  COPY carto_ne_polygon (osm_id,"access", ...) FROM STDIN
LOG:  unexpected EOF on client connection
.....
LOG:  unexpected EOF on client connection
LOG:  could not send data to client: Broken pipe
....
LOG:  checkpoints are occurring too frequently (3 seconds apart)
HINT:  Consider increasing the configuration parameter "checkpoint_segments".
.....
LOG:  could not receive data from client: Connection timed out
LOG:  unexpected EOF on client connection
.....
ERROR:  unexpected EOF on client connection
CONTEXT:  COPY carto_osm_de_polygon, line 1
STATEMENT:  COPY carto_osm_de_polygon (osm_id,"access", ...) FROM STDIN
ERROR:  unexpected EOF on client connection
CONTEXT:  COPY carto_osm_de_nodes, line 1
STATEMENT:  COPY carto_osm_de_nodes FROM STDIN;

ERROR:  unexpected EOF on client connection
CONTEXT:  COPY carto_osm_de_ways, line 1
STATEMENT:  COPY carto_osm_de_ways FROM STDIN;

ERROR:  unexpected EOF on client connection
CONTEXT:  COPY carto_osm_de_rels, line 1
STATEMENT:  COPY carto_osm_de_rels FROM STDIN;
pnorman commented 10 years ago

ERROR: unexpected message type 0x58 during COPY from stdin

Is this the first error?

nahamada commented 10 years ago

Yes

pnorman commented 10 years ago

Possibly relevant

zeppelinen commented 6 years ago

The same problem encountered during import of 2GB pbf. Default postgres 9.5 settings, host Ubuntu Xenial, Openstack VM with 48GB RAM, SSD storage and 12 cores.

Error log:

2018-10-21 21:48:03 == Import data
osm2pgsql version 0.93.0-dev (64 bit id space)

Using projection SRS 4326 (Latlong)
NOTICE:  table "place" does not exist, skipping
Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=7467MB, maxblocks=119472*65536, allocation method=11
Mid: pgsql, cache=7467
Setting up table: planet_osm_nodes
Setting up table: planet_osm_ways
Setting up table: planet_osm_rels

Reading in file: /data/russia-latest.osm.pbf
Using PBF parser.
Processing: Node(7530k 134.5k/s) Way(0k 0.00k/s) Relation(0 0.00/s)node cache: stored: 7616000(100.00%), storage efficiency: 50.51% (dense blocks: 122, sparse nodes: 7039215), hit rate: -nan%
ERROR:  unexpected message type 0x58 during COPY from stdin
CONTEXT:  COPY planet_osm_rels, line 1
STATEMENT:  COPY planet_osm_rels FROM STDIN;

ERROR:  unexpected message type 0x58 during COPY from stdin
CONTEXT:  COPY planet_osm_ways, line 1
STATEMENT:  COPY planet_osm_ways FROM STDIN;

LOG:  could not send data to client: Broken pipe
STATEMENT:  COPY planet_osm_rels FROM STDIN;

LOG:  could not send data to client: Broken pipe
STATEMENT:  COPY planet_osm_ways FROM STDIN;

FATAL:  connection to client lost
FATAL:  connection to client lost
Osm2pgsql failed due to ERROR: PBF error: truncated data (EOF encountered)
LOG:  incomplete message from client
CONTEXT:  COPY place, line 48838
STATEMENT:  COPY place (osm_type, osm_id, class, type, name, admin_level, address, extratags, geometry) FROM STDIN
ERROR:  unexpected EOF on client connection with an open transaction
CONTEXT:  COPY place, line 48838
STATEMENT:  COPY place (osm_type, osm_id, class, type, name, admin_level, address, extratags, geometry) FROM STDIN
LOG:  could not send data to client: Broken pipe
STATEMENT:  COPY place (osm_type, osm_id, class, type, name, admin_level, address, extratags, geometry) FROM STDIN
ERROR: Error executing external command: /app/src/build/osm2pgsql/osm2pgsql -lsc -O gazetteer --hstore --number-processes 1 -C 7467 -P 5432 -d nominatim /data/russia-latest.osm.pbf
Error executing external command: /app/src/build/osm2pgsql/osm2pgsql -lsc -O gazetteer --hstore --number-processes 1 -C 7467 -P 5432 -d nominatim /data/russia-latest.osm.pbf
FATAL:  terminating connection because protocol synchronization was lost
ERROR:  unexpected message type 0x58 during COPY from stdin
CONTEXT:  COPY planet_osm_nodes, line 7616001
STATEMENT:  COPY planet_osm_nodes FROM STDIN;

LOG:  could not send data to client: Broken pipe
STATEMENT:  COPY planet_osm_nodes FROM STDIN;

waiting for server to shut down....LOG:  received fast shutdown request
LOG:  aborting any active transactions
LOG:  autovacuum launcher shutting down
FATAL:  terminating connection due to administrator command
LOG:  shutting down
LOG:  database system is shut down
 done
server stopped
and-semakin commented 5 years ago

Found this issue googling for "ERROR: unexpected message type 0x58 during COPY from stdin".

I also experiencing some issue with this error.

osm2pgsql logs:

osm2pgsql version 0.96.0 (64 bit id space)

Using built-in tag processing pipeline
Using projection SRS 4326 (Latlong)
Setting up table: planet_osm_point
Setting up table: planet_osm_line
Setting up table: planet_osm_polygon
Setting up table: planet_osm_roads
Allocating memory for dense node cache
Allocating dense node cache in one big chunk
Allocating memory for sparse node cache
Sharing dense sparse
Node-cache: cache=800MB, maxblocks=12800*65536, allocation method=3

Reading in file: /tmp/tmp97xxauxj
Using XML parser.
  parse time: 0s
Node stats: total(3), max(428204965) in 0s
Way stats: total(0), max(0) in 0s
Relation stats: total(0), max(0) in 0s
Committing transaction for planet_osm_point
Committing transaction for planet_osm_line
Committing transaction for planet_osm_polygon
Committing transaction for planet_osm_roads
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
Using built-in tag processing pipeline
node cache: stored: 3(100.00%), storage efficiency: 0.04% (dense blocks: 1, sparse nodes: 0), hit rate: -nan%
Osm2pgsql failed due to ERROR: SET synchronous_commit TO off; failed: ERROR:  query_wait_timeout
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

PostgreSQL logs:

CONTEXT:  COPY planet_osm_point, line 1
STATEMENT:  COPY planet_osm_point (osm_id,"access","addr:housename","addr:housenumber","addr:interpolation","admin_level","aerialway","aeroway","amenity","area","barrier","bicycle","brand","bridge","boundary","building","capital","construction","covered","culvert","cutting","denomination","disused","ele","embankment","foot","generator:source","harbour","highway","historic","horse","intermittent","junction","landuse","layer","leisure","lock","man_made","military","motorcar","name","natural","office","oneway","operator","place","population","power","power_source","public_transport","railway","ref","religion","route","service","shop","sport","surface","toll","tourism","tower:type","tunnel","water","waterway","wetland","width","wood","z_order",tags,way) FROM STDIN
ERROR:  unexpected message type 0x58 during COPY from stdin
CONTEXT:  COPY planet_osm_line, line 1
STATEMENT:  COPY planet_osm_line (osm_id,"access","addr:housename","addr:housenumber","addr:interpolation","admin_level","aerialway","aeroway","amenity","area","barrier","bicycle","brand","bridge","boundary","building","construction","covered","culvert","cutting","denomination","disused","embankment","foot","generator:source","harbour","highway","historic","horse","intermittent","junction","landuse","layer","leisure","lock","man_made","military","motorcar","name","natural","office","oneway","operator","place","population","power","power_source","public_transport","railway","ref","religion","route","service","shop","sport","surface","toll","tourism","tower:type","tracktype","tunnel","water","waterway","wetland","width","wood","z_order","way_area",tags,way) FROM STDIN
ERROR:  unexpected message type 0x58 during COPY from stdin
CONTEXT:  COPY planet_osm_polygon, line 1
STATEMENT:  COPY planet_osm_polygon (osm_id,"access","addr:housename","addr:housenumber","addr:interpolation","admin_level","aerialway","aeroway","amenity","area","barrier","bicycle","brand","bridge","boundary","building","construction","covered","culvert","cutting","denomination","disused","embankment","foot","generator:source","harbour","highway","historic","horse","intermittent","junction","landuse","layer","leisure","lock","man_made","military","motorcar","name","natural","office","oneway","operator","place","population","power","power_source","public_transport","railway","ref","religion","route","service","shop","sport","surface","toll","tourism","tower:type","tracktype","tunnel","water","waterway","wetland","width","wood","z_order","way_area",tags,way) FROM STDIN
LOG:  could not send data to client: Broken pipe
FATAL:  connection to client lost
LOG:  could not send data to client: Broken pipe
FATAL:  connection to client lost
ERROR:  unexpected message type 0x58 during COPY from stdin
CONTEXT:  COPY planet_osm_point, line 1
STATEMENT:  COPY planet_osm_point (osm_id,"access","addr:housename","addr:housenumber","addr:interpolation","admin_level","aerialway","aeroway","amenity","area","barrier","bicycle","brand","bridge","boundary","building","capital","construction","covered","culvert","cutting","denomination","disused","ele","embankment","foot","generator:source","harbour","highway","historic","horse","intermittent","junction","landuse","layer","leisure","lock","man_made","military","motorcar","name","natural","office","oneway","operator","place","population","power","power_source","public_transport","railway","ref","religion","route","service","shop","sport","surface","toll","tourism","tower:type","tunnel","water","waterway","wetland","width","wood","z_order",tags,way) FROM STDIN
LOG:  could not send data to client: Broken pipe
FATAL:  connection to client lost
LOG:  could not send data to client: Broken pipe
FATAL:  connection to client lost
ERROR:  unexpected message type 0x58 during COPY from stdin
CONTEXT:  COPY planet_osm_line, line 1
STATEMENT:  COPY planet_osm_line (osm_id,"access","addr:housename","addr:housenumber","addr:interpolation","admin_level","aerialway","aeroway","amenity","area","barrier","bicycle","brand","bridge","boundary","building","construction","covered","culvert","cutting","denomination","disused","embankment","foot","generator:source","harbour","highway","historic","horse","intermittent","junction","landuse","layer","leisure","lock","man_made","military","motorcar","name","natural","office","oneway","operator","place","population","power","power_source","public_transport","railway","ref","religion","route","service","shop","sport","surface","toll","tourism","tower:type","tracktype","tunnel","water","waterway","wetland","width","wood","z_order","way_area",tags,way) FROM STDIN
LOG:  could not send data to client: Broken pipe
FATAL:  connection to client lost
ERROR:  unexpected message type 0x58 during COPY from stdin
CONTEXT:  COPY planet_osm_polygon, line 1
STATEMENT:  COPY planet_osm_polygon (osm_id,"access","addr:housename","addr:housenumber","addr:interpolation","admin_level","aerialway","aeroway","amenity","area","barrier","bicycle","brand","bridge","boundary","building","construction","covered","culvert","cutting","denomination","disused","embankment","foot","generator:source","harbour","highway","historic","horse","intermittent","junction","landuse","layer","leisure","lock","man_made","military","motorcar","name","natural","office","oneway","operator","place","population","power","power_source","public_transport","railway","ref","religion","route","service","shop","sport","surface","toll","tourism","tower:type","tracktype","tunnel","water","waterway","wetland","width","wood","z_order","way_area",tags,way) FROM STDIN
LOG:  could not send data to client: Broken pipe
ERROR:  unexpected message type 0x58 during COPY from stdin
CONTEXT:  COPY planet_osm_roads, line 1
STATEMENT:  COPY planet_osm_roads (osm_id,"access","addr:housename","addr:housenumber","addr:interpolation","admin_level","aerialway","aeroway","amenity","area","barrier","bicycle","brand","bridge","boundary","building","construction","covered","culvert","cutting","denomination","disused","embankment","foot","generator:source","harbour","highway","historic","horse","intermittent","junction","landuse","layer","leisure","lock","man_made","military","motorcar","name","natural","office","oneway","operator","place","population","power","power_source","public_transport","railway","ref","religion","route","service","shop","sport","surface","toll","tourism","tower:type","tracktype","tunnel","water","waterway","wetland","width","wood","z_order","way_area",tags,way) FROM STDIN
FATAL:  connection to client lost
LOG:  could not send data to client: Broken pipe
FATAL:  connection to client lost
ERROR:  unexpected message type 0x58 during COPY from stdin
...

Full log here: https://gist.github.com/and-semakin/46270e9289f1bb46a39bc879f465cbcd

I don't know if it matters or not but I'm connecting to PostgreSQL through pgbouncer in session pooling mode.

Version Info:

PostgreSQL 9.5.11 on x86_64-pc-linux-gnu (Debian 9.5.11-1.pgdg80+1), compiled by gcc (Debian 4.9.2-10) 4.9.2, 64-bit

POSTGIS="2.3.7 r16523" PGSQL="95" GEOS="3.4.2-CAPI-1.8.2 r3921" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.10.1, released 2013/08/26" LIBXML="2.9.1" LIBJSON="0.11.99" RASTER

osm2pgsql version 0.96.0

pgbouncer version 1.9.0
joto commented 3 years ago

We never got a clear handle on this but it might be that this was just symptoms of timeouts or something like that. And the code has changed so much over the years that it doesn't make sense to keep this open. If you experience anything like this reproducably, please open a new issue describing exactly what has happened.