osm2pgsql-dev / osm2pgsql

OpenStreetMap data to PostgreSQL converter
https://osm2pgsql.org
GNU General Public License v2.0

osm2pgsql crashes (silently, but creates windows log event) when importing bz2 data #903

Closed BadLambdaJamma closed 4 years ago

BadLambdaJamma commented 5 years ago

I have downloaded the latest 64-bit Windows build from AppVeyor (osm2pgsql_Release_x64.zip). I am running on a fully patched 64-bit Windows 10 system with 6GB of memory and a Core i7 CPU.

When I try to import a .bz2 file from OSM, osm2pgsql silently fails but places this entry in the Windows event log:

Faulting application name: osm2pgsql.exe, version: 0.0.0.0, time stamp: 0x5c3fb2d5
Faulting module name: LIBBZ2.dll, version: 0.0.0.0, time stamp: 0x5b046100
Exception code: 0xc0000409
Fault offset: 0x00000000000133f0
Faulting process id: 0x3078
Faulting application start time: 0x01d4aec9e3915d57
Faulting application path: C:\osm2pgsql\osm2pgsql.exe
Faulting module path: C:\osm2pgsql\LIBBZ2.dll
Report Id: 991eb125-a060-4e49-a3f2-1a7537a6e704
Faulting package full name:
Faulting package-relative application ID:

Is this a known issue? Workarounds? Better off using the 'nix version?

Thanks
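
For readers who want to pull the relevant fields out of an event like the one above, here is a minimal Python sketch. The function name `parse_event` and the two-entry lookup table are illustrative assumptions, not part of osm2pgsql. The exception code 0xc0000409 is the NTSTATUS value STATUS_STACK_BUFFER_OVERRUN; Windows also reports this status for fail-fast aborts, so by itself it only says that the process was terminated deliberately inside LIBBZ2.dll, not why.

```python
# NTSTATUS values as defined in Microsoft's ntstatus.h.
# Only two common ones are listed here; this is not a complete table.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def parse_event(text):
    """Split the 'Key: value' lines of an Application Error event into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so paths like C:\... stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Excerpt of the event quoted in this report.
SAMPLE = r"""Faulting module name: LIBBZ2.dll, version: 0.0.0.0, time stamp: 0x5b046100
Exception code: 0xc0000409
Fault offset: 0x00000000000133f0
Faulting module path: C:\osm2pgsql\LIBBZ2.dll"""

event = parse_event(SAMPLE)
code = int(event["Exception code"], 16)
print(event["Faulting module path"], NTSTATUS_NAMES.get(code, hex(code)))
```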

woodpeck commented 5 years ago

It is not a known issue. A possible workaround is using the .pbf file instead of the .bz2 file, since the error seems to originate in the bz2 library, and pbf is faster to process anyway. The Linux version is not inherently better, but certainly gets more testing.

BadLambdaJamma commented 5 years ago

I'll try that and circle back with the result. Thank You

lonvia commented 5 years ago

@joto and I actually stumbled upon this a couple of weeks ago. There is some issue with the bzip2 library from Miniconda that we use in the AppVeyor build. We were not able to track down the problem using AppVeyor and lack the environment to test locally. Maybe @alirdn can help?

You can use pbf or xml.gz. Both work fine on Windows afaik. You can also compile your Windows binary from scratch with a locally available bz2 library (the one from NuGet is known to work better). Or use *nix, that's fine as well.
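
If only the .osm.bz2 file is at hand, one way to follow the xml.gz suggestion is to recompress the file outside of osm2pgsql, so the bundled LIBBZ2.dll is never involved. The following is a rough sketch using only Python's standard library; the function name `bz2_to_gz` and the chunk size are assumptions made for illustration.

```python
import bz2
import gzip
from pathlib import Path


def bz2_to_gz(src, dst, chunk_size=1 << 20):
    """Stream-decompress src (.osm.bz2) and recompress it into dst (.osm.gz).

    The data is copied in chunks, so the whole file never has to fit in memory.
    Returns the number of uncompressed bytes copied.
    """
    copied = 0
    with bz2.open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
    return copied
```

A call such as `bz2_to_gz(Path("xxx.osm.bz2"), Path("xxx.osm.gz"))` would then produce a file that can be passed to osm2pgsql in place of the original. Python's bz2 module uses its own copy of the library, so a clean run here also suggests the input file itself is not damaged.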

mmd-osm commented 5 years ago

If we had osm2pgsql as Docker image, you could simply run it on Windows 10 via Docker for Windows (with Hyper V support). This shouldn't be too difficult to set up, as most relevant commands to run are already included in .travis.yml anyway. See #583 for some ideas.

SomeoneElseOSM commented 5 years ago

For info https://github.com/Overv/openstreetmap-tile-server/blob/master/README.md is a docker image that contains osm2pgsql (and a whole lot of other stuff). It might be useful at least as a starting point. I didn't create it but did write some info around it here. Another option for Windows 10 is of course Windows Subsystem for Linux - that should "just work" if you're running Windows 10.

ElyDotDev commented 5 years ago

Hi all, I just created two versions with two different bzip2 sources, NuGet and philr/bzip2-windows. Both of them failed to run with this error:

Reading in file: xxx.osm.bz2
Using XML parser.
node cache: stored: 0(-nan(ind)%), storage efficiency: -nan(ind)% (dense blocks: 0, sparse nodes: 0), hit rate: -nan(ind)%
Osm2pgsql failed due to ERROR: bzip2 error: read failed: -7

Also, the current build, which uses the conda-provided lib, does not produce any error message, just a crash dialog. So they behave differently.

I currently have no idea what the main cause of the issue is. I will investigate in the next few days and try to find a solution.

You can find the build artifacts below:

Nuget philr/bzip2-windows
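
For reference, the number in "bzip2 error: read failed: -7" can be looked up in the return codes that bzlib.h defines. The sketch below assumes that the value printed by osm2pgsql is the raw bzlib return code; the helper name `describe_bz_error` is made up for this example.

```python
# Return codes as defined in bzlib.h (bzip2 1.0.x).
BZ_RETURN_CODES = {
    0: "BZ_OK",
    1: "BZ_RUN_OK",
    2: "BZ_FLUSH_OK",
    3: "BZ_FINISH_OK",
    4: "BZ_STREAM_END",
    -1: "BZ_SEQUENCE_ERROR",
    -2: "BZ_PARAM_ERROR",
    -3: "BZ_MEM_ERROR",
    -4: "BZ_DATA_ERROR",
    -5: "BZ_DATA_ERROR_MAGIC",
    -6: "BZ_IO_ERROR",
    -7: "BZ_UNEXPECTED_EOF",
    -8: "BZ_OUTBUFF_FULL",
    -9: "BZ_CONFIG_ERROR",
}


def describe_bz_error(code):
    """Return the symbolic bzlib name for a numeric return code."""
    return BZ_RETURN_CODES.get(code, f"unknown bzlib code {code}")


print(describe_bz_error(-7))
```

Under that assumption, -7 is BZ_UNEXPECTED_EOF: the library believed the compressed stream ended before the logical end of the data was reached.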

mboeringa commented 5 years ago

Another option for Windows 10 is of course Windows Subsystem for Linux - that should "just work" if you're running Windows 10.

I run Ubuntu 18.04.2 in Oracle Virtualbox 6.0 on a Windows 10 host, with PostgreSQL 11.2 / PostGIS 2.5.1 and osm2pgsql 0.96, and connect from the Windows 10 host to the Virtualbox PostgreSQL instance via the Windows ODBC drivers for PostgreSQL.

Has been running fine for the last two years, even going through multiple upgrades (e.g. Ubuntu, PostgreSQL, PostGIS).

I actually installed the Virtualbox instance on an external SSD connected via USB 3.1, and plug it into my laptop or desktop depending on what I want to do. It may not be the highest performing solution, but I still managed to load the whole of Europe in something like 28 hours or so, which is good enough for my purposes.

mmd-osm commented 4 years ago

I tried the artifact generated by Appveyor (https://ci.appveyor.com/project/openstreetmap/osm2pgsql/builds/29393880/job/iug627ib07u3ejxw), and ran an osm.bz2 import using Wine (Windows emulator) on Linux. Processing gets stuck very early during XML parsing with the libbz2.dll included in the artifact.

wine ./osm2pgsql.exe  ... saarland.osm.bz2
osm2pgsql version 1.2.0 (64 bit id space)

Allocating memory for sparse node cache
Node-cache: cache=800MB, maxblocks=12800*65536, allocation method=1
Using built-in tag processing pipeline
Using projection SRS 3857 (Spherical Mercator)
Setting up table: planet_osm_point
Setting up table: planet_osm_line
Setting up table: planet_osm_polygon
Setting up table: planet_osm_roads

Reading in file: saarland.osm.bz2
Using XML parser.
0047:err:ntdll:RtlpWaitForCriticalSection section 0x7f71b8015d40 "?" wait timed out in thread 0047, blocked by 0000, retrying (60 sec)
0047:err:ntdll:RtlpWaitForCriticalSection section 0x7f71b8015d40 "?" wait timed out in thread 0047, blocked by 0000, retrying (60 sec)

In a second step I replaced libbz2.dll with version 1.0.8.0 from https://github.com/philr/bzip2-windows/releases - this time processing was successful:

Reading in file: saarland.osm.bz2
Using XML parser.
Processing: Node(1214k 404.9k/s) Way(157k 52.33k/s) Relation(0 0.00/s)  parse time: 6s
Node stats: total(1214809), max(1992650611) in 3s
Way stats: total(188736), max(188645882) in 3s
Relation stats: total(1924), max(2545512) in 0s
Sorting data and creating indexes for planet_osm_point
node cache: stored: 1214809(100.00%), storage efficiency: 50.00% (dense blocks: 0, sparse nodes: 1214809), hit rate: 100.00%
Sorting data and creating indexes for planet_osm_line
Sorting data and creating indexes for planet_osm_polygon
Sorting data and creating indexes for planet_osm_roads
Copying planet_osm_roads to cluster by geometry finished
Creating geometry index on planet_osm_roads
Creating indexes on planet_osm_roads finished

I think it might be worth trying a different (newer) libbz2.dll on Windows as well.

lonvia commented 4 years ago

@mmd-osm This looks promising. Could you make a PR that changes the AppVeyor script to use this library? I see that there are also development headers provided, so we should use those too. You can test if the setup works by adding another test to tests/test-output-pgsql.cpp that accepts liechtenstein in osm.bz2 format. I was able to trigger this Windows bug with it last time I tried.

mmd-osm commented 4 years ago

(just a quick update, a real Win 10 installation behaves exactly as in the wine test above.)