Esri / arcgis-osm-editor

ArcGIS Editor for OpenStreetMap is a toolset for GIS users to access and contribute to OpenStreetMap through their Desktop or Server environment.
Apache License 2.0

Out of memory error with "Download OSM Data (XAPI)" tool #160

Open mboeringa opened 7 years ago

mboeringa commented 7 years ago

Hi Thomas,

While scaling up some tests and using the Download OSM Data (XAPI) tool to download a large custom extent of the centre of Paris, I noticed it failed with an "Out of memory" error; see the screenshot below:

download_osm_data_xapi-out_of_memory_error

As I still had plenty of free RAM, and disk space as well, I decided to try the exact same download again, but this time while monitoring the memory usage of ArcMap.

One thing was immediately obvious: the tool did indeed attempt to hold all temporary data in memory. I saw memory usage rise slowly but inevitably from an initial 100 MB or so to close to 1 GB, at which point this second run apparently finished successfully:

download_osm_data_xapi-successful_run
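
For reference, a minimal sketch of how a download could be written to disk in chunks rather than held in memory. This is not the tool's current code: the function names `copy_stream` and `download_to_file`, the chunk size and the time-out value are all hypothetical, and it is written for Python 3, which may not match the Python version the tool runs under.

```python
import shutil
import urllib.request

CHUNK_SIZE = 1024 * 1024  # assumed chunk size: copy 1 MiB at a time


def copy_stream(source, target_path, chunk_size=CHUNK_SIZE):
    """Copy a binary file-like object to disk in fixed-size chunks."""
    with open(target_path, "wb") as target:
        # copyfileobj reads and writes chunk by chunk, so only one
        # chunk of the payload is held in memory at any time.
        shutil.copyfileobj(source, target, chunk_size)


def download_to_file(url, target_path, timeout=300):
    """Stream an HTTP response straight to a file (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        copy_stream(response, target_path)
```

With this approach memory use should stay roughly constant regardless of the size of the extent, since the response body is never assembled as one string.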

Now I have seen it finish successfully before, and we discussed this as well, but with large extents a successful tool run does not necessarily mean a successful download. I opened the file in Visual Studio, as I have done before (it is the only tool I have that can open XML files this big), and noticed at the end of the file the familiar remark about the query time-out, indicating that the file was incomplete even though the download tool had completed.

I nonetheless decided to try importing it using the OSM File Loader (Load only) tool, primarily to test the commit associated with issue https://github.com/Esri/arcgis-osm-editor/issues/151, which implemented some code to report the remark element.

That seemed to work: the remark was detected and reported at the beginning of the process:

download_osm_data_xapi-invalid_xml

Now there are just two questions raised by this new issue:

"The OSM File Loader (Load only) tool has detected a remark element in the osm XML file you selected. This usually indicates a failure to download a complete osm XML file, meaning nodes, ways or relations may be missing in the downloaded geographic extent. Such errors may be caused by query time-outs of the XAPI server on large extents.

Remark element: ELEMENT"

The problem with the current implementation is that the reported remark element may well be mistaken for an error in the OSM File Loader (Load only) tool itself, while the actual issue happened during the download, before the import tool was run!
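
To illustrate what I mean, here is a rough sketch of a check that the download step could run on the file it has just written, so that the incomplete download is reported where it happens. The function name `find_remark` is hypothetical and this is not code from the repository; it only assumes that the remark appears as a `remark` element inside the osm XML, as described above.

```python
import xml.etree.ElementTree as ET


def find_remark(osm_file):
    """Return the text of the first <remark> element, or None if absent.

    osm_file may be a file path or a binary file-like object.
    """
    # iterparse walks the file incrementally instead of loading it whole.
    for _event, element in ET.iterparse(osm_file, events=("end",)):
        if element.tag == "remark":
            return (element.text or "").strip()
        # Drop the attributes and children of elements we have already
        # inspected, to keep memory use low on very large files.
        element.clear()
    return None
```

Since the remark sits at the end of the file, the whole file is still scanned once. If the file is cut off in the middle, so that it is no longer well-formed XML, `iterparse` raises `ParseError` instead, which would also be worth reporting as a failed download.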

mboeringa commented 7 years ago

Hmm, I noticed another thing: because of the errors, I have been reducing the size of the extent to download. However, even after reducing it three times, I still get query time-outs.

Interestingly though, if I use QGIS's almost equivalent "Download Data" menu option, and download an extent even larger than the smallest one that failed in ArcGIS, I do get a valid osm XML file. The file is also downloaded much faster.

I double-checked the headers of both XML files, and although the exact server is not listed there, both of course use Overpass API 0.6, so there really doesn't seem to be a difference here:

ArcGIS: download_osm_data_arcgis

QGIS: download_osm_data_qgis

Is there some inefficiency in the calls that ArcGIS makes to download from the Overpass API that causes the slower download and the query time-outs?...

ThomasEmge commented 7 years ago

The download request is happening right here https://github.com/Esri/arcgis-osm-editor/blob/master/src/data/download_using_xapi.py#L61

mboeringa commented 7 years ago

> The download request is happening right here https://github.com/Esri/arcgis-osm-editor/blob/master/src/data/download_using_xapi.py#L61

That seems straightforward enough... I am currently searching the QGIS repository to see if I can find the equivalent call there.

mboeringa commented 7 years ago

The dialog's interface seems to be coded here: https://github.com/qgis/QGIS/blob/bb9e276d591594821e2368f6ca077850baf1220b/src/app/openstreetmap/qgsosmdownloaddialog.cpp

It references the

#include "qgsosmdownload.h"

module that seems to do the actual work of the download. It is here:

https://github.com/qgis/QGIS/blob/8ac1460d0e760b288ca20e88529737c3b5cde137/python/analysis/openstreetmap/qgsosmdownload.sip

If I interpret the code correctly, it is using Overpass Query Language to fetch the extent.

So there does seem to be a difference in the exact way the Overpass API is called, and QGIS's method seems more efficient? I do think I have seen references to Overpass Query Language also being the "new" preferred way to access Overpass API...

It might be an idea to switch to Overpass Query Language as well?

mmd-osm commented 6 years ago

> It might be an idea to switch to Overpass Query Language as well?

Right, this is a highly recommended step. XAPI is totally outdated these days and only supported by a compatibility layer, which translates XAPI into Overpass QL. The following wiki page has some details on how to migrate to Overpass QL. Also, your code doesn't seem to send an appropriate User-Agent HTTP header, which is highly recommended as well.

https://wiki.openstreetmap.org/wiki/Overpass_API/XAPI_Compatibility_Layer#Migrating_from_XAPI_Compatibility_Layer_to_Overpass_XML_.2F_QL
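
As a starting point, a minimal sketch of what a bounding-box request in Overpass QL with a User-Agent header might look like. Everything named here is an assumption for illustration: the endpoint URL, the User-Agent string, the time-out value and the helper names `build_bbox_query` and `build_request` are not taken from the repository, and the query is only one simple way to fetch an extent (nodes in the box, plus the ways and relations that reference them). It is written for Python 3.

```python
import urllib.parse
import urllib.request

# Assumed values: pick the Overpass instance and identifier you actually use.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"
USER_AGENT = "example-osm-downloader/0.1 (contact@example.org)"


def build_bbox_query(south, west, north, east, timeout_s=180):
    """Build an Overpass QL query for one bounding box.

    Overpass QL expects the box as (south, west, north, east).
    """
    bbox = "{},{},{},{}".format(south, west, north, east)
    # node(bbox) selects the nodes in the box; "<" recurses up to the
    # ways and relations that use them; "out meta" includes version,
    # timestamp and user metadata in the output.
    return "[out:xml][timeout:{}];(node({});<;);out meta;".format(timeout_s, bbox)


def build_request(query):
    """Wrap the query in a POST request that identifies the client."""
    body = urllib.parse.urlencode({"data": query}).encode("utf-8")
    return urllib.request.Request(
        OVERPASS_URL, data=body, headers={"User-Agent": USER_AGENT}
    )
```

The request returned by `build_request` can then be passed to `urllib.request.urlopen`. The `[timeout:...]` setting lets the client ask for a longer server-side time limit on large extents, although the server may still refuse or abort heavy queries.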