Open mboeringa opened 7 years ago
Hmm, I noticed another thing: Due to the errors, I have been reducing the size of the extent to download. However, after three steps in reducing the size of the extent, I still get query time-outs.
Interestingly though, if I use QGIS's almost equivalent "Download Data" menu option, and download an extent even larger than the smallest one that failed in ArcGIS, I do get a valid osm XML file. The file is also downloaded much faster.
I double-checked the XML files' headers, and although the exact server is not listed there, both of course use Overpass API 0.6, so there really doesn't seem to be a difference here:
ArcGIS:
QGIS:
Is there some inefficiency in the calls that ArcGIS makes to download from the Overpass API that causes the slower download and the query time-outs?
The download request is happening right here https://github.com/Esri/arcgis-osm-editor/blob/master/src/data/download_using_xapi.py#L61
That seems straightforward enough... Currently searching in the QGIS repository if I can find the equivalent call there.
The dialog's interface seems to be coded here: https://github.com/qgis/QGIS/blob/bb9e276d591594821e2368f6ca077850baf1220b/src/app/openstreetmap/qgsosmdownloaddialog.cpp
It references the
#include "qgsosmdownload.h"
module that seems to do the actual work of the download. It is here:
If I interpret the code well, it is using Overpass Query Language to fetch the extent.
So there does seem to be a difference in the exact way the Overpass API is called, and QGIS's method seems more efficient? I do think I have seen references to Overpass Query Language also being the "new" preferred way to access Overpass API...
It might be an idea to switch to Overpass Query Language as well?
Right, this is a highly recommended step. XAPI is totally outdated these days and only supported by a compatibility layer, which translates XAPI into Overpass QL. The following wiki page has some details on how to migrate to Overpass QL. Also, your code doesn't seem to send an appropriate User-Agent HTTP header, which is highly recommended as well.
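To make the suggestion concrete, here is a minimal sketch of what a bounding-box request in Overpass QL with an explicit User-Agent could look like. It is not the tool's actual code: the function names, the endpoint URL and the User-Agent string are placeholders chosen for illustration, and it uses Python 3's `urllib` rather than whatever the ArcGIS tool currently uses. The query follows the pattern the Overpass documentation gives as the equivalent of a plain map call.

```python
import urllib.parse
import urllib.request

# Assumption: a public Overpass endpoint; substitute the server you actually use.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_bbox_query(south, west, north, east, timeout_s=180):
    """Return an Overpass QL query for all data inside a bounding box.

    Overpass QL expects the bounding box as south,west,north,east.
    "(node(bbox);<;);" selects the nodes in the box plus the ways and
    relations that refer to them.
    """
    bbox = "{},{},{},{}".format(south, west, north, east)
    return "[out:xml][timeout:{}];(node({});<;);out meta;".format(timeout_s, bbox)


def build_request(query, user_agent):
    """Wrap the query in a POST request that identifies the client."""
    data = urllib.parse.urlencode({"data": query}).encode("utf-8")
    return urllib.request.Request(
        OVERPASS_URL, data=data, headers={"User-Agent": user_agent}
    )


# Hypothetical usage (the User-Agent value is a placeholder):
# query = build_bbox_query(48.85, 2.33, 48.87, 2.36)
# request = build_request(query, "my-osm-tool/0.1 (contact@example.org)")
# response = urllib.request.urlopen(request)
```

The `[timeout:...]` setting is only a request to the server; a large extent can still end in a time-out remark, so the result should be checked either way.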
Hi Thomas,
While scaling up some tests and using the Download OSM Data (XAPI) tool to download a large custom extent of the centre of Paris, I noticed it failed with an "Out of memory" error, see screenshot below:
As I had plenty of RAM free left, and disk space as well, I decided to try the exact same download again, but this time monitoring the memory usage of ArcMap.
One thing that was immediately obvious was that the tool indeed attempted to store all temporary data in memory. I slowly but inevitably saw memory usage rising from an initial 100 MB or so to close to 1 GB, at which point this second run apparently finished successfully:
Now I have seen it finish successfully before, and we discussed this as well, but with large extents, tool success is not necessarily equal to a successful download. I opened the file in Visual Studio, as I have done before and which is the only tool I have capable of opening XML files this big, and noticed at the end of the file the familiar remark of the query time-out, indicating that despite completion of the download tool, the file was incomplete.
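As an aside, the same check can be done without opening the whole file in an editor. The sketch below is only an illustration, not part of the toolbox: `find_trailing_remark` is a made-up helper name, and it assumes the remark is written as a plain `<remark>...</remark>` element within the last 64 KB of the file.

```python
import os
import re

# Matches a remark element and captures its text.
REMARK_RE = re.compile(rb"<remark>(.*?)</remark>", re.DOTALL)


def find_trailing_remark(path, tail_bytes=65536):
    """Return the text of a <remark> element near the end of the file, or None."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Only the tail is read, so multi-gigabyte files are not a problem.
        f.seek(max(0, size - tail_bytes))
        tail = f.read()
    match = REMARK_RE.search(tail)
    if match is None:
        return None
    return match.group(1).decode("utf-8", "replace").strip()
```

A return value of `None` only means that no remark was found in the tail that was inspected; it is not proof that the download is complete.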
I nonetheless decided to try to import it using the OSM File Loader (Load only) tool, primarily as a test for the commit associated with issue https://github.com/Esri/arcgis-osm-editor/issues/151, that implemented some code to report the remark element.
That seemed to work, the remark was detected and reported at the beginning of the process:
Now there are just two questions raised by this new issue:
Would it be possible to cache the downloaded data to disk instead of RAM, at least after x MB has been downloaded, so as to avoid a potential "Out of memory" error?
Although the partial download of XML is now properly reported in the OSM File Loader (Load only) tool, I think it might be better to make the message much more explicit and descriptive. Instead of only literally reporting the query time-out message, it should give some context, like:
"The OSM File Loader (Load only) tool has detected a remark element in the osm XML file you selected. This usually indicates a failure to download a complete osm XML file, meaning nodes, ways or relations may be missing in the downloaded geographic extent. Such errors may be caused by query time-outs of the XAPI server on large extents.
Remark element: ELEMENT"
The problem with the current implementation is that the reported remark element may well be mistaken for an error in the OSM File Loader (Load only) tool itself, while the actual issue happened during the download, before the import tool was run!
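Regarding the first question, writing the response to disk in fixed-size chunks keeps memory usage flat regardless of the size of the extent. The sketch below only illustrates that idea and is not how the tool is currently written: `save_stream` is a hypothetical name, and `response` is assumed to be any file-like object with a `read(n)` method, such as the object returned by Python 3's `urllib.request.urlopen`.

```python
def save_stream(response, out_path, chunk_size=1024 * 1024):
    """Copy a file-like HTTP response to out_path chunk by chunk.

    At most chunk_size bytes are held in memory at a time.
    Returns the total number of bytes written.
    """
    written = 0
    with open(out_path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # an empty read signals the end of the stream
                break
            out.write(chunk)
            written += len(chunk)
    return written
```

With a 1 MB chunk size the download would need roughly 1 MB of buffer instead of growing towards 1 GB, at the cost of one extra file on disk.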