mocnik-science / osm-python-tools

A library to access OpenStreetMap related services
GNU General Public License v3.0

Significantly reduced performance since version 0.3.3 #71

Open sb-stefan opened 2 years ago

sb-stefan commented 2 years ago

I recently updated from version 0.2.8 to 0.3.5 and noticed a significant increase in query time for Overpass queries. Testing several longer queries showed them taking 100-150x as long as before. I dug into the source code, and the cause seems to be the ApiResult class (in api.py) and how it was rewritten in version 0.3.3. More specifically, the way BeautifulSoup and soupHistory are now handled appears to slow OSMPythonTools down significantly.

For reference:

new, slow version (>=0.3.3)

class ApiResult(Element):
    def __init__(self, xml, queryString, params, cacheMetadata=None, shallow=False, history=False):
        self._isValid = (xml != {} and xml is not None)
        self._xml = xml
        self._soup2 = None
        soup = None
        soupHistory = None
        if self._isValid:
            self._soup2 = BeautifulSoup(xml, 'xml').find('osm')
            soupHistory = self._soup2.find_all(['node', 'way', 'relation'])
            if len(soupHistory) > 0:
                soup = soupHistory[-1]
        super().__init__(cacheMetadata, soup=soup, soupHistory=soupHistory if history else None, shallow=shallow)
        self._queryString = queryString
        self._params = params

older, more efficient version (<0.3.3)

class ApiResult(Element):
    def __init__(self, xml, queryString, shallow=False):
        self._isValid = (xml != {} and xml is not None)
        self._xml = xml
        self._soup = None
        soupElement = None
        if self._isValid:
            self._soup = BeautifulSoup(xml, 'xml')
            if len(self._soup.find_all('node')) > 0:
                soupElement = self._soup.node
            if len(self._soup.find_all('way')) > 0:
                soupElement = self._soup.way
            if len(self._soup.find_all('relation')) > 0:
                soupElement = self._soup.relation
        super().__init__(soup=soupElement, shallow=shallow)
        self._queryString = queryString
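
A slowdown factor like the 100-150x mentioned above could be measured with a small timing harness along these lines. This is only a sketch: `time_call`, `slowdown_factor`, and the two stand-in workloads are hypothetical names that do not come from OSMPythonTools, and the stand-ins would need to be replaced by the same Overpass query run once under each installed version.

```python
import statistics
import time


def time_call(func, repeats=5):
    """Return the median wall-clock time (seconds) of calling func() `repeats` times."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def slowdown_factor(baseline_seconds, candidate_seconds):
    """Return how many times slower the candidate is than the baseline."""
    return candidate_seconds / baseline_seconds


# Hypothetical stand-in workloads (assumption: not real library calls).
# Replace each with the identical query executed under the old and the
# new library version, e.g. in two separate virtual environments.
def fast_stand_in():
    sum(range(1_000))


def slow_stand_in():
    sum(range(100_000))


if __name__ == "__main__":
    fast = time_call(fast_stand_in)
    slow = time_call(slow_stand_in)
    print(f"slowdown: {slowdown_factor(fast, slow):.1f}x")
```

Taking the median over several repeats should make the comparison less sensitive to one-off delays. If query results are cached between runs, the measurement would mostly reflect how long the result object takes to build rather than network time, which is the part suspected here.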