Less memory-consuming xml parsing

DinoTools / python-overpy

Python Wrapper to access the Overpass API

https://python-overpy.readthedocs.io/

MIT License


Less memory-consuming xml parsing #15

Closed dbuse closed 9 years ago

dbuse commented 9 years ago

Currently the whole XML result is first parsed into an xml.etree.ElementTree and then processed to create overpy structures. While this is perfectly fine for small amounts of data, larger files or requests consume a lot of memory that is not freed after the overpy result is constructed.

A SAX-style parser could reduce the memory footprint and both overpy's architecture and osm_xml's structure would easily support such a parser.
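To illustrate the idea, here is a minimal sketch of a SAX-style handler built on the standard library's `xml.sax`. It is not overpy's actual implementation: the names `OsmNodeHandler`, `parse_nodes_sax`, and `SAMPLE` are hypothetical, the sample document is made up, and only `node` elements with their `tag` children are handled.

```python
import io
import xml.sax

# Made-up sample in OSM XML layout, for illustration only.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6">
  <node id="1" lat="50.0" lon="7.0">
    <tag k="amenity" v="cafe"/>
  </node>
  <node id="2" lat="51.5" lon="8.25"/>
</osm>"""


class OsmNodeHandler(xml.sax.ContentHandler):
    """Hypothetical handler: builds one dict per <node> as events arrive."""

    def __init__(self):
        super().__init__()
        self.nodes = []
        self._current = None  # node currently being assembled, if any

    def startElement(self, name, attrs):
        if name == "node":
            self._current = {
                "id": int(attrs["id"]),
                "lat": float(attrs["lat"]),
                "lon": float(attrs["lon"]),
                "tags": {},
            }
        elif name == "tag" and self._current is not None:
            # <tag> children belong to the node that is currently open.
            self._current["tags"][attrs["k"]] = attrs["v"]

    def endElement(self, name):
        if name == "node" and self._current is not None:
            self.nodes.append(self._current)
            self._current = None


def parse_nodes_sax(xml_bytes):
    """Parse OSM XML bytes without building a full element tree."""
    handler = OsmNodeHandler()
    xml.sax.parse(io.BytesIO(xml_bytes), handler)
    return handler.nodes


nodes = parse_nodes_sax(SAMPLE)
```

Because the handler only ever holds the node it is currently assembling plus the finished results, no intermediate tree of the whole response is kept in memory.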

phibos commented 9 years ago

Looks like some people have to work with very large datasets. So using the SAX parser might be a good idea. I have scheduled this feature for the next version.

domlysz commented 8 years ago

It's also possible to use the iterparse function of the ElementTree module: memory footprint and run speed are similar to a SAX parser, but it's less verbose.
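As a rough sketch of the iterparse approach (not the code from the linked commit, and not overpy's API), the generator below yields one dict per `node` and clears the root after each top-level element. The names `iter_nodes` and `SAMPLE` are hypothetical and the sample document is made up.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample in OSM XML layout, for illustration only.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6">
  <node id="1" lat="50.0" lon="7.0">
    <tag k="amenity" v="cafe"/>
  </node>
  <node id="2" lat="51.5" lon="8.25"/>
</osm>"""


def iter_nodes(stream):
    """Hypothetical helper: yield one dict per <node> while parsing."""
    context = ET.iterparse(stream, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event != "end" or elem.tag not in ("node", "way", "relation"):
            continue
        if elem.tag == "node":
            yield {
                "id": int(elem.attrib["id"]),
                "lat": float(elem.attrib["lat"]),
                "lon": float(elem.attrib["lon"]),
                "tags": {t.attrib["k"]: t.attrib["v"] for t in elem.findall("tag")},
            }
        # Drop already-processed children so the tree does not keep growing.
        root.clear()


nodes = list(iter_nodes(io.BytesIO(SAMPLE)))
```

Compared with a SAX handler, the element passed on each `end` event is already complete, so no per-element state has to be tracked by hand.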

https://github.com/domlysz/python-overpy/commit/a18ae32b20f87959737c27857782d81ae872a7cd?diff=unified

It may be a good choice if you want to maintain only one parser. I can make a pull request if you're interested.