azavea / osmesa

OSMesa is an OpenStreetMap processing stack based on GeoTrellis and Apache Spark
Apache License 2.0

Use a SAX parser for change/changeset XML data #119

Closed by mojodna 5 years ago

mojodna commented 5 years ago

We've run into at least one case where a change XML file is too large (46 MB compressed) to be processed effectively with scala.xml, which builds the whole document tree in memory. Conversion to case classes should be done using a SAX parser instead.

Originally posted by @mojodna in https://github.com/azavea/osmesa/pull/106

mojodna commented 5 years ago

It doesn't necessarily need to be a full SAX parser. When events for XML nodes occur within create, modify, or delete elements, the individual bodies (<node id=...>...</node>, <way id=...>...</way>, <relation id=...>...</relation>) can be passed to scala.xml for processing. Each of these is small; the only concern is that there may be a lot of them.

The XML format is OsmChange.
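
A minimal sketch of the hybrid approach described above, not the implementation that ended up in OSMesa. It assumes the JDK's StAX API (javax.xml.stream) as the event source in place of a SAX handler, and that the scala-xml module is on the classpath. The names OsmChangeStream and elements are hypothetical. Only one node, way, or relation subtree is held in memory at a time; each one is re-serialized to a small string and handed to scala.xml.

```scala
import java.io.{InputStream, StringWriter}
import javax.xml.stream.{XMLInputFactory, XMLOutputFactory}
import javax.xml.stream.events.XMLEvent
import scala.xml.{Elem, XML}

// Hypothetical helper: streams an OsmChange document and yields one
// (action, element) pair at a time instead of loading the whole file.
object OsmChangeStream {
  private val Actions  = Set("create", "modify", "delete")
  private val Elements = Set("node", "way", "relation")

  def elements(in: InputStream): Iterator[(String, Elem)] = {
    val reader        = XMLInputFactory.newInstance().createXMLEventReader(in)
    val outputFactory = XMLOutputFactory.newInstance()

    new Iterator[(String, Elem)] {
      private var action: Option[String]          = None // enclosing create/modify/delete
      private var pending: Option[(String, Elem)] = None // next pair to hand out

      // Copy events from `start` through its matching end tag into a string,
      // then let scala.xml parse that small fragment.
      private def readSubtree(start: XMLEvent): Elem = {
        val buf    = new StringWriter()
        val writer = outputFactory.createXMLEventWriter(buf)
        writer.add(start)
        var depth = 1
        while (depth > 0) {
          val ev = reader.nextEvent()
          if (ev.isStartElement) depth += 1
          else if (ev.isEndElement) depth -= 1
          writer.add(ev)
        }
        writer.flush()
        writer.close()
        XML.loadString(buf.toString)
      }

      // Consume events until the next node/way/relation inside an action block.
      private def advance(): Unit =
        while (pending.isEmpty && reader.hasNext) {
          val ev = reader.nextEvent()
          if (ev.isStartElement) {
            val name = ev.asStartElement.getName.getLocalPart
            if (Actions(name)) action = Some(name)
            else if (Elements(name))
              action.foreach(a => pending = Some(a -> readSubtree(ev)))
          } else if (ev.isEndElement &&
                     Actions(ev.asEndElement.getName.getLocalPart)) {
            action = None
          }
        }

      def hasNext: Boolean = { advance(); pending.isDefined }

      def next(): (String, Elem) = {
        advance()
        val result = pending.getOrElse(throw new NoSuchElementException("no more elements"))
        pending = None
        result
      }
    }
  }
}
```

Because the result is a lazy Iterator, a caller can map each Elem to a case class and discard it before the next one is read, for example OsmChangeStream.elements(new java.util.zip.GZIPInputStream(stream)).map { case (action, elem) => ... } for a gzip-compressed input.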