azavea / osmesa

OSMesa is an OpenStreetMap processing stack based on GeoTrellis and Apache Spark
Apache License 2.0

User footprint rendering improvements #97

Closed mojodna closed 5 years ago

mojodna commented 5 years ago

Namely, footprint rendering now actually runs to completion.

When generating footprints for large numbers of users, a very large number of intermediate tiles is produced. Prior to SparseIntTile, each tile was backed by a dense array of rows * cols cells, even though it was largely empty. Spark shuffle / spill serialization compressed the Array[Byte] representations of these backing arrays extremely well. However, Spark reliably OOM'd while deserializing them and allocating the large backing arrays.

Switching to Map- and LongMap-backed IntTiles dramatically reduces memory usage for this use case, where tiles are largely empty: the Detroit sample went from a 2.7GB in-memory representation to 39MB. An additional benefit is that the backing Maps can be passed directly to Spark without an intermediate deserialization step.
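To illustrate the idea, here is a minimal sketch of a LongMap-backed tile. It is not the SparseIntTile from this PR: the class name SparseTileSketch, its methods, and the choice of 0 as the "no data" value are all assumptions made for the example, and it deliberately avoids the GeoTrellis Tile API. It only shows why storage scales with the number of populated cells rather than with rows * cols.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a sparse, LongMap-backed integer tile.
// Illustrative only; names and layout are not taken from OSMesa.
final case class SparseTileSketch(
  cols: Int,
  rows: Int,
  cells: mutable.LongMap[Int] = mutable.LongMap.empty[Int]
) {
  // Flatten (col, row) into a single Long key so large tiles cannot overflow Int.
  private def key(col: Int, row: Int): Long = row.toLong * cols + col

  // Unset cells read as 0, which this sketch treats as "no data".
  def get(col: Int, row: Int): Int = cells.getOrElse(key(col, row), 0)

  def set(col: Int, row: Int, value: Int): Unit = {
    require(col >= 0 && col < cols && row >= 0 && row < rows, "cell out of bounds")
    if (value == 0) cells.remove(key(col, row))
    else cells.update(key(col, row), value)
  }

  // Entries actually stored, versus the cols * rows cells a dense array would allocate.
  def storedCells: Int = cells.size
  def denseCells: Long = cols.toLong * rows

  // Combine two same-sized tiles by summing cells; only populated cells are visited.
  def merge(other: SparseTileSketch): SparseTileSketch = {
    require(cols == other.cols && rows == other.rows, "tile dimensions differ")
    val merged = cells.clone()
    other.cells.foreach { case (k, v) => merged.update(k, merged.getOrElse(k, 0) + v) }
    SparseTileSketch(cols, rows, merged)
  }
}
```

With this sketch, a 4096 x 4096 tile holding two populated cells stores two map entries, where a dense backing array would allocate 16,777,216 cells regardless of content. Merging is likewise proportional to the number of populated cells, which is the property that matters when most intermediate tiles are nearly empty.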