PublicMapping / districtbuilder

DistrictBuilder is web-based, open source software for collaborative redistricting.
https://districtbuilder.org
Apache License 2.0
64 stars 8 forks source link

Change serialization to use topobuf #1142

Closed maurizi closed 2 years ago

maurizi commented 2 years ago

Context

Work on the concurrency ADR showed that we can get significant improvements in file size by using a Protobuffer based file format - I used an outdated version of geobuf to do this - we'll be forking that in #1141 as topobuf

Notes

We changed from JSON -> the v8 serialization format in https://github.com/PublicMapping/districtbuilder/pull/1099, which should be able to serve as a template for this PR

AC:

maurizi commented 2 years ago

Well this was an all-around disappointment.

Testing in staging showed using topobuf instead of v8 deserialization consumes more memory: (left side is using v8, right is using pbf)

image.png

This is probably something we could fix in the topobuf library at the expense of some CPU cycles... but we're slower than v8 deserialization already 😞

topobuf:

districtbuilder-server-1  | cache-miss San Bernadino: 227.267ms
districtbuilder-server-1  | cache-miss Dane County: 634.601ms
districtbuilder-server-1  | cache-miss Delaware: 826.569ms
districtbuilder-server-1  | cache-miss New Jersey: 3.451s
districtbuilder-server-1  | cache-miss Wisconsin: 5.165s
districtbuilder-server-1  | cache-miss Pennsylvania: 14.393s
districtbuilder-server-1  | cache-miss Illinois: 10.048s

v8:

districtbuilder-server-1  | cache-miss San Bernadino: 137.446ms
districtbuilder-server-1  | cache-miss Dane County: 327.031ms
districtbuilder-server-1  | cache-miss Delaware: 597.973ms
districtbuilder-server-1  | cache-miss New Jersey: 3.232s
districtbuilder-server-1  | cache-miss Wisconsin: 5.475s
districtbuilder-server-1  | cache-miss Pennsylvania: 10.225s
districtbuilder-server-1  | cache-miss Illinois: 8.007s

File size is much much smaller with topobuf, but that is about the only thing it has going for us. Probably worth looking into topobuf further for potentially sending TopoJSON to the browser due to the much smaller file size, but ultimately I think we should stick with the current serialization method.