mapbox / geobuf

A compact binary encoding for geographic data.
ISC License

feature: encode null property values more efficiently #95

Closed ivorblockley closed 6 years ago

ivorblockley commented 6 years ago

geobuf currently encodes any property with a null value as a json_value: "[null]" in the protobuf Value messages.

Since Value uses a oneof value_type, a Value message with no value_type set is valid. Using an unset value_type to encode null property values is a simple, logical optimisation that reduces the size of serialized geobuf messages containing such properties.
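To make the per-value saving concrete, here is a rough Python sketch that hand-encodes the two representations using the standard protobuf wire format. The field numbers (6 for json_value, 13 for the repeated values field) and the JSON text "null" are assumptions made for illustration and should be checked against geobuf.proto; the helper names are hypothetical and are not part of geobuf. It does not attempt to reproduce the fixture sizes quoted in this issue.

```python
# Minimal protobuf wire-format sketch (no third-party libraries).
# ASSUMPTIONS: field numbers below are illustrative; verify against geobuf.proto.
JSON_VALUE_FIELD = 6   # assumed number of Value.json_value
VALUES_FIELD = 13      # assumed number of the repeated Value field


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode a length-delimited field (wire type 2): tag, length, payload."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def null_as_json(json_text: str = "null") -> bytes:
    """A null stored as a Value whose json_value holds JSON text (assumed text)."""
    value = length_delimited(JSON_VALUE_FIELD, json_text.encode("utf-8"))
    return length_delimited(VALUES_FIELD, value)


def null_as_unset() -> bytes:
    """A null stored as an empty Value, i.e. no member of the oneof is set."""
    return length_delimited(VALUES_FIELD, b"")


if __name__ == "__main__":
    print(len(null_as_json()))   # 8
    print(len(null_as_unset()))  # 2
```

Under these assumptions an unset Value costs only its tag and a zero length byte, so each null value saves the bytes of the inner json_value field.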

With this change, the encoded size of test/fixtures/issue90.json, which contains 4 null properties, drops from 117 bytes to 99 bytes. The optimisation can make a difference in real-world datasets: for example, the Invasive_Species.geojson geobuf (using the default precision=6) shrinks from 266877 bytes to 218505 bytes. Admittedly, after gzip compression the saving is modest, from 34862 bytes to 34119 bytes, because this dataset is very amenable to gzip compression.

mourner commented 6 years ago

Sorry for the late response, this looks great!