mapbox / geobuf

A compact binary encoding for geographic data.
ISC License

Geobuf

Geobuf is a compact binary encoding for geographic data.

Geobuf provides nearly lossless compression of GeoJSON data into protocol buffers. Advantages over using GeoJSON alone:

The encoding format also potentially allows:

Think of this as an attempt to design a simple, modern Shapefile successor that works seamlessly with GeoJSON. Unlike Mapbox Vector Tiles, it aims for nearly lossless compression of datasets — without tiling, projecting coordinates, flattening geometries or stripping properties.

Note that the encoding schema is not stable yet — it may still change as we get community feedback and discover new ways to improve it.

"Nearly" lossless means coordinates are encoded with precision of 6 digits after the decimal point (about 10cm).

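To get a feel for what 6 digits after the decimal point means in practice, the sketch below rounds a coordinate pair to that precision and checks the rounding error. It only illustrates the effect of limited precision and is not Geobuf's actual encoder; the helper name roundCoord and the sample coordinates are hypothetical, and the meters-per-degree figure is an approximation that holds at the equator.

```javascript
// Hypothetical helper: round one coordinate value to `digits` decimal places.
// Illustrates the precision loss only; it is not Geobuf's encoder.
function roundCoord(value, digits) {
    var factor = Math.pow(10, digits);
    return Math.round(value * factor) / factor;
}

var original = [-122.41941234567, 37.77492987654]; // made-up [lon, lat]
var rounded = original.map(function (v) { return roundCoord(v, 6); });

// Rounding moves each value by at most half a unit in the 6th decimal place.
var maxError = Math.max(
    Math.abs(original[0] - rounded[0]),
    Math.abs(original[1] - rounded[1])
);

// One step in the 6th decimal, converted to meters (approximation, equator only).
var metersPerDegree = 111320;
var stepMeters = 1e-6 * metersPerDegree;

console.log(rounded);
console.log('max rounding error (degrees):', maxError);
console.log('one 6th-decimal step is roughly', stepMeters.toFixed(2), 'm');
```
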
Sample compression sizes

Data            JSON       JSON (gz)  Geobuf    Geobuf (gz)
US zip codes    101.85 MB  26.67 MB   12.24 MB  10.48 MB
Idaho counties  10.92 MB   2.57 MB    1.37 MB   1.17 MB

API

encode

var buffer = geobuf.encode(geojson, new Pbf());

Given a GeoJSON object and a Pbf object to write to, returns the Geobuf data as a Uint8Array of bytes. In Node 4.5.0 or later, you can use Buffer.from to convert the result to a Buffer.
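
The Buffer.from conversion can be sketched as follows. The helper name toNodeBuffer is hypothetical, and the sample bytes are a made-up stand-in for the result of geobuf.encode so that the snippet runs without the library installed.

```javascript
// Hypothetical helper: wrap a Uint8Array (such as the result of geobuf.encode)
// in a Node Buffer that shares the same underlying memory (no copy is made).
function toNodeBuffer(bytes) {
    return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
}

// Stand-in for: var bytes = geobuf.encode(geojson, new Pbf());
var bytes = new Uint8Array([10, 2, 8, 1]);
var buffer = toNodeBuffer(bytes);

console.log(Buffer.isBuffer(buffer), buffer.length);
```

Passing the byteOffset and byteLength matters when the Uint8Array is a view into a larger ArrayBuffer; without them the Buffer would cover the whole underlying allocation.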

decode

var geojson = geobuf.decode(new Pbf(data));

Given a Pbf object containing Geobuf data, returns a GeoJSON object. When loading Geobuf data over XMLHttpRequest, you need to set responseType to arraybuffer.
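
A minimal sketch of loading Geobuf data over XMLHttpRequest might look like this. The helper name loadGeobuf, its callback signature and the data.pbf URL are assumptions made for illustration; the decode step is passed in by the caller so the helper itself does not depend on how geobuf and Pbf are loaded on the page.

```javascript
// Hypothetical helper: fetch a Geobuf file in the browser and hand the raw
// bytes to a caller-supplied decode function.
function loadGeobuf(url, decode, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url);
    xhr.responseType = 'arraybuffer'; // needed to receive binary data
    xhr.onload = function () {
        if (xhr.status < 200 || xhr.status >= 300) {
            return callback(new Error('HTTP ' + xhr.status));
        }
        // xhr.response is an ArrayBuffer; expose it as a byte array.
        callback(null, decode(new Uint8Array(xhr.response)));
    };
    xhr.onerror = function () {
        callback(new Error('Network error'));
    };
    xhr.send();
}

// Usage in a browser with geobuf and Pbf loaded (the URL is a placeholder):
// loadGeobuf('data.pbf', function (bytes) {
//     return geobuf.decode(new Pbf(bytes));
// }, function (err, geojson) {
//     if (err) throw err;
//     console.log(geojson.type);
// });
```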

Install

Node and Browserify:

npm install geobuf

Browser build CDN links:

Building locally:

npm install
npm run build-dev # dist/geobuf-dev.js (development build)
npm run build-min # dist/geobuf.js (minified production build)

Command Line

npm install -g geobuf

Installs these nifty binaries:

json2geobuf data.json > data.pbf
shp2geobuf myshapefile > data.pbf
geobuf2json data.pbf > data.json

Note that for big files, the geobuf2json command can be quite slow. The bottleneck is not the decoding itself but the native JSON.stringify call that turns the decoded object into a string to pipe to stdout. On some files, this step may take 40 times longer than the actual decoding.
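
To check where the time goes for a particular file, the decoding and serialization steps can be timed separately. The sketch below uses a hypothetical timeMs helper and a synthetic object in place of real decoded data, so it runs without Geobuf installed; the numbers it prints say nothing about Geobuf itself.

```javascript
// Hypothetical helper: run fn once and report its result and elapsed milliseconds.
function timeMs(fn) {
    var start = Date.now();
    var result = fn();
    return {result: result, ms: Date.now() - start};
}

// Synthetic stand-in for a decoded GeoJSON object.
var features = [];
for (var i = 0; i < 10000; i++) {
    features.push({
        type: 'Feature',
        properties: {id: i},
        geometry: {type: 'Point', coordinates: [i / 1000, -i / 1000]}
    });
}
var geojson = {type: 'FeatureCollection', features: features};

// With real data, time geobuf.decode(new Pbf(data)) the same way and compare.
var stringify = timeMs(function () { return JSON.stringify(geojson); });
console.log('JSON.stringify: ' + stringify.ms + ' ms, ' +
    stringify.result.length + ' chars');
```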

See Also