topojson / us-atlas

Pre-built TopoJSON from the U.S. Census Bureau.
https://observablehq.com/@d3/u-s-map
ISC License

Use JSONStream instead of native JSON parser for big files #26

Closed callumacrae closed 7 years ago

callumacrae commented 7 years ago

./bin/geomerge < shp/us/zipcodes-unmerged.json > shp/us/zipcodes.json was failing for me, because zipcodes-unmerged.json is 1.4GB and V8 only allows strings up to 256MB.
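
As a rough illustration of the limit described above: Node exposes V8's maximum string length as buffer.constants.MAX_STRING_LENGTH. The sketch below compares it with the approximate input size. likelyExceedsStringLimit is a hypothetical helper, not part of us-atlas, and the exact limit depends on the Node and V8 version in use.

```javascript
const { constants } = require('buffer');

// Hypothetical helper, not part of us-atlas.
// For mostly-ASCII JSON the byte size is close to the number of UTF-16 code
// units the decoded string would need, so a byte size above the limit means
// the file cannot be read into a single string and passed to JSON.parse.
function likelyExceedsStringLimit(byteLength) {
  return byteLength > constants.MAX_STRING_LENGTH;
}

// Approximate size of zipcodes-unmerged.json as reported above (1.4GB).
const inputBytes = 1.4 * 1024 ** 3;

console.log(constants.MAX_STRING_LENGTH); // value depends on the Node/V8 version
console.log(likelyExceedsStringLimit(inputBytes));
```

If the check is true, reading the whole file and calling JSON.parse on it cannot work, whatever the available memory.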

I rewrote bin/geomerge to use a streaming JSON parser instead of the native parser, which avoids loading the entire file into memory. This doesn't solve the problem of the resulting JSON being 1.2GB, but it could solve other issues…

The rewrite removes the spliceIndex functionality, because we no longer have the index. I can add a counter if that functionality is really wanted. Should I add it back?

mbostock commented 7 years ago

Thanks for the contribution. In the last week or two, I’ve rewritten the whole d3-geo / d3-geo-projection / TopoJSON toolchain and adopted ndjson, which allows individual features to be parsed and processed. This should help a lot with large inputs, though I haven’t tested ZCTAs yet. See d3-geo-projection 1.1 and the new topojson organization for details.
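
To illustrate why newline-delimited JSON helps with large inputs, here is a minimal sketch using no external packages. It is not the code used by d3-geo-projection or the topojson tools. parseNdjsonChunks and the sample features are hypothetical. The point is that each line is a complete JSON value, so only one feature has to be held as a string at a time.

```javascript
// Hypothetical helper: turns a sequence of text chunks into parsed features,
// one per newline-delimited line. Chunk boundaries may fall anywhere.
function* parseNdjsonChunks(chunks) {
  let buffered = ''; // trailing partial line carried over between chunks
  for (const chunk of chunks) {
    buffered += chunk;
    const lines = buffered.split('\n');
    buffered = lines.pop(); // last piece is an incomplete line (or '')
    for (const line of lines) {
      if (line.trim() !== '') yield JSON.parse(line);
    }
  }
  // Handle a final line that has no trailing newline.
  if (buffered.trim() !== '') yield JSON.parse(buffered);
}

// Made-up sample data: two features split across arbitrary chunk boundaries.
const chunks = [
  '{"type":"Feature","properties":{"zip":"94',
  '103"},"geometry":null}\n{"type":"Feature",',
  '"properties":{"zip":"10001"},"geometry":null}\n',
];

const zips = [];
for (const feature of parseNdjsonChunks(chunks)) {
  zips.push(feature.properties.zip);
}
console.log(zips);
```

In a real script the chunks would come from a file or stdin stream rather than an array, and each feature would be processed and discarded as it arrives instead of being collected.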