ellenhp / airmail

Lightweight geocoder in pure Rust
https://airmail.rs/
Apache License 2.0

Import TIGER address ranges #4

Open ellenhp opened 5 months ago

ellenhp commented 5 months ago

This is going to be a tough one but I'm going to give it a go.

Edit: I'm going to double-check that it's necessary first though. Like, if I index all of OpenAddresses, do I get TIGER data for free?

riordan commented 5 months ago

if I index all of OpenAddresses, do I get TIGER data for free?

Sadly, no. OpenAddresses provides points/parcels/buildings for addresses, scraped from various open data releases (and a few ESRI parcel servers). If a government doesn't provide that address point data, it doesn't exist in OpenAddresses. Take a look around the map on the OpenAddresses homepage; once you zoom in, you'll see a lot of gaps from places that haven't released address data (e.g. municipalities that have their official GIS rights locked up with an exclusive vendor who sells to data brokers).

TIGER's got more US address coverage than OpenAddresses, since it interpolates address ranges along the street network, which makes it a great fallback for when there isn't OpenAddresses data.

(Admittedly, this is all from like 10 years ago when I briefly worked on Pelias, so take it with a grain of salt.)
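
To make "interpolating address ranges" concrete, here is a minimal sketch in Rust of estimating a position for one house number on a single straight street segment. The struct and field names are hypothetical and are not taken from TIGER or from airmail; real segments are polylines, and a range normally covers only one side of the street.

```rust
/// One side of a street segment with its house-number range.
/// Hypothetical shape for illustration; not the TIGER schema.
#[derive(Debug, Clone, Copy)]
pub struct AddressRange {
    pub from_number: u32,  // house number at the start of the segment
    pub to_number: u32,    // house number at the end of the segment
    pub start: (f64, f64), // (lon, lat) of the segment start
    pub end: (f64, f64),   // (lon, lat) of the segment end
}

/// Estimate a (lon, lat) for `number` by linear interpolation.
/// Returns None when the number falls outside the range.
pub fn interpolate(range: &AddressRange, number: u32) -> Option<(f64, f64)> {
    let lo = range.from_number.min(range.to_number);
    let hi = range.from_number.max(range.to_number);
    if number < lo || number > hi {
        return None;
    }
    // A range holding a single number has no direction; use the midpoint.
    let t = if lo == hi {
        0.5
    } else {
        // Fraction of the way from `start` to `end`. Dividing by the signed
        // span also handles ranges whose numbers descend along the segment.
        (number as f64 - range.from_number as f64)
            / (range.to_number as f64 - range.from_number as f64)
    };
    Some((
        range.start.0 + t * (range.end.0 - range.start.0),
        range.start.1 + t * (range.end.1 - range.start.1),
    ))
}
```

The result is only an estimate: it assumes houses are evenly spaced along the segment, which is why interpolated ranges work as a fallback rather than a replacement for surveyed address points.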

ellenhp commented 5 months ago

Thanks for the info! That's a bummer. It'll be an interesting one to solve. I've been thinking about trying to align more closely with the Pelias ES schema, which could let me reuse its importers instead of just the spatial server and WOF hierarchy stuff. A lot of Pelias features are likely beyond the scope of airmail, though, given that I want to retain the ability to search against an index in object storage. Like, if TIGER ranges are handled at runtime instead of import time, I might be out of luck.

ellenhp commented 5 months ago

It looks like I should be able to do this with the preprocessed TIGER data designed for importing into Nominatim and a little GeoRust.
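
One way the import-time approach could look, as a rough sketch rather than what airmail actually does: expand each range into discrete (house number, point) records before indexing, so the index stays static and nothing has to be interpolated at query time. Everything here is an assumption for illustration: the `RangeRecord` type, the `Parity` enum, and the planar distance math, which treats lon/lat as flat coordinates. A GeoRust crate could likely replace the hand-rolled polyline helper.

```rust
/// Which house numbers in a range exist on this side of the street.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Parity {
    Odd,
    Even,
    All,
}

/// Hypothetical in-memory form of one preprocessed range row.
pub struct RangeRecord {
    pub from: u32,
    pub to: u32,
    pub parity: Parity,
    pub line: Vec<(f64, f64)>, // street geometry as (lon, lat) vertices
}

/// Point at `fraction` (0.0..=1.0) of the polyline's length.
/// Uses planar distance on raw coordinates, which is a simplification.
fn point_along(line: &[(f64, f64)], fraction: f64) -> Option<(f64, f64)> {
    let first = *line.first()?;
    let seg_len =
        |a: (f64, f64), b: (f64, f64)| ((b.0 - a.0).powi(2) + (b.1 - a.1).powi(2)).sqrt();
    let total: f64 = line.windows(2).map(|w| seg_len(w[0], w[1])).sum();
    if total == 0.0 {
        return Some(first);
    }
    let mut remaining = fraction.clamp(0.0, 1.0) * total;
    for w in line.windows(2) {
        let len = seg_len(w[0], w[1]);
        if len > 0.0 && remaining <= len {
            let t = remaining / len;
            return Some((
                w[0].0 + t * (w[1].0 - w[0].0),
                w[0].1 + t * (w[1].1 - w[0].1),
            ));
        }
        remaining -= len;
    }
    // Rounding can leave a tiny remainder; fall back to the last vertex.
    line.last().copied()
}

/// Expand a range into one (house number, point) pair per address.
pub fn expand(record: &RangeRecord) -> Vec<(u32, (f64, f64))> {
    let lo = record.from.min(record.to);
    let hi = record.from.max(record.to);
    let span = record.to as f64 - record.from as f64;
    (lo..=hi)
        .filter(|n| match record.parity {
            Parity::Odd => n % 2 == 1,
            Parity::Even => n % 2 == 0,
            Parity::All => true,
        })
        .filter_map(|n| {
            // Signed span keeps descending ranges pointed the right way.
            let fraction = if span == 0.0 {
                0.5
            } else {
                (n as f64 - record.from as f64) / span
            };
            point_along(&record.line, fraction).map(|p| (n, p))
        })
        .collect()
}
```

The trade-off is index size: every range becomes many documents, in exchange for keeping query-time logic simple enough to run against an index in object storage.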