openstates / enhancement-proposals

Open States Enhancement Proposals

Structured data for street addresses #21

Closed paulschreiber closed 3 years ago

paulschreiber commented 3 years ago

OSEP #6: Structured data for street addresses

Author(s) @paulschreiber
Implementer(s) TODO
Status Draft
Draft PR(s) https://github.com/openstates/enhancement-proposals/pull/18
Approval PR(s)
Created TODO
Updated TODO

Abstract

Open States currently provides a single field for address data. It would be beneficial to have separate street, city, state and zip code fields.

Within the current address field, the information available, spacing and delimiters vary by state, making parsing difficult.

Specification

Data Model Changes

Scraping

Data availability and format vary by state.

Observed cases can easily be parsed with regular expressions.
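
As a rough illustration of the regex approach, here is a minimal sketch. The pattern, the `parse_address` helper, and the assumed "street, city, ST zip" layout are hypothetical and not taken from any existing scraper; real state data would likely need per-state patterns.

```python
import re
from typing import Optional

# Hypothetical pattern: assumes "street, city, ST 12345" or "... ST 12345-6789".
ADDRESS_RE = re.compile(
    r"^\s*(?P<street>.+?),\s*"
    r"(?P<city>[^,]+?),\s*"
    r"(?P<state>[A-Z]{2})\s+"
    r"(?P<zip>\d{5}(?:-\d{4})?)\s*$"
)


def parse_address(raw: str) -> Optional[dict]:
    """Return street/city/state/zip fields, or None if the layout is not recognized."""
    match = ADDRESS_RE.match(raw)
    return match.groupdict() if match else None
```

Returning `None` for unrecognized input, rather than guessing, would let a scraper fall back to the unstructured string when a state's format does not fit.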

Rationale

This would allow for easier searching, sorting, geocoding and other data uses.

Drawbacks

Implementation Plan

I will assist with updating scrapers to output the structured data.

The address field will be a composite of the structured data (using format strings).
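
One way the composite could be built, as a sketch only: the `Address` dataclass, its field names, and the `ADDRESS_FORMAT` template are assumptions for illustration, since the proposal does not define a schema or an exact format.

```python
from dataclasses import dataclass

# Hypothetical template; the proposal does not specify the exact format.
ADDRESS_FORMAT = "{street}, {city}, {state} {zip_code}"


@dataclass
class Address:
    street: str
    city: str
    state: str
    zip_code: str

    def composite(self) -> str:
        """Build the single-string address from the structured fields."""
        return ADDRESS_FORMAT.format(
            street=self.street,
            city=self.city,
            state=self.state,
            zip_code=self.zip_code,
        )
```

Deriving the string from the structured fields keeps the existing address field available to current consumers while the separate fields become the source of truth.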

Several team members, and hopefully some community members, will help to contribute updated committee scrapers.

Copyright

This document has been placed in the public domain per the Creative Commons CC0 1.0 Universal license.

jamesturk commented 3 years ago

Thanks for this!

Initial thoughts:

The rationale could use a bit more work. This is maybe only the second or third request for this in many years, so I think clearer examples of where the current irregularities cause problems would be helpful, especially as we discuss alternatives prior to accepting this.

For example, one other alternative we discussed was keeping the address field, but having formal delimiters. If the proposal is deciding against that, I think it should be discussed in the rationale.

I'd add to Drawbacks:

And one small copy/paste typo in implementation plan, "committee scrapers".

jamesturk commented 3 years ago

Also, if you could submit this as a PR, that'd let us have the discussion on the PR itself and have updates/etc. which would be useful I think.

jamesturk commented 3 years ago

closing in favor of #22