openva / crump

A parser for the Virginia State Corporation Commission's business registration records.
https://vabusinesses.org/
MIT License

Reformat field names #24

Closed: waldoj closed this issue 10 years ago

waldoj commented 10 years ago

The field names can be made friendlier without making any serious changes.

Each file uses the same prefix for each field. For instance, 3_lp.csv has the following headers:

CORP-FORMED,CORP-ID,CORP-NAME,CORP-STATUS,CORP-STATUS-DATE,CORP-PER-DUR,CORP-INC-DATE,CORP-STATE-INC,CORP-IND-CODE,CORP-STREET1,CORP-STREET2,CORP-CITY,CORP-STATE,CORP-ZIP,CORP-PO-EFF-DATE,CORP-RA-NAME,CORP-RA-STREET1,CORP-RA-STREET2,CORP-RA-CITY,CORP-RA-STATE,CORP-RA-ZIP,CORP-RA-EFF-DATE,CORP-RA-STATUS,CORP-RA-LOC

The shared prefix adds no value once the files are broken out by table. (I'm not sure that there's any value to it when the files are combined, but that's neither here nor there.) Eliminate the uniform prefixes.

Also, there's no benefit to these being in all caps. If they're going to be of uniform case, they might as well be in lowercase.
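A minimal sketch of what the two changes above might look like, assuming the headers are available as a list of strings and that the prefix is always the first hyphen-delimited token. The function name `normalize_headers` is hypothetical and is not part of crump:

```python
def normalize_headers(headers):
    """Strip a prefix shared by every header, then lowercase the result.

    Hypothetical helper, not part of crump. Assumes the prefix is the
    first hyphen-delimited token (e.g. "CORP" in "CORP-RA-NAME").
    """
    first_tokens = {h.split("-", 1)[0] for h in headers}
    # Only strip when every header has a hyphen and shares one leading token.
    if len(first_tokens) == 1 and all("-" in h for h in headers):
        prefix = first_tokens.pop() + "-"
        headers = [h[len(prefix):] for h in headers]
    return [h.lower() for h in headers]


if __name__ == "__main__":
    sample = ["CORP-FORMED", "CORP-ID", "CORP-NAME", "CORP-RA-NAME"]
    print(normalize_headers(sample))
```

Headers that do not all share one leading token are left unstripped and only lowercased, so a file with mixed prefixes would not lose information.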

waldoj commented 10 years ago

I'm no longer convinced that we should drop these field prefixes. The use case that I have in mind is Elasticsearch, into which we'll inevitably deposit these files unceremoniously. I suspect that it will make things easier to have table identifiers as field prefixes. So, at least for the time being, I believe it's best to let this be.
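One reading of this concern, sketched under assumptions: if records from different tables end up in the same document or index, prefixed field names stay distinguishable, while stripped names can collide. The `merge_records` helper and the `LP-` prefix below are hypothetical and used only for illustration; no Elasticsearch client is involved.

```python
def merge_records(*records):
    """Combine dicts into one, refusing to overwrite a field silently.

    Hypothetical helper for illustration only.
    """
    merged = {}
    for record in records:
        overlap = merged.keys() & record.keys()
        if overlap:
            raise ValueError(f"colliding field names: {sorted(overlap)}")
        merged.update(record)
    return merged


if __name__ == "__main__":
    # Prefixed names (the "LP-" prefix is an assumption) merge cleanly.
    print(merge_records({"CORP-NAME": "Acme Inc."}, {"LP-NAME": "Acme LP"}))

    # With the prefixes stripped, both tables have a "name" field.
    try:
        merge_records({"name": "Acme Inc."}, {"name": "Acme LP"})
    except ValueError as err:
        print(err)
```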