spendright / msd

Merge SpendRight scraper data
Apache License 2.0

company name corrections #48

Closed coyotemarin closed 7 years ago

coyotemarin commented 7 years ago

This replaces the hard-coded company name corrections with company_name table entries, which can express the same corrections. This fixes #49.
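To illustrate the idea of driving corrections from table entries instead of code, here is a minimal sketch. It is not the actual msd implementation: the two-column layout of company_name (a `company` column for the preferred name and a `company_name` column for the variant spelling) and the helper names are assumptions made for this example.

```python
import sqlite3


def load_name_corrections(db):
    """Build {variant spelling: preferred name} from company_name rows.

    The column names here are hypothetical; the real schema may differ.
    """
    rows = db.execute("SELECT company_name, company FROM company_name")
    return {variant: preferred for variant, preferred in rows}


def correct_name(name, corrections):
    """Return the preferred name if one is listed, else the name unchanged."""
    return corrections.get(name, name)


# Example data in an in-memory database (made up for illustration).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE company_name (company TEXT, company_name TEXT)")
db.executemany(
    "INSERT INTO company_name VALUES (?, ?)",
    [("Acme Corp.", "ACME Corporation"), ("Acme Corp.", "Acme Corp")],
)
corrections = load_name_corrections(db)
```

With this shape, adding a correction means inserting a row rather than editing and redeploying code.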

Tried it on my dataset, and it seems to work. There are some tests, but nothing comprehensive; they're more like regression tests.

Still need to update the docs; will do that in a later pull request.

Also made the msd tool take a -V (version) option and run with zero inputs, and removed ./ from scraper IDs (see #46).
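A rough sketch of what the command-line changes could look like, using argparse. This is an illustration only: the version string, the `inputs` argument name, and the assumption that a scraper ID is derived from the input file path minus its extension are all hypothetical, not taken from the msd source.

```python
import argparse
import os.path


def make_parser():
    """Parser that accepts -V and allows running with no input files."""
    parser = argparse.ArgumentParser(prog="msd")
    # The built-in "version" action prints the string and exits.
    # "0.0.0" is a placeholder, not the real msd version.
    parser.add_argument("-V", action="version", version="%(prog)s 0.0.0")
    # nargs="*" means zero or more inputs are accepted.
    parser.add_argument("inputs", nargs="*", help="scraper database files")
    return parser


def scraper_id_from_path(path):
    """Derive an ID from a path (hypothetical rule): drop the extension,
    then normalize so a leading "./" is removed."""
    without_ext = os.path.splitext(path)[0]
    return os.path.normpath(without_ext)


args = make_parser().parse_args([])  # zero inputs is valid
```

Normalizing the path means `./foo.sqlite` and `foo.sqlite` yield the same ID, so the same scraper is not treated as two different sources depending on how the file was named on the command line.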

Made all-caps company names preferable, which is related to #47.

Also added a correction for "... Corp" (should be "Corp."), and stopped reusing the scratch table between runs (too error-prone).
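For illustration, a trailing "Corp" correction could be written as a small regular-expression substitution. This is a sketch under the assumption that only names ending in "Corp" are affected; the function name is hypothetical and the actual msd rule may be broader or narrower.

```python
import re

# Matches "Corp" as a whole word at the very end of the name.
# A name already ending in "Corp." does not match, because the
# final character is then "." rather than "p".
_TRAILING_CORP_RE = re.compile(r"\bCorp$")


def fix_trailing_corp(name):
    """Turn a trailing "Corp" into "Corp."; leave other names alone."""
    return _TRAILING_CORP_RE.sub("Corp.", name)
```

The word boundary keeps names such as "Acme Corporation" untouched, and applying the function twice gives the same result as applying it once.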