biglocalnews / warn-transformer

Consolidate, enrich and republish the data gathered by warn-scraper
https://warn-transformer.readthedocs.io
Apache License 2.0

Standardize to canonical company name #19

Open ydoc5212 opened 3 years ago

ydoc5212 commented 3 years ago

Could we use the OpenCorporates API for this? (https://api.opencorporates.com/)

ydoc5212 commented 3 years ago

Note for CT company name pre-processing: use a regex filter to remove parentheticals and the asterisks that mark updated revisions, e.g. Sodexo (Updated Notice)*
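A minimal sketch of what that regex filter might look like, assuming the only noise is parenthetical notes and trailing asterisks as in the Sodexo example. The function name `clean_company_name` and both patterns are hypothetical, not existing warn-transformer code:

```python
import re

# Hypothetical patterns, based only on the "Sodexo (Updated Notice)*" example.
# A parenthetical group such as "(Updated Notice)", plus any whitespace before it.
_PARENTHETICAL = re.compile(r"\s*\([^()]*\)")
# One or more asterisks, used to flag a revised notice.
_ASTERISK = re.compile(r"\*+")


def clean_company_name(raw: str) -> str:
    """Strip parenthetical notes and revision asterisks from a company name."""
    name = _PARENTHETICAL.sub("", raw)
    name = _ASTERISK.sub("", name)
    # Collapse any leftover runs of whitespace and trim the ends.
    return " ".join(name.split())


print(clean_company_name("Sodexo (Updated Notice)*"))
```

One caveat: this drops every parenthetical, so a name where the parentheses are part of the legal name would lose that part too. An allow-list of known note phrases like "Updated Notice" may be a safer filter if that turns out to matter.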

ydoc5212 commented 3 years ago

~- [ ] see if OpenCorporates has a Python API client~

~- [ ] send email, CC Serdar, asking for an API key~

Ash1R commented 1 year ago

How would we use this API for this task? Is it to remove things like "Updated Notice" by finding the company name in a string, or would we use CorporateGroupings to find constituent companies, or something else?