Introduction
To make filings from companies with the same CIK report back together correctly, company names were reformatted to uppercase with all punctuation removed. This is a weak implementation: a company could change its name slightly, and the name normalization would have to be changed yet again. A better approach is needed.
Proposed approach
For EDGAR, the durable identifier is the Central Index Key (CIK). It should be used as the key instead of the name, because names can change even for public companies. The present code for temporarily tracking a company is:
# If we've seen this company before, add the form; otherwise store both
# the firmographics and the initial form definition
if tmp_companies.get(company_name) is None:
    tmp_companies[company_name] = company_info
    tmp_companies[company_name]['forms'] = {accession_key: form}
else:
    tmp_companies[company_name]['forms'][accession_key] = form
The proposed change could look something like this:
# If we've seen this CIK before, add the form; otherwise store both
# the firmographics and the initial form definition
if tmp_companies.get(cik_no) is None:
    tmp_companies[cik_no] = company_info
    tmp_companies[cik_no]['forms'] = {accession_key: form}
else:
    tmp_companies[cik_no]['forms'][accession_key] = form
Since company_info is a dict that also keeps the companyName attribute, the name is still tracked there. Because the modules that consume these data require a dict keyed on companyName, a function to rekey by companyName is needed. This function would loop over all cik_no keys, replace each with the corresponding companyName, and return a new dict. The exact details of this change are left to the time of implementation.
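As a rough illustration of the rekeying step, a minimal sketch might look like the following. The function name rekey_by_company_name, the sample CIKs, names, and form contents are all hypothetical and only stand in for real data. Raising an error when two CIKs share a companyName is an assumption made here for safety, not a decided behavior.

```python
def rekey_by_company_name(companies_by_cik):
    """Return a new dict keyed on companyName instead of cik_no.

    The input dict is left unmodified; the company_info values are shared,
    not copied.
    """
    companies_by_name = {}
    for cik_no, company_info in companies_by_cik.items():
        name = company_info['companyName']
        if name in companies_by_name:
            # Assumption: two CIKs with the same name is treated as an error
            # here; the real implementation may prefer to merge or rename.
            raise ValueError(f"duplicate companyName {name!r} for CIK {cik_no}")
        companies_by_name[name] = company_info
    return companies_by_name


# Illustrative data only; these CIKs, names, and forms are made up.
tmp_companies = {
    '0000000001': {'companyName': 'EXAMPLE CORP',
                   'forms': {'acc-1': {'formType': '10-K'}}},
    '0000000002': {'companyName': 'SAMPLE INC',
                   'forms': {'acc-2': {'formType': '10-Q'}}},
}
by_name = rekey_by_company_name(tmp_companies)
```

Keeping the CIK-keyed dict as the internal structure and rekeying only at the boundary means downstream modules would not need to change until they are ready to consume CIKs directly.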