palewire / django-calaccess-raw-data

A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database
http://django-calaccess.californiacivicdata.org/
MIT License

Figure out other entity_cd values on ExpnCd #236

Closed aboutaaron closed 9 years ago

aboutaaron commented 9 years ago

The CAL-ACCESS documentation has this for entity_cd:

Values: [COM|RCP] - Recipient Committee; IND - Individual; OTH - Other

Based on the data, ExpnCd.objects.values('entity_cd').distinct() returns the following:

[{'entity_cd': u'OTH'}, {'entity_cd': u''}, {'entity_cd': u'COM'}, {'entity_cd': u'IND'}, {'entity_cd': u'RCP'}, {'entity_cd': u'PTY'}, {'entity_cd': u'SCC'}, {'entity_cd': u'0'}, {'entity_cd': u'BNM'}, {'entity_cd': u'CAO'}, {'entity_cd': u'OFF'}, {'entity_cd': u'PTH'}, {'entity_cd': u'RFD'}, {'entity_cd': u'MBR'}]
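
As a rough sketch of the comparison being made here, the documented codes can be subtracted from the distinct values in plain Python. The `DOCUMENTED` set and the `undocumented_codes` helper are names made up for this illustration and are not part of the app; `rows` is the query result pasted from above.

```python
# Codes the CAL-ACCESS documentation lists for entity_cd.
DOCUMENTED = {'COM', 'RCP', 'IND', 'OTH'}

# Result of ExpnCd.objects.values('entity_cd').distinct(), copied from above.
rows = [
    {'entity_cd': u'OTH'}, {'entity_cd': u''}, {'entity_cd': u'COM'},
    {'entity_cd': u'IND'}, {'entity_cd': u'RCP'}, {'entity_cd': u'PTY'},
    {'entity_cd': u'SCC'}, {'entity_cd': u'0'}, {'entity_cd': u'BNM'},
    {'entity_cd': u'CAO'}, {'entity_cd': u'OFF'}, {'entity_cd': u'PTH'},
    {'entity_cd': u'RFD'}, {'entity_cd': u'MBR'},
]


def undocumented_codes(rows, documented=DOCUMENTED):
    """Return the sorted non-blank codes that the documentation does not list."""
    found = {row['entity_cd'] for row in rows}
    # Drop the empty string: a blank field is missing data, not a code.
    return sorted(code for code in found - documented if code)


print(undocumented_codes(rows))
# ['0', 'BNM', 'CAO', 'MBR', 'OFF', 'PTH', 'PTY', 'RFD', 'SCC']
```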

I'm updating the choices tuple with these values:

        ('PTY', 'PTY - Unknown'),
        ('SCC', 'SCC - Unknown'),
        ('BNM', 'BNM - Unknown'),
        ('CAO', 'CAO - Unknown'),
        ('OFF', 'OFF - Unknown'),
        ('PTH', 'PTH - Unknown'),
        ('RFD', 'RFD - Unknown'),
        ('MBR', 'MBR - Unknown'),
        ('0', '0 - Unknown'),

Just need to figure out what they represent.

aboutaaron commented 9 years ago

Possibly related to #219

rkiddy commented 9 years ago

These are from pages 9 to 11 of the cal_format_201.pdf file which comes with the raw data from the SoS office.

This document specifies what these mean in which forms. Now, verifying that the codes are only used in the correct forms with the expn table is an exercise left for the reader, or to be done later.

mysql> select entity_cd, count(*) from expn group by entity_cd order by entity_cd;
+-----------+----------+
| entity_cd | count(*) |
+-----------+----------+
|           |   154777 |
| 0         |       85 |
| BNM       |      975 | -> Ballot Measure's Name/Title
| CAO       |        5 | -> Cand/Officeholder
| COM       |  1397113 | -> Committee
| IND       |   742666 | -> Person (spending > $5000)
| MBR       |        2 | -> Member of Association
| OFF       |       59 | -> Officer
| OTH       |  2175361 | -> Other
| PTH       |        2 | -> perhaps an error (PTY = Party, PTN = Partner)
| PTY       |    19516 | -> Party
| RCP       |    90403 | -> Recipient Committee
| RFD       |        2 | -> a "Type of Payment" = "returned contribution" mistakenly used, see page 13.
| SCC       |      954 | -> Small Contributor Committee
+-----------+----------+

aboutaaron commented 9 years ago

You're the man, Ray. Thanks for this. We should probably have this tuple live somewhere central so we can reference it from multiple models.


Aaron Williams Journalist/Developer @aboutaaron

On Fri, Mar 20, 2015 at 5:26 PM -0700, "Ray Kiddy" notifications@github.com wrote:
