Open helgridly opened 4 years ago
@helgridly
Operating Expenditures The amount reported on the Detailed Summary Page can be larger than the sum of the itemized operating expenditures disclosed on Schedule B. Committees are not required to itemize payments to a person/vendor that aggregate to $200 or less in a calendar year, but they are required to include those expenditures in the amount disclosed on the Detailed Summary Page.
Independent Expenditures The amount reported on the Detailed Summary Page can be larger than the sum of the itemized independent expenditures disclosed on Schedule E. Committees are not required to itemize payments to a person/vendor that aggregate to $200 or less in a calendar year, but they are required to include those expenditures in the amount disclosed on the Detailed Summary Page.
Other Disbursements Currently we don't have an "other disbursements" bulk data file.
Committees transferring funds to themselves This is most likely filer error where the filing committee has disclosed their own name in the payee section instead of the receiving committee's name.
Thanks -- this is very helpful. I can slurp the other disbursements from the API instead.
One followup question: is there a way to calculate the aggregate "other disbursements" amount for some committee from other bulk data? Maybe TTL_DISB
on the PAC and party summary file minus some other values or aggregates? If not I can get it from the API too, just wanted to check.
You're welcome. Is this what you are looking for?
https://www.fec.gov/files/bulk-downloads/2016/committee_summary_2016.csv -See Column AB (OTHER_DISB) https://www.fec.gov/campaign-finance-data/committee-summary-file-description/
Find this file on this page: https://www.fec.gov/data/browse-data/?tab=committees
Oh, interesting - I didn't know those were there too. That looks very similar to the PAC and party summary file bulk data, with a few minor differences in columns. Are those CSVs a better source of truth than the bulk data?
The committee summary file includes candidate committees (committee designation "P" and "A") while the PAC and party summary file (webk) does not. WEBK does include candidate joint fundraising committees (committee designation "J"). The committee summary file includes more data fields. We began generating the committee summary file during the 2007-08 cycle. Both files are generated nightly.
At the time we decided to keep generating WEBK for then current users of that file. It might be time to consider retiring WEBK.
Here's a field comparison of the two files: committee summary | PAC and committee summary (webk) |
---|---|
Link_Image | |
CMTE_ID | CMTE_ID |
CMTE_NM | CMTE_NM |
CMTE_TP | CMTE_TP |
CMTE_DSGN | CMTE_DSGN |
CMTE_FILING_FREQ | CMTE_FILING_FREQ |
CMTE_ST1 | |
CMTE_ST2 | |
CMTE_CITY | |
CMTE_ST | |
CMTE_ZIP | |
TRES_NM | |
CAND_ID | |
FEC_ELECTION_YR | |
INDV_CONTB | INDV_CONTRIB |
PTY_CMTE_CONTB | |
OTH_CMTE_CONTB | OTHER_POL_CMTE_CONTRIB |
TTL_CONTB | |
TRANF_FROM_OTHER_AUTH_CMTE | TRANS_FROM_AFF |
OFFSETS_TO_OP_EXP | |
OTHER_RECEIPTS | |
TTL_RECEIPTS | TTL_RECEIPTS |
TRANF_TO_OTHER_AUTH_CMTE | TRANF_TO_AFF |
OTH_LOAN_REPYMTS | |
INDV_REF | INDV_REFUNDS |
POL_PTY_CMTE_REF | OTHER_POL_CMTE_REFUNDS |
TTL_CONTB_REF | |
OTHER_DISB | |
TTL_DISB | TTL_DISB |
NET_CONTB | |
NET_OP_EXP | |
COH_BOP | COH_BOP |
CVG_START_DT | |
COH_COP | COH_COP |
CVG_END_DT | CVG_END_DT |
DEBTS_OWED_BY_CMTE | DEBTS_OWED_BY |
DEBTS_OWED_TO_CMTE | |
INDV_ITEM_CONTB | |
INDV_UNITEM_CONTB | |
OTH_LOANS | |
TRANF_FROM_NONFED_ACCT | |
TRANF_FROM_NONFED_LEVIN | |
TTL_NONFED_TRANF | NONFED_TRANS_RECEIVED |
LOAN_REPYMTS_RECEIVED | |
OFFSETS_TO_FNDRSG | |
OFFSETS_TO_LEGAL_ACCTG | |
FED_CAND_CONTB_REF | |
TTL_FED_RECEIPTS | |
SHARED_FED_OP_EXP | |
SHARED_NONFED_OP_EXP | NONFED_SHARE_EXP |
OTHER_FED_OP_EXP | |
TTL_OP_EXP | |
FED_CAND_CMTE_CONTB | CONTRIB_TO_OTHER_CMTE |
INDT_EXP | IND_EXP |
COORD_EXP_BY_PTY_CMTE | PTY_COORD_EXP |
LOANS_MADE | |
SHARED_FED_ACTVY_FED_SHR | |
SHARED_FED_ACTVY_NONFED | |
NON_ALLOC_FED_ELECT_ACTVY | |
TTL_FED_ELECT_ACTVY | |
TTL_FED_DISB | |
CAND_CNTB | CAND_CONTRIB |
CAND_LOAN | CAND_LOANS |
TTL_LOANS | TTL_LOANS_RECEIVED |
OP_EXP | |
CAND_LOAN_REPYMNT | CAND_LOAN_REPAY |
TTL_LOAN_REPYMTS | LOAN_REPAY |
OTH_CMTE_REF | |
TTL_OFFSETS_TO_OP_EXP | |
EXEMPT_LEGAL_ACCTG_DISB | |
FNDRSG_DISB | |
ITEM_REF_REB_RET | |
SUBTTL_REF_REB_RET | |
UNITEM_REF_REB_RET | |
ITEM_OTHER_REF_REB_RET | |
UNITEM_OTHER_REF_REB_RET | |
SUBTTL_OTHER_REF_REB_RET | |
ITEM_OTHER_INCOME | |
UNITEM_OTHER_INCOME | |
EXP_PRIOR_YRS_SUBJECT_LIM | |
EXP_SUBJECT_LIMITS | |
FED_FUNDS | |
ITEM_CONVN_EXP_DISB | |
ITEM_OTHER_DISB | |
SUBTTL_CONVN_EXP_DISB | |
TTL_EXP_SUBJECT_LIMITS | |
UNITEM_CONVN_EXP_DISB | |
UNITEM_OTHER_DISB | |
TTL_COMMUNICATION_COST | |
COH_BOY | |
COH_COY | |
ORG_TP |
Paul - this is super helpful. I really appreciate the deep dive here. It sounds like downloading the CSV files (where available) will provide me with more complete data than the bulk data files. Would you say that's generally correct?
Ultimately all I'm trying to do here is populate a relational database so I can join across receipts, committees, and expenditures. Is there any chance of getting a snapshot of your database?
Hussein - I haven't forgotten about your request. I'm not sure if it's possible. I'll talk with our DBA team about it tomorrow.
Thanks. I know it's a long shot, but I figured I'd ask :)
Hi Paul - I'm about to start writing scripts to slurp all of schedule_a and schedule_b from the API (for the 2016 cycle). Let me know if a database snapshot is possible (or read-only credentials, or something) if you want to save me time and your server some effort :)
Hi, Hussein - We should be able to provide you with 2016 SA and SB data. I'm CCing someone (@rjayasekera ) from our database team. He might have some questions.
Hi again, Hussein. We can get the files to you by Tuesday (7/7) afternoon. Much of the team is currently involved in PI planning and this is a short week because of the holiday.
This is amazing! Thank you!
@josephkuan @rohan @paul good morning. could you please help to review the followings: @josephkuan please copy to the FTP sites from 029's /u07/dzhang/. It has join "ofec_committee_history_mv to get 61 additional columns' API query data which is same with our sqls returns from API sql tracking. And it has the full data of fec_fitem_sched_a/b_2015_2016 as usual. Same rows account but with more data of API return. There is a README.txt on how to use it. -rw-r--r-- 1 oracle oracle 4477615794 Jul 2 09:06 sa_2016.csv.gz (4Gb) -rw-r--r-- 1 oracle oracle 1109410451 Jul 2 09:08 sb_2016.csv.gz (1Gb) -rw-r--r-- 1 oracle oracle 555 Jul 2 09:25 s_2016README.txt
Hi, @helgridly . The DBA was able to get the data files together this morning. There are three files. Their names and location are below. Please let us know if you have any questions.
https://www.fec.gov/files/bulk-downloads/index.html?prefix=bulk-downloads/sch-a-dump/
2020-07-02T13:55:17.000Z 584 Bytes s_2016README.txt 2020-07-02T13:45:26.000Z 4.48 GB sa_2016.csv.gz 2020-07-02T13:53:07.000Z 1.11 GB sb_2016.csv.gz
This is great - thank you so much. Do you have the datatypes for the columns? I can probably figure them out but if you have a CREATE TABLE statement handy that'd save me some time.
Hussein, the read me file (s_2016README.txt) now includes the create table scripts.
This is awesome -- thank you. I managed to get it loaded in and I'm happily digging away now :)
I'm aware I'm asking a ton of questions. I am trying to learn via the guides to spare you as many as I can. If it would be better for me to go ask them at the Google Group I'd be happy to do so.
That said -- I am digging through the Schedule A data and seeing some unexpected transaction types in the receipt_tp
column. I know the transaction types are documented here, but can you tell me more about how a transaction gets its type assigned? It doesn't look like they're provided by the filter, so I assume it's done by the FEC on receipt; is there documentation on that process? (I'd also be happy looking at code, if that's easier.) I'm having trouble understanding some of the assignments I'm seeing and would like to better understand whether they're misclassifications or me missing some nuance.
Hussein, the attached pfd describes how we make categorization decisions for transactions disclosed on schedules A and B of Forms 3, 3P and 3X.
Hi, I'm back. Sorry :)
So I've been poking through the 2016 dump you gave me and generating a list of questions. One huge thing just jumped out at me. The vast majority of records for Schedule A and Schedule B have their transaction types set to NULL (I'm checking the columns receipt_tp
and disb_tp
respectively). Did they just never get coded? Should I attempt to code them myself based on the line number, per the transaction categorization pdf you provided, or will that just lead me to madness? Or should I just use the line number and skip the transaction type entirely?
edit:\ I got an answer via the mailing list - transaction codes have been deprecated for some time now. I should be using line numbers and disbursement category codes instead. Feel free to ignore this question :)
Hi, Hussein.
Many of the transactions are not coded based on our coding procedures. For example, transactions reported on Line 11c (contributions from other committees/PACs) of Form 3 are generally not coded. These transactions are instead coded on the PAC reports, Line 23 (Contributions to Federal Candidates/Committees and Other Political Committees) of Form 3X. Another example, operating expenditures are not coded. Our coding scheme is more focused on receipts because most of our registrations focus on receipts.
I’ll go though this more systematically and post here. This will take me a few days.
Hi folks,
I've loaded the bulk data for the 2015-2016 period into a local database using the schemas provided and am trying to align the numbers in the bulk data with the corresponding aggregate numbers in the financial summary on the web interface. I'm finding some minor discrepancies and would like to know where I'm going wrong.
Please feel free to just link me to the code for these aggregations, if it's available -- that'd probably help answer other questions I might have in the future too :)
As an example, I'm using committee
C00547349
and the 2016 cycle, available on the web interface here.1. Operating Expenditures
Using the bulk data labelled "operating expenditures" (oppexp16.zip):
gives me 39,155,429.75, whereas the web interface says total operating expenditures lists 39,178,395.10.
2. Independent Expenditures
Using the bulk data labelled "Contributions from committees to candidates & independent expenditures" (pas216.zip):
gives me 9,784,721.00 vs the web interface's 9,784,794.95 (so close!).
3. Other Disbursements
The web interface lists "Other disbursements". How do I find these in the bulk data?
4. Committees transferring funds to themselves
In the "Any transaction from one committee to another" data (oth16.zip), there are plenty of records where
cmte_id
andother_id
are identical. Can you give me a sense of what these transactions might represent?Thanks so much! I really appreciate your work!