niannucci / sds192-mp2

A project I completed in Introduction to Data Science (SDS192) at Smith College. It uses dplyr to wrangle election contribution data and compares, by political party, the contributions spent supporting and opposing candidates in each state.

sds192-mp2

Mini-project 2:

See https://beanumber.github.io/sds192/mod_data.html for the project instructions.

load("house_elections.rda")
load("candidates.rda")
load("committees.rda")
load("contributions.rda")

Verify that your data looks like this:

library(tidyverse)
glimpse(house_elections)
## Observations: 2,178
## Variables: 10
## $ fec_id         <chr> "B2CA08156", "H0AK00097", "H0AL01030", "H0AL020...
## $ state          <chr> "CA", "AK", "AL", "AL", "AL", "AL", "AL", "AR",...
## $ district       <chr> "08", "00", "01", "02", "05", "07", "07", "01",...
## $ incumbent      <chr> "FALSE", "FALSE", "FALSE", "TRUE", "TRUE", "TRU...
## $ candidate_name <chr> "Mitzelfelt, Brad", "Cox, John R.", "Gounares, ...
## $ party          <chr> "R", "R", "R", "R", "R", "D", "R", "R", "R", "R...
## $ primary_votes  <int> 8801, 11179, 3854, 0, 65163, 0, 11537, 0, 0, 0,...
## $ runoff_votes   <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
## $ general_votes  <int> 0, 0, 0, 180591, 189185, 232520, 73835, 138800,...
## $ ge_winner      <chr> "", "", "", "W", "W", "W", "N", "W", "W", "W", ...
glimpse(candidates)
## Observations: 5,628
## Variables: 15
## $ cand_id                <chr> "H0AK00089", "H0AK00097", "H0AL00016", ...
## $ cand_name              <chr> "CRAWFORD, HARRY T JR", "COX, JOHN ROBE...
## $ cand_party_affiliation <chr> "DEM", "REP", "UNK", "REP", "REP", "DEM...
## $ cand_election_yr       <int> 2010, 2012, 2010, 2012, 2012, 2008, 201...
## $ cand_office_state      <chr> "AK", "AK", "AL", "AL", "AL", "AL", "AL...
## $ cand_office            <chr> "H", "H", "H", "H", "H", "H", "H", "H",...
## $ cand_office_district   <int> 0, 0, 7, 1, 2, 5, 5, 5, 5, 5, 6, 7, 7, ...
## $ cand_ici               <chr> "C", "C", "O", "C", "I", "C", "C", "I",...
## $ cand_status            <chr> "P", "N", "C", "C", "C", "C", "C", "C",...
## $ cand_pcc               <chr> "C00466698", "C00525261", "C00464040", ...
## $ cand_st1               <chr> "4350 BUTTE CIR", "PO BOX 1092", "PO BO...
## $ cand_st2               <chr> "", "", "", "", "", "", "", "", "SUITE ...
## $ cand_city              <chr> "ANCHORAGE", "ANCHOR POINT", "BIRMINGHA...
## $ cand_state             <chr> "AK", "AK", "AL", "AL", "AL", "AL", "AL...
## $ cand_zip               <int> 99504, 8388607, 35201, 36561, 36106, 35...
glimpse(committees)
## Observations: 14,454
## Variables: 15
## $ cmte_id                <chr> "C00000042", "C00000059", "C00000422", ...
## $ cmte_name              <chr> "ILLINOIS TOOL WORKS INC. FOR BETTER GO...
## $ tres_name              <chr> "LYNCH, MICHAEL J. MR.", "GREG SWARENS"...
## $ cmte_st1               <chr> "3600 WEST LAKE AVENUE", "2501 MCGEE", ...
## $ cmte_st2               <chr> "", "MD#288", "SUITE 600", "", "", "", ...
## $ cmte_city              <chr> "GLENVIEW", "KANSAS CITY", "WASHINGTON"...
## $ cmte_state             <chr> "IL", "MO", "DC", "OK", "KS", "IN", "DC...
## $ cmte_zip               <int> 60026, 64108, 20001, 73107, 66612, 4620...
## $ cmte_dsgn              <chr> "B", "U", "B", "U", "U", "U", "B", "B",...
## $ cmte_type              <chr> "Q", "Q", "Q", "N", "Q", "Q", "Q", "Q",...
## $ cmte_party_affiliation <chr> "", "UNK", "", "", "UNK", "", "UNK", "U...
## $ cmte_filing_freq       <chr> "Q", "M", "M", "Q", "Q", "Q", "M", "M",...
## $ org_type               <chr> "C", "C", "M", "L", "T", "M", "M", "L",...
## $ connected_org_name     <chr> "ILLINOIS TOOL WORKS INC.", "", "AMERIC...
## $ cand_id                <chr> "", "", "", "", "", "", "", "", "", "",...
glimpse(contributions)
## Observations: 396,369
## Variables: 22
## $ cmte_id          <chr> "C00478404", "C00140855", "C00140855", "C0014...
## $ amndt_ind        <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", ...
## $ rpt_type         <chr> "M3", "M3", "M3", "M3", "M3", "M3", "M3", "M3...
## $ transaction_pgi  <chr> "P", "P", "P", "P", "P", "P", "G", "P", "P", ...
## $ image_num        <chr> "11930476751.0", "11930476826.0", "1193047682...
## $ transaction_type <chr> "24K", "24K", "24K", "24K", "24K", "24K", "24...
## $ entity_type      <chr> "COM", "CCM", "CCM", "CCM", "CCM", "CCM", "CC...
## $ name             <chr> "KLINE FOR CONGRESS", "TIM RYAN FOR U.S. CONG...
## $ city             <chr> "BURNSVILLE", "WASHINGTON", "WASHINGTON", "BO...
## $ state            <chr> "MN", "DC", "DC", "MD", "ND", "MI", "MN", "IA...
## $ zip_code         <chr> "55337", "20013", "20005", "20716", "58106", ...
## $ employer         <chr> "", "", "", "", "", "", "", "", "", "", "", "...
## $ occupation       <chr> "", "", "", "", "", "", "", "", "", "", "", "...
## $ transaction_dt   <chr> "02252011", "02012011", "02012011", "02222011...
## $ transaction_amt  <dbl> 2400, 1000, 1000, 2500, 1000, 5000, 1000, 100...
## $ other_id         <chr> "C00326629", "C00373464", "C00289983", "C0014...
## $ cand_id          <chr> "H8MN06047", "H2OH17109", "H4KY01040", "H2MD0...
## $ tran_id          <chr> "B37FBC79414E54DD7A1C", "38595006", "38595007...
## $ file_num         <int> 717033, 717042, 717042, 717042, 717043, 71704...
## $ memo_cd          <chr> "", "", "", "", "", "", "", "", "", "X", "", ...
## $ memo_text        <chr> "", "", "", "", "", "", "", "", "", "CHECK 23...
## $ sub_id           <dbl> 4.03182e+18, 4.03172e+18, 4.03172e+18, 4.0317...

Make sure that the row and column counts match!
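
The comparison can also be done in code rather than by eye. The following is a minimal sketch, not part of the project instructions: `check_dims` and the `expected` list are hypothetical names introduced here, and the counts are copied from the `glimpse()` output above. It uses base R only and skips any table that has not been loaded.

```r
# Hypothetical helper (an assumption, not from the project instructions):
# returns TRUE when a data frame has exactly the expected number of
# rows and columns.
check_dims <- function(df, expected_rows, expected_cols) {
  identical(dim(df), c(as.integer(expected_rows), as.integer(expected_cols)))
}

# Expected (rows, columns) for each table, copied from the glimpse() output.
expected <- list(
  house_elections = c(2178, 10),
  candidates      = c(5628, 15),
  committees      = c(14454, 15),
  contributions   = c(396369, 22)
)

# Report on each table; tables that are not in the workspace are skipped.
for (name in names(expected)) {
  if (exists(name)) {
    ok <- check_dims(get(name), expected[[name]][1], expected[[name]][2])
    message(name, ": ", if (ok) "dimensions match" else "dimensions DO NOT match")
  } else {
    message(name, ": not loaded")
  }
}
```

Run this after the four `load()` calls. Any table reported as not matching was probably loaded from a different version of the `.rda` file.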