voteview / WebVoteView

Web-based rollcall vote visualization software
MIT License

Bill number not available in dtl descriptions for some congresses #338

Open adamboche opened 6 years ago

adamboche commented 6 years ago

The bill_number field is populated by parsing the DTL files and by automated guessing. For certain congresses the DTLs contain no bill numbers, so we have no bill numbers for those votes. In particular, the DTLs for Congresses 99 and 100 have no bill number info at all. These congresses predate the clerk info coming online, so filling the gap may need another source, e.g. viewing the original source documents.

library(tidyverse)

# Roll call data file
filename <- 'HSall_rollcalls.csv'

rollcalls <- read_csv(filename)

# Count roll calls with a non-missing bill_number per congress, fewest first
rollcalls %>%
    mutate(has_bill_number = !is.na(bill_number)) %>%
    group_by(congress) %>%
    summarize(count_bill_numbers = sum(has_bill_number)) %>%
    arrange(count_bill_numbers)
# A tibble: 115 x 2
   congress count_bill_numbers
      <int>              <int>
 1       99                  0
 2      100                  0
 3       14                  2
 4        8                 30
 5        4                 34
 6        3                 37
 7       13                 51
 8        6                 54
 9        2                 61
10        1                 68
# ... with 105 more rows
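
Raw counts can be hard to compare across congresses, since early congresses held far fewer roll calls than later ones. A minimal sketch of a share-based summary follows, assuming only the `congress` and `bill_number` columns used above. The helper name `bill_number_coverage` and the toy rows are hypothetical and are not taken from the real data.

```r
library(dplyr)

# Hypothetical helper: per-congress coverage of the bill_number field.
bill_number_coverage <- function(rollcalls) {
  rollcalls %>%
    group_by(congress) %>%
    summarize(
      n_rollcalls = n(),                          # all roll calls in the congress
      n_with_bill = sum(!is.na(bill_number)),     # roll calls with a bill number
      share_with_bill = mean(!is.na(bill_number)) # fraction with a bill number
    ) %>%
    arrange(share_with_bill)
}

# Toy rows for illustration only; these are made up, not real Voteview data.
toy <- tibble(
  congress = c(1L, 1L, 99L, 99L, 99L, 99L),
  bill_number = c("HR1", NA, NA, NA, NA, NA)
)

coverage <- bill_number_coverage(toy)
print(coverage)
```

Sorting by `share_with_bill` instead of the raw count puts congresses with no coverage at all first, regardless of how many roll calls they held.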

aaronrudkin commented 6 years ago

Ideal entry point for an external student, since this is basically going through archival documents to match information.
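
As a possible starting point for that matching work, the roll calls that still lack a bill number could be pulled into a worklist. This is only a sketch: the helper name `missing_bill_numbers` is hypothetical, the default congresses are the two named in this issue, and the toy rows are made up for illustration.

```r
library(dplyr)

# Hypothetical helper: roll calls in the given congresses with no bill number,
# i.e. the rows that would need to be matched against archival documents.
missing_bill_numbers <- function(rollcalls, congresses = c(99, 100)) {
  rollcalls %>%
    filter(congress %in% congresses, is.na(bill_number))
}

# Toy rows for illustration only; these are made up, not real Voteview data.
toy <- tibble(
  congress = c(98L, 99L, 99L, 100L),
  bill_number = c("HR1", NA, NA, NA)
)

worklist <- missing_bill_numbers(toy)
print(worklist)
```

With the real file, the same helper could be applied to the `rollcalls` data frame read from `HSall_rollcalls.csv`, and the result written out with `readr::write_csv()` for whoever does the archival lookup.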