Data4Democracy / election-transparency

A Data4Democracy community working to make elections and elections data more transparent

California - Collect historical voter registration data #7

Closed by chrisdick14 7 years ago

chrisdick14 commented 7 years ago

Data: http://www.sos.ca.gov/elections/voter-registration/voter-registration-statistics

ptrbates commented 7 years ago

I'd like to get on board with this. Is there a description of the task involved? What exactly does "collect" mean, and what data from the link are we interested in? How far back?

chrisdick14 commented 7 years ago

That is great! Basically, we are looking for someone to download the data and put it into a format that matches the data at the following link: https://data.world/data4democracy/election-transparency/file/PartyRegistration.csv

We are looking to collect data as far back as possible, but start with the most current data (other than the 2016 election) and collect at the county level. Using the 15-day report is perfect. Please let me know if you have any further questions or need any help.

ptrbates commented 7 years ago

To clarify: the most current data before November 8, at the county level? Should that include primaries, etc., or just the general/presidential elections?

chrisdick14 commented 7 years ago

Let's start with the general elections. We can go back and fill in the primaries later once we get a first set going.

ptrbates commented 7 years ago

Does this look like I'm on the right track? If so, I'll keep going with other General Elections. Let me know if anything should be changed or done differently. Is .txt the right file type? 2014_general_county.txt

chrisdick14 commented 7 years ago

That looks great! You can stay with .txt if you would like. We have been tending towards .csv, but I don't think it matters much as long as you let us know what the delimiter is.
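For anyone following along, the delimiter point can be sketched in a few lines of Python. This is only an illustration, not the script used in this repo: the function name `txt_to_csv`, the assumption that the .txt file is tab-delimited, and the sample row are all hypothetical placeholders.

```python
import csv
import io


def txt_to_csv(text, delimiter="\t"):
    """Re-encode delimiter-separated text as comma-separated CSV text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Strip stray whitespace around each field before writing it out.
        writer.writerow([field.strip() for field in row])
    return out.getvalue()


# Placeholder rows (not real registration figures), assumed tab-delimited.
sample = "County\tDemocratic\tRepublican\nExample County\t1,000\t900\n"
converted = txt_to_csv(sample)
print(converted)
```

Note that the `csv` writer quotes any field that itself contains a comma (such as a count written with a thousands separator), which is one reason to state the delimiter explicitly rather than splitting lines by hand.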

ptrbates commented 7 years ago

csv is fine! The comment box wouldn't let me attach one. I'm done with general elections back to 2002. The 2000 data is in a PDF, and I'm looking into how to extract it. How should I submit what I have? Do you have a naming preference?

chrisdick14 commented 7 years ago

Go ahead and upload the files and submit a pull request in the /scripts folder. No naming preference, other than having the state name and years in there, since we will be joining it with a larger dataset.
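As a rough sketch of the naming advice, assuming Python: the helpers below put the state and year range into a file name and tag each row with its state and year so the rows can later be joined with a larger dataset. The names `registration_filename` and `tag_rows`, the file-name pattern, and the `State`/`Year` column names are hypothetical, not a convention defined by the project.

```python
def registration_filename(state, years, election="general"):
    """Build a file name containing the state and the range of years covered."""
    years = sorted(years)
    span = str(years[0]) if len(years) == 1 else f"{years[0]}-{years[-1]}"
    slug = state.lower().replace(" ", "_")
    return f"{slug}_{election}_{span}.csv"


def tag_rows(rows, state, year):
    """Add State and Year keys to each row dict so rows can be joined later."""
    return [{"State": state, "Year": year, **row} for row in rows]


name = registration_filename("California", [2014, 2002])
tagged = tag_rows([{"County": "Example County"}], "CA", 2014)
print(name)
print(tagged)
```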

chrisdick14 commented 7 years ago

It looks like the PDF contains text-based tables, so Tabula would probably work for extracting them.

ptrbates commented 7 years ago

Thanks, Tabula worked well. Sorry to need some hand-holding on the next part (definitely a beginner): the scripts folder is telling me I don't have permission to upload files, and the pull request box is greyed out.

chrisdick14 commented 7 years ago

Nope, not a beginner thing. I needed to add you to our org. You should be getting an invite shortly. Sorry about that. Let me know if that doesn't work. If it doesn't, you may need to fork the repo, upload the files to your fork, then open a pull request.

ptrbates commented 7 years ago

Ok, done. Thanks for the help!

chrisdick14 commented 7 years ago

The data just need to be read into the R package. Closing issue.