Open chrisdick14 opened 7 years ago
Attached: a file containing the availability of granular data by state - not necessarily district. Arizona data - not cleaned
Awesome work! We will have to start assigning people to each of the states we can pull.
I'm interested in jumping in and helping. I was thinking I could start with Texas, since that's where I am, but I'm happy to take on some other states too. That being said, I had some questions...
Hi,
I'm not really sure where the best place to put these would be, but I have voter files for varying recent years (farthest back ~2008) for CO, CT, DC, DE, FL, GA, MI, NC, OK, RI, UT, and WA. Would that be helpful?
You can find the current NC registered voter info here: https://data.world/kflanagan/nc-statewide-voter-info The SQL statement to create the columns is posted along with it.
@KirkHadley and @kflanagan we can definitely use this information. However, this is slightly different data than we have been using in the past so let me think about where we want to store it, and how it will fit into our current structure.
@chrisdick14 I actually have that file for every NC election since 2005. Should I upload it to data.world? @kflanagan Has any thought been put into standardizing election results at the state level? If so, I have all the states' state-level election results at the district level and am more than happy to share.
@KirkHadley and @chrisdick14 The source for the data I posted is the state; here's their link: https://s3.amazonaws.com/dl.ncsbe.gov/data/ncvoter_Statewide.zip I don't know if there are efforts to standardize, but given the state of things at the federal level I doubt it.
@KirkHadley and @kflanagan there are two things we can do for these data. (1) You can post them yourself on data.world and tag them with 'd4d' and 'election transparency' (as well as any other tags you want to use), or (2) we can have you send us the data and we can upload directly to the d4d election transparency data.world page. I am totally fine either way. I agree about the standardization. The Open Elections Project has been doing some of this work: https://github.com/openelections/openelections-results-nc
One thing we could do, if we can get results from several states, is agree on a format moving forward and put something out there, if that is something you all are interested in.
Given that I had already put the NC data up on data.world, I just went and tagged them with d4d and election transparency. That'll get us started. I don't know what's best. The states keep their own formats, so is it a good use of time to re-format every time they update the data? I think that NC updates weekly. Could we use data.world's SQL-like queries to present the data in a way that would allow folks to query across states?
@kflanagan I think that is a great idea. Especially with data that are coming out that regularly. I think if there were some 'clean' datasets we needed for projects we could pull the requisite data from your larger file and post it in the cleaned format that we end up using for analysis.
This is really fantastic. We are having a hackathon this weekend and who knows, someone may end up using these data in their analyses!
I found a flaw in my logic. Big data sets don't seem to work so well on data.world: the file is too large to extract from the archive. Maybe I'll try to upload the raw data, but of course the uncompressed file may be too big to upload raw. Perhaps we need to point at the county-by-county info for NC. I'll take a look at it this evening.
Let me know how big the data set would be. We can chat with the data.world folks and see if there is a workaround. If not, we may have some other options that I am exploring now to upload the data and make it public.
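One possible way around the size limit is to split the statewide file into one smaller file per county before uploading. This is only a minimal sketch: it assumes a tab-delimited file with a header row and a county column called `county_desc`, and the function name `split_by_county` is made up for illustration. Check the actual layout of the state's file before relying on it.

```python
import csv
import re
from pathlib import Path


def split_by_county(rows, out_dir, county_field="county_desc", delimiter="\t"):
    """Write one delimited file per county and return the row count per county.

    `rows` is any iterable of text lines (an open file, a list of strings).
    The column name and delimiter are assumptions about the input layout.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    reader = csv.DictReader(rows, delimiter=delimiter)
    handles, writers, counts = {}, {}, {}
    try:
        for row in reader:
            county = row[county_field].strip()
            if county not in writers:
                # Keep file names safe: anything unusual becomes an underscore.
                safe_name = re.sub(r"[^A-Za-z0-9_-]", "_", county)
                fh = open(out_dir / f"{safe_name}.tsv", "w", newline="", encoding="utf-8")
                writer = csv.DictWriter(fh, fieldnames=reader.fieldnames, delimiter=delimiter)
                writer.writeheader()  # each county file repeats the header row
                handles[county], writers[county] = fh, writer
                counts[county] = 0
            writers[county].writerow(row)
            counts[county] += 1
    finally:
        # Close every per-county file even if a row fails to parse.
        for fh in handles.values():
            fh.close()
    return counts
```

Because it reads one row at a time, this should work on a file too large to load into memory. It keeps one output file open per county, which is fine for around a hundred counties.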
So I have voter files on a good number of states (I'm a squirrel with these things). Details on sizes and such:
@KirkHadley is that the voter file that's found at https://s3.amazonaws.com/dl.ncsbe.gov/data/ncvoter_Statewide.zip but with multiple years?
Ok, those are going to be too big for data.world I think. We are going to have to come up with another solution to host these. Let me do some asking around and see what we can find.
Hi, I'm Edward. I'm new and happy to help. To get rolling I scraped the relevant PDFs off of the DC BoE site in the link above to see how hard the PDFs are to parse. The answer is (predictably) not terribly easy, but possible.
Given that, what data do we want?
I also saw on their website that you can get the whole voter file on CD-ROM (yeah) for $2 (yeah). It's not clear how it handles formerly registered voters, but it's as granular as you can get. Since it's individuals, though, it's at least dubious to republish it unaggregated, even though it's all public data. I'm not sure we want it, but it's entirely possible to assemble a national voter file; e.g. you can grab the Ohio CD CSVs at will.
Would like a time-series for redistricting work.