BrandNewCongress / welcome

The definitive guide to working on the Brand New Congress tech team

Collect congressional data and create an interface to interact with it #1

Closed (saikat closed this 7 years ago)

saikat commented 8 years ago

Ryan Hutto had started on this. The proposal is a bit vague, but we want to start collecting our important congressional data to figure out how our research team will be working.

dcampana commented 8 years ago

I have set up the research team with a spreadsheet for the preliminary research on key issues, but we need some sort of system that allows for more robust answers than Support/Doesn't Support, especially for things that can't be clearly defined or need more detail. I thought a wiki might work, but that might also get unwieldy, with 600 different pages for such a relatively small amount of information.

saikat commented 8 years ago

My inclination here is that, since the scope of this is still a little vague (e.g. we don't have a clear sense of what we are tracking and where existing systems will break down for us), we should use an off-the-shelf solution until it becomes unwieldy. I want to make sure that if we do build something here, it is the right thing. We should also figure out which issues with the existing solutions we would need to solve. For example, the 600-pages problem with a wiki isn't necessarily inherent to wikis -- it's inherent to us having lots of data. Maybe the real issue is that wikis don't have great search capability?

Some possible solutions:

I can ask a couple of my friends who have done a lot of political research what system they use to keep track of the data they are collecting, in case that helps!


dcampana commented 8 years ago

We will need a more robust system within the coming month or so. Right now we are using this spreadsheet https://docs.google.com/spreadsheets/d/12m_w1B9Ww0kUiKnD4hDGYeN-DoHAuVLTqc1_nBdyfGY/edit#gid=0 and I feel it's really limited in access management, data logging, and data entry. As it fills up, we are going to need a more robust system to describe the nuances of issues marked unclear.

Here are the features I would like:

  • Access management to prevent people from maliciously removing or editing data
  • Keyword sorting so we can sort by things such as Scandal, Corruption, etc.
  • Ability to easily add more politicians when we need to add people who are running against the existing candidates

I think a wiki is the best approach unless we can make a custom-tailored piece of software, but that seems like overkill for this. Since it's a Google Doc, we can just scrape the relevant data and upload it to the wiki.

saikat commented 8 years ago

Yup -- I would say let's go the wiki route until that doesn't work for us. Using a wiki will also give us a better sense of where the difficulties are, so we know what to focus on if we make some custom software. Can we get on the phone to talk through setting it up with Nina? I'm on Slack as @saikat.
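For illustration only, the idea discussed above of exporting the spreadsheet and loading it into a wiki could look roughly like the sketch below. It assumes the sheet has been exported as CSV and that the wiki accepts MediaWiki-style table markup; the thread does not say which wiki would be used, and the column names and sample row are made up.

```python
import csv
import io


def csv_to_wikitable(csv_text: str) -> str:
    """Convert CSV text (first row is the header) into MediaWiki table markup.

    Assumption: the target wiki understands MediaWiki table syntax.
    Cells containing "|" are not escaped in this sketch.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    lines = ['{| class="wikitable sortable"']
    lines.append("! " + " !! ".join(header))
    for row in body:
        lines.append("|-")  # row separator
        lines.append("| " + " || ".join(row))
    lines.append("|}")
    return "\n".join(lines)


# Hypothetical export; the real spreadsheet's columns may differ.
SAMPLE = (
    "Name,Issue,Position,Notes\n"
    "Jane Doe,Campaign finance,Unclear,Voted both ways\n"
)

if __name__ == "__main__":
    print(csv_to_wikitable(SAMPLE))
```

The resulting markup would still have to be pasted or uploaded to a page by hand or through whatever interface the chosen wiki provides.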

AlaskanOracle commented 8 years ago

Drupal might be a solution (http://www.drupal.org). It also ties in well with CiviCRM (http://www.civicrm.org), and you can add wiki-like features to it. :) With its fieldable content and Views, there is a lot of potential.

saikat commented 8 years ago

We use Nationbuilder more as our CRM than our CMS.


kevin-vilbig commented 8 years ago

Malicious editing isn't an issue as long as we keep track of the edits within the system and can roll back. Wikis can do this. Google Docs can also do this.

But...

As long as we organize our spreadsheet so that it is explicitly tabular, rather than split across multiple tables, it will be simple to iterate over it with standard stats/analytics tools, including tagging for keywords! But since you are already splitting things into multiple tables on that spreadsheet you posted because of technical issues (Google Docs rendering too slowly on large tables), maybe we should think about another solution? Nationbuilder has an API. It will take some effort to learn, but that may be better than building a bespoke DB system.

Spreadsheets can only go so far before they become a liability rather than a benefit.
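To make the "explicitly tabular" point concrete, here is a minimal sketch assuming the sheet is kept as a single table with one row per politician and issue, exported as CSV. The column names, the semicolon-separated `tags` column, and the sample rows are all hypothetical and not taken from the actual spreadsheet.

```python
import csv
import io
from collections import defaultdict

# Hypothetical single-table export: one row per (politician, issue).
SAMPLE_CSV = """politician,issue,position,tags,notes
Jane Doe,Campaign finance,Unclear,corruption;finance,Statements conflict with votes
John Roe,Healthcare,Support,,Co-sponsored a bill
Jane Doe,Trade,Doesn't Support,scandal,See notes from local press
"""


def load_rows(csv_text):
    """Parse the CSV and turn the 'tags' column into a set of lowercase keywords."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["tags"] = {t.strip().lower() for t in row["tags"].split(";") if t.strip()}
        rows.append(row)
    return rows


def index_by_tag(rows):
    """Map each keyword to the rows that carry it, so rows can be looked up by tag."""
    index = defaultdict(list)
    for row in rows:
        for tag in row["tags"]:
            index[tag].append(row)
    return index


if __name__ == "__main__":
    by_tag = index_by_tag(load_rows(SAMPLE_CSV))
    for row in by_tag["corruption"]:
        print(row["politician"], "-", row["issue"])
```

Because every row has the same columns, the same loop works no matter how many politicians or issues are added, which is what breaks down once the data is spread over several differently shaped tables.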

saikat commented 8 years ago

Google Sheets is able to handle tens of thousands of rows pretty easily. I'd start with Sheets just so we can play around with different ideas and models before putting them into a database. Once we start making SQL models, it'll be harder to stay flexible.

On Thu, Jun 9, 2016 at 8:52 PM Kevin Foobar notifications@github.com wrote:

Malicious editing isn't an issue as long as we keep track of the edits within the system and can rollback. Wikis can do this. Google docs can also do this.

But...

As long as we organize our spreadsheet in such a way that it is explicitly tabular, rather than having multiple tables, it will be simple to iterate over it with standard stats analytics tools, including tagging for keywords! Maybe, since you are already splitting things up to multiple tables because of technical issues (google docs rendering too slowly on large tables) on that spreadsheet you posted, we should think about some kind of SQL database?


saikat commented 8 years ago

@dcampana I think what might make sense here (and I'm not sure if this is something you are already working on) is to create a longish-term plan for the research team covering what you will be collecting and how. This will probably need to be done in conjunction with the larger plan that Brand New Congress needs to make for the group. It'll help us plan a bit for the tech needs.

sgrimsley commented 8 years ago

I was looking at the research team application and found that some of the data can be acquired in bulk through the FEC or GovTrack.us (candidate info, financial data, votes, committee membership).

I know how I'd approach the data collection / interface issue for myself, but I'm not sure whether that would scale for BNC.