jacquerie / senato.py

A scraper for the data made available by the Italian Senate, and a cluster analysis to detect similar amendments.
MIT License

[DISCUSSION] Going forward: where to go with this? #2

Open jacquerie opened 8 years ago

jacquerie commented 8 years ago

Thanks to everyone who starred/watched/forked! I hope this becomes the basis of a small community focused on applying their coding skills to the public good. Now that the media storm has begun to subside, I finally have a little more time to put into this project : )

  1. As the name senato.py suggests, my original aim was to build a SPARQL wrapper (plus a crawler where needed) to fetch data from the Senate's website, in order to run analyses such as the one you saw. There's already some previous work on this topic by @verganis: https://github.com/verganis/parlamento_fetch. I'm not sure whether this is what OpenParlamento is using right now, but whatever they are running might have some bugs: for example, on DDL S.2081 they report only 3304 amendments, roughly half of the actual number. I'd love to open a discussion with them in order to build or improve a tool they can reuse. Is anybody here in contact with them?
  2. Back to the analysis part, I'd love to brainstorm about possible future topics. There's already some work in this space, for example these R scripts/visualizations by @briatte: https://github.com/briatte/parlamento, and I'm sure there's much more to be (re)discovered. Please comment here with ideas, your unfinished projects, or finished projects that you think could be improved.
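To make the "SPARQL wrapper" idea in point 1 a bit more concrete, here is a minimal sketch using only the Python standard library. It is not the actual senato.py code: the endpoint URL is an assumption that should be checked against the Senate's open-data site, and `build_request`, `flatten`, and `run_query` are hypothetical helper names introduced for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: illustrative endpoint URL, to be verified before use.
ENDPOINT = "http://dati.senato.it/sparql"


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that asks for results in SPARQL JSON format."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def flatten(results):
    """Turn a SPARQL JSON results document into a list of plain dicts."""
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in results["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and return the flattened rows (performs network I/O)."""
    with urlopen(build_request(query, endpoint), timeout=timeout) as response:
        return flatten(json.load(response))
```

Keeping request building and result parsing separate from the network call means both can be checked offline, which helps when comparing counts (such as the number of amendments on a bill) across different tools.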

Finally, let me state that comments in Italian, English, or any combination of the two languages are very welcome : )

thebabush commented 8 years ago

So this is my quick'n'dirty graph viz about party switching in the Camera dei Deputati: http://kenoph.github.io/silly-potty/

I also have a Senato version in my repository but I didn't publish it on GitHub Pages because it's missing some features.

As a final note, I would like to point out that the Senato's SPARQL endpoint runs a SPARQL 1.0 implementation, while the Camera's runs SPARQL 1.1, which makes it easier to write complex queries because it supports aggregate functions.
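A rough illustration of what that difference means in practice: on a SPARQL 1.1 endpoint the counting can be done in the query itself, while on a SPARQL 1.0 endpoint the rows have to be fetched and counted client-side. The predicate URI and variable names below are hypothetical placeholders, not the real Senato or Camera vocabulary, and `count_by` is a helper name made up for this sketch.

```python
from collections import Counter

# SPARQL 1.1 only: COUNT and GROUP BY are evaluated by the endpoint.
# Hypothetical predicate, for illustration.
AGGREGATE_QUERY = """
SELECT ?group (COUNT(?member) AS ?members)
WHERE { ?member <http://example.org/memberOf> ?group }
GROUP BY ?group
"""

# SPARQL 1.0 fallback: fetch the raw pairs and aggregate in Python.
PLAIN_QUERY = """
SELECT ?group ?member
WHERE { ?member <http://example.org/memberOf> ?group }
"""


def count_by(rows, key):
    """Client-side stand-in for COUNT(...) GROUP BY ?key.

    `rows` is a list of dicts, one per result row, mapping
    variable names to their values.
    """
    return Counter(row[key] for row in rows)


# Made-up rows shaped like the output of PLAIN_QUERY.
rows = [
    {"group": "A", "member": "m1"},
    {"group": "A", "member": "m2"},
    {"group": "B", "member": "m3"},
]
members_per_group = count_by(rows, "group")
```

The client-side version transfers every row over the network, so it can be noticeably slower on large result sets than letting a 1.1 endpoint aggregate.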