everypolitician / everypolitician-data

data for national legislatures worldwide
http://everypolitician.org/

Ecuador #644

Closed by tmtmtmtm 9 years ago

tmtmtmtm commented 9 years ago

http://www.asambleanacional.gob.ec/es/pleno-asambleistas

struan commented 9 years ago

I'm looking at this.

struan commented 9 years ago

Scraper at https://morph.io/struan/ecuador_national_assembly_members

Missing gender for a few people and images for one.

The party list doesn't quite match up with the Wikipedia data, but given that the Wikipedia data doesn't match up with itself, I'm not going to worry too much: those pages all have "this page is outdated" at the top.

tmtmtmtm commented 9 years ago

Thanks @struan

I'm not sure if this is just left-over data from multiple runs that you didn't clean out, but there are three different entries for Blanca Azucena Arguello Troya: one with an ID of 145, one with an empty ID, and one with an ID of sites/all/modules/an_asambleistas/img/varias/mystery

You should also add a source line for the page for each person: it looks like the 'blog' link in the 'contacts' cell would be best for that, as it seems to be their official page.

It would be good if you could trim the leading space from the Area names too. (Most of my scrapers add a .tidy method onto String that collapses all whitespace, including non-breaking spaces, and then removes all leading and trailing spaces; I use it pretty much everywhere.)
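A minimal sketch of such a helper, assuming Ruby and assuming it does nothing beyond what is described above (the version in any given scraper may differ):

```ruby
# Sketch of a String#tidy helper: collapse whitespace runs, then trim.
class String
  def tidy
    # [[:space:]] is Unicode-aware in Ruby, so it also matches the
    # non-breaking space (U+00A0) that &nbsp; turns into after parsing.
    gsub(/[[:space:]]+/, ' ').strip
  end
end
```

With this in place, a scraped Area cell such as "  Pichincha\u00A0 " comes out as "Pichincha" after calling .tidy on it.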

struan commented 9 years ago

So it seems there's something screwy with the data: they have two entries for Blanca Azucena Arguello Troya, although one has more or less no data, and the data it does have is identical to the other record. If I skip over that, it turns out the site only has data for 136 members, but there are supposed to be 137. I can't work out whether this is because they are short a member at the moment or because the bad data above is messing things up :(
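One way to skip a near-empty duplicate like this could be sketched as follows. This is only an illustration: the dedupe_by_name helper and the :id, :name and :area keys are hypothetical and not taken from the actual scraper.

```ruby
# Hypothetical helper: for each name, keep only the most complete record.
def dedupe_by_name(records)
  records.group_by { |r| r[:name] }.map do |_name, dupes|
    # Completeness here means the number of non-blank field values.
    dupes.max_by { |r| r.values.count { |v| !v.to_s.strip.empty? } }
  end
end

# Made-up rows shaped like scraper output, for illustration only.
rows = [
  { id: '145', name: 'Blanca Azucena Arguello Troya', area: 'Area A' },
  { id: '',    name: 'Blanca Azucena Arguello Troya', area: '' },
  { id: '12',  name: 'Another Member',                area: 'Area B' }
]
unique = dedupe_by_name(rows)
```

Grouping on name alone would also merge two different people who happen to share a name, so it is probably worth logging every group that contains more than one record rather than dropping the extras silently.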