mysociety / pombola

ZA: A method to import new constituency data is needed #1311

Open geoffkilpin opened 10 years ago

geoffkilpin commented 10 years ago

We currently expect to receive new constituency lists from parties once a quarter (this year, probably sometime after the elections in May). However, it won't be obvious from these lists which details have changed. Ideally, it should be possible to import the data and have changes detected and reflected on the site, including historical data.

It may also be worth examining whether the current CSV format is the best one for importing data. Perhaps a more structured JSON format would be preferable, so that more of the processing can occur offline?

mhl commented 10 years ago

Hi @geoffkilpin - I think the ideal replacement for the CSV format would be to generate the office details in Popolo JSON format, the same format that @dracos used for importing the South African politician data in the first place.

So there are various extents to which this could be taken:

  1. Producing a Popolo JSON file with equivalent data to the constituency data CSV file
  2. Updates could be made to the south-africa-popolo.json file (now rather out-of-date, I'm afraid) so that there's one source for all the people and organisation data.
  3. The PopIt admin interface (essentially a nice web-based interface for data in the Popolo data model) could be used to make updates to the data in either variant 1 or 2; perhaps by the parties themselves?
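
As a rough illustration of option 1, here is a minimal Python sketch that turns office rows from a CSV into Popolo-style `organizations` and `memberships`. The column names (`party`, `office`, `phone`, `person`, `role`), the ID scheme and the `constituency-office` classification are all assumptions made for the example; they are not taken from the real constituency CSV or from south-africa-popolo.json.

```python
import csv
import io
import json
import re


def slugify(text):
    """Lower-case text and join its alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def offices_to_popolo(csv_text):
    """Convert office rows (hypothetical columns) into Popolo-style dicts."""
    organizations, memberships = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Hypothetical ID scheme: party name plus office name, slugified.
        org_id = "office/" + slugify(row["party"] + " " + row["office"])
        contact_details = []
        if row["phone"]:
            contact_details.append({"type": "voice", "value": row["phone"]})
        organizations.append({
            "id": org_id,
            "name": row["office"],
            "classification": "constituency-office",  # assumed label
            "parent_id": "party/" + slugify(row["party"]),
            "contact_details": contact_details,
        })
        memberships.append({
            "person_id": "person/" + slugify(row["person"]),
            "organization_id": org_id,
            "role": row["role"],
        })
    return {"organizations": organizations, "memberships": memberships}


# Made-up sample row, only to show the shape of the output.
SAMPLE = (
    "party,office,phone,person,role\n"
    "Example Party,Sample Town Office,011 555 0100,Jane Doe,Constituency Contact\n"
)
print(json.dumps(offices_to_popolo(SAMPLE), indent=2))
```

Keeping the conversion as a standalone step like this would let the CSV-to-JSON processing happen offline, with the site only ever importing the JSON.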

geoffkilpin commented 10 years ago

Thanks @mhl - using that format would make sense.

Regarding the different options:

  1. I think this makes the most sense, for the reasons below.
  2. Updating the south-africa-popolo.json file might be a bit tricky. As I understand things, updates to contact details, MP lists, and similar data are currently being made in the Pombola admin interface, so they aren't reflected in this file. I'm not sure how simple it would be to keep the two in sync, or whether it is even necessary.
  3. I don't think we can expect parties to make the updates; getting the information from them has been tricky enough. The issue with using the interface ourselves is that, I assume (I haven't had a chance to try it out), it would mean changing specific membership data individually. Given how difficult it is to determine how the lists from parties have changed, importing the entire list with automatic change detection would be a simpler and more reliable approach.
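
To make "import of the entire list with automatic change detection" a bit more concrete, here is a minimal Python sketch. It assumes memberships are plain dicts with Popolo-style `person_id`, `organization_id`, `role`, `start_date` and `end_date` keys; the function names are hypothetical and this is not how Pombola currently stores or imports the data. Memberships missing from the new list are closed with an end date rather than deleted, so the historical data is kept.

```python
from datetime import date


def membership_key(m):
    """Identify a membership by who holds which role at which office."""
    return (m["person_id"], m["organization_id"], m.get("role"))


def detect_changes(current, imported, as_of):
    """Compare the memberships on the site with a freshly imported full list.

    Returns (to_add, to_end, unchanged). Nothing is deleted: a membership
    that is absent from the import is returned with an end_date instead.
    """
    # Only memberships that are still open can be ended or left unchanged.
    active = {membership_key(m): m for m in current if not m.get("end_date")}
    incoming = {membership_key(m): m for m in imported}

    to_add = [dict(m, start_date=as_of.isoformat())
              for key, m in incoming.items() if key not in active]
    to_end = [dict(m, end_date=as_of.isoformat())
              for key, m in active.items() if key not in incoming]
    unchanged = [m for key, m in active.items() if key in incoming]
    return to_add, to_end, unchanged


# Made-up data: person/a stays, person/b has left, person/d is new.
current = [
    {"person_id": "person/a", "organization_id": "office/x", "role": "Contact"},
    {"person_id": "person/b", "organization_id": "office/x", "role": "Contact"},
]
imported = [
    {"person_id": "person/a", "organization_id": "office/x", "role": "Contact"},
    {"person_id": "person/d", "organization_id": "office/x", "role": "Contact"},
]
to_add, to_end, unchanged = detect_changes(current, imported, date(2014, 6, 1))
```

Matching on the (person, office, role) triple is the main design choice here. If a party's list spells a name differently from one quarter to the next, the sketch would treat it as one person leaving and another arriving, so a real import would need some name-matching step first.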

mhl commented 10 years ago

@geoffkilpin Yeah, I agree it's best just to do 1 for the moment. On the point you raise with regard to 2: indeed, the file is out of date at the moment. One of our upcoming priorities is to fix this by doing issue #1151