mysociety / parlparse

The scraper/parser that produces data for TheyWorkForYou, PublicWhip, etc

WIP bring lords data up to date #113

Closed andylolz closed 3 years ago

andylolz commented 4 years ago

Refs #112.

Still todo:

dracos commented 4 years ago

Hi @andylolz Thanks for this.

It's all a bit of a hodge-podge, as I'm sure you can see. We have a script that looks for new lords, scripts/add-new-lords, a variety of manual scripts (e.g. json-end-membership), and some scripts of uncertain currency in scripts/datadotparl/ (e.g. one-off-sync-lord-parties). I'm happy to accept a data-only update, of course, as none of those scripts would overwrite any of it. But if a new script could be added, or an existing one improved, to keep this more in sync by spotting deaths and retirements of Lords, similar to the add-new-lords one, that would be ideal. I don't know how you've generated this data update, of course :) This ties in with #41.

andylolz commented 4 years ago

Thanks Matthew!

Okay cool, sounds good. I’ll try updating existing scripts, in order to make the data update reproducible.

I’m only working on this occasionally, so it might be a while before this PR is ready for review.

andylolz commented 4 years ago

Hi @dracos,

Sorry to bother you about this again!

I took a look at some of the scripts you mentioned (really helpful, btw!) and made a start on scripts/datadotparl/one-off-sync-lord-parties before I revisit deaths and retirements (which I think I’ll add a new script for).

I have a question about the rules around amending existing memberships. It seems like current scripts are happy to add new memberships from datadotparl, but aren’t always keen to amend existing memberships. See for example: https://github.com/mysociety/parlparse/blob/a980586d41898f2d5f7442647439c8d762824abf/scripts/datadotparl/one-off-sync-lord-parties#L68-L70 Presumably this is because changes to memberships in parlparse can have adverse consequences elsewhere? Or because data from datadotparl is capricious?

Updating memberships to exactly match those on datadotparl is probably the easiest thing to do, but I am guessing there’s reason to avoid this? Any advice you can offer on this is much appreciated.

Thanks!

dracos commented 4 years ago

No bother at all, thanks for looking at this!

Not a helpful comment, really, is it? Thanks, past me :-/ I guess this might have been due to something historic where we updated quicker than them, or some other reason like that. But I doubt that's true nowadays, and being the same as the official site is probably a better position to be in anyway; we can ask them to fix it if we think it's wrong.

Perhaps it was for cases where they covered a different time period to us, given we have one membership entry per party, if you see what I mean. But if we and they both only have one membership for the entire period, I don't see why we wouldn't want to match, no.
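
The rule suggested above (match the official data only when each side has a single membership for the whole period) could be sketched roughly as follows. This is an illustration only, not the logic in one-off-sync-lord-parties: the function name `reconcile_party` and the field names `party`, `start_date` and `end_date` are hypothetical and are not taken from the parlparse data files.

```python
from typing import Dict, List, Optional


def reconcile_party(ours: List[Dict], theirs: List[Dict]) -> Optional[Dict]:
    """Return an updated copy of our membership, or None to leave it alone.

    Hypothetical rule: only overwrite when both sides hold exactly one
    membership for the person, so the periods are directly comparable.
    """
    if len(ours) != 1 or len(theirs) != 1:
        # Periods may not line up one-to-one; leave for manual review.
        return None
    updated = dict(ours[0])  # copy, so the caller's data is not mutated
    for field in ("party", "start_date", "end_date"):
        if field in theirs[0]:
            updated[field] = theirs[0][field]
    return updated
```

Returning `None` in the ambiguous case keeps the cautious behaviour of the existing scripts for anyone whose history is split differently on the two sides.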

andylolz commented 4 years ago

Apologies for the delayed response!

Thanks Matthew – that’s really helpful. I’ll try and make a bit more progress on this this week.

dracos commented 4 years ago

No need to apologise :) Just pushed a few changes for some new Lords we've needed to get things parsing, sorry if that conflicts with anything here.

dracos commented 3 years ago

Hi - I have taken some of the commits from this PR, and hopefully all end dates and parties are now more in sync. It may not be totally right, as sync parties ignores people where our current party matches Parl's current party; that means we are currently right but might have missed historical changes. All current/most recent entries should now match, though. I think we can consider this done :)
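
To illustrate the caveat above, here is a minimal sketch of a "skip when the current party matches" check. It is not the actual sync code: `needs_party_sync` and the `party` / `start_date` fields are hypothetical names, and dates are assumed to be ISO-formatted strings so that they sort correctly as text.

```python
from typing import Dict, List


def latest(memberships: List[Dict]) -> Dict:
    """Most recent membership, judged by ISO-formatted start_date."""
    return max(memberships, key=lambda m: m["start_date"])


def needs_party_sync(ours: List[Dict], parl: List[Dict]) -> bool:
    """True only when the most recent parties differ (hypothetical check)."""
    return latest(ours)["party"] != latest(parl)["party"]


# We hold one long Labour membership; Parl records a spell as a Crossbencher.
ours = [{"party": "Labour", "start_date": "2000-01-01"}]
parl = [
    {"party": "Labour", "start_date": "2000-01-01"},
    {"party": "Crossbench", "start_date": "2005-06-01"},
    {"party": "Labour", "start_date": "2010-03-01"},
]
skipped = not needs_party_sync(ours, parl)
```

In this example the person would be skipped because both sides currently say Labour, even though the Crossbench period is missing from our data.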

andylolz commented 3 years ago

Brilliant – thanks so much for picking this up, @dracos!