lindsaykatz / hansard-proj

Materials for the Digitization of the Australian Parliamentary Debates (1998-2022).

Inconsistencies in unique ID field "name.id" #7

Open danielcaseyAus opened 5 months ago

danielcaseyAus commented 5 months ago

There are a small number of examples of multiple MPs being given the same name.id. I'm not sure if this is in your code or in the parlinfo data.

Examples I've found so far:

There may be others.
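
One way to look for further cases is to group records by name.id and flag any ID attached to more than one distinct name. The following is only a minimal sketch in Python, not the project's own code: the `name` field, the function name, and the sample rows are assumptions made up for illustration.

```python
from collections import defaultdict


def find_shared_ids(rows, id_key="name.id", name_key="name"):
    """Return {id: sorted names} for every ID attached to more than one distinct name."""
    names_by_id = defaultdict(set)
    for row in rows:
        name_id = (row.get(id_key) or "").strip()
        name = (row.get(name_key) or "").strip()
        if name_id and name:  # skip rows with a missing ID or name
            names_by_id[name_id].add(name)
    return {i: sorted(n) for i, n in names_by_id.items() if len(n) > 1}


# Made-up rows for illustration only; these are not real Hansard records.
sample_rows = [
    {"name": "Member A", "name.id": "AAA"},
    {"name": "Member B", "name.id": "AAA"},
    {"name": "Member C", "name.id": "BBB"},
    {"name": "Member C", "name.id": "BBB"},
]

shared = find_shared_ids(sample_rows)
print(shared)  # {'AAA': ['Member A', 'Member B']}
```

A check like this compares names as exact strings, so spelling or title variants of the same MP would also be flagged and would need a manual look.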

RohanAlexander commented 5 months ago

@danielcaseyAus - Thank you very much. We'll address these.

RohanAlexander commented 5 months ago

@lindsaykatz - I had a look at the XML and in all these cases it looks to me like they are mistakes in the original XML, which we've then propagated.

(That said, I thought we had tests for this, so I'm surprised we didn't pick them up. All four cases are in 1998/99, so maybe there was some tweak then which means the tests missed this.)
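
To confirm whether a clash originates in the XML itself, the same idea can be run directly on the source files. The sketch below is hedged: it assumes each speaker appears in a `talker` element containing `name` and `name.id` children, which may not match every year's XML layout, and the sample fragment is made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up fragment; the element names are an assumption about the source XML.
SAMPLE_XML = """
<debate>
  <talker><name role="metadata">Member A</name><name.id>AAA</name.id></talker>
  <talker><name role="metadata">Member B</name><name.id>AAA</name.id></talker>
  <talker><name role="metadata">Member C</name><name.id>BBB</name.id></talker>
</debate>
"""


def ids_with_multiple_names(xml_text):
    """Map each name.id found in the XML to its names; keep IDs with more than one."""
    root = ET.fromstring(xml_text.strip())
    names_by_id = defaultdict(set)
    for talker in root.iter("talker"):
        name = None
        name_id = None
        for child in talker:
            if child.tag == "name" and name is None:
                name = (child.text or "").strip()  # first name element only
            elif child.tag == "name.id":
                name_id = (child.text or "").strip()
        if name and name_id:
            names_by_id[name_id].add(name)
    return {i: sorted(n) for i, n in names_by_id.items() if len(n) > 1}


xml_conflicts = ids_with_multiple_names(SAMPLE_XML)
print(xml_conflicts)
```

If an ID shows up here with two names, the clash is already present in the XML rather than introduced downstream.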

FB6

Screenshot 2024-06-19 at 1 48 46 PM

WI4

Screenshot 2024-06-19 at 1 51 49 PM

ZD4

Screenshot 2024-06-19 at 1 56 21 PM

YU5

Screenshot 2024-06-19 at 1 36 53 PM

Let's discuss at tomorrow's meeting whether you have the bandwidth to do the following (in order of increasing time commitment):

  1. Fix these particular mistakes.
  2. Revisit the names tests and address any similar mistakes.
  3. Update the full testing suite, deal with any issues, and update to add in 2023.
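
Steps 1 and 2 above could be sketched as a small table of manual corrections plus a test that fails whenever one name.id is shared by two names. This is an illustration under stated assumptions, not the project's actual pipeline: the IDs, names, field names, and helper functions are all hypothetical.

```python
# Hypothetical manual corrections: (wrong name.id, name) -> corrected name.id
CORRECTIONS = {("AAA", "Member B"): "CCC"}


def apply_corrections(rows, corrections):
    """Return copies of rows with known-bad (name.id, name) pairs fixed."""
    fixed = []
    for row in rows:
        key = (row["name.id"], row["name"])
        fixed.append({**row, "name.id": corrections.get(key, row["name.id"])})
    return fixed


def assert_ids_unique(rows):
    """Raise AssertionError if any name.id is attached to two different names."""
    owner_by_id = {}
    for row in rows:
        owner = owner_by_id.setdefault(row["name.id"], row["name"])
        if owner != row["name"]:
            raise AssertionError(
                f"name.id {row['name.id']!r} is shared by {owner!r} and {row['name']!r}"
            )


# Made-up rows for illustration only.
rows = [
    {"name": "Member A", "name.id": "AAA"},
    {"name": "Member B", "name.id": "AAA"},
    {"name": "Member C", "name.id": "BBB"},
]

corrected = apply_corrections(rows, CORRECTIONS)
assert_ids_unique(corrected)  # passes once the correction is applied
```

Keeping the corrections in one table makes each manual fix visible and reviewable, and the uniqueness test can be rerun whenever a new year of data is added.
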

lindsaykatz commented 5 months ago

Thank you for checking these @RohanAlexander, and that sounds good for our chat tomorrow. I recall there being a lot of cases like this that we had to fix manually, where name IDs were attributed to the wrong MP in the XML.

lindsaykatz commented 5 months ago

Hi @danielcaseyAus - thank you again for identifying these issues. I just wanted to let you know that we are working on fixing them (and any others that we catch in the process). We will let you know when that is done.