mysociety / theyworkforyou

Keeping tabs on the UK's parliaments and assemblies
http://www.theyworkforyou.com/

speaker ID wrong after loading into the database #467

Open mhl opened 10 years ago

mhl commented 10 years ago

This is a bit mysterious to me - we had a report from a user that there's a speech that TWFY is displaying as from Johann Lamont, when in the original it's from John Lamont. This seems to be fine in our XML, but wrong in the live database, so I guess there must be a problem with xml2db, unless one of these has changed since the import. The details are below:

Obviously I could just update the speaker_id in that row, but this seems mysterious enough that it would be worth getting to the bottom of.

dracos commented 10 years ago

This is related to #468 - it said Johann in 8984.xml, but was fixed to John in 8985.xml (and 8997.xml). The problem is in how the site deals, or should deal, with multiple XML files for the same data. 8984 is what is loaded, so the database does not hold the latest version, and presumably 8984 should not be loaded once 8985/8997 have appeared.
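To make the "only load the latest version" idea concrete, here is a minimal sketch in Python. It is not how xml2db works. It assumes that files are named `<number>.xml`, that a higher number means a later version of the same data, and that the caller already knows which day each file covers. The function name `latest_file_per_day` and the `"some-day"` key are hypothetical.

```python
import re


def latest_file_per_day(files):
    """Pick one file per day, keeping the highest-numbered one.

    `files` is an iterable of (day, filename) pairs. Assumption: filenames
    look like "8984.xml" and a larger number is a newer version.
    """
    best = {}
    for day, name in files:
        match = re.fullmatch(r"(\d+)\.xml", name)
        if match is None:
            continue  # ignore names that do not fit the assumed pattern
        number = int(match.group(1))
        # Keep this file only if it is newer than what we have for the day.
        if day not in best or number > best[day][0]:
            best[day] = (number, name)
    return {day: name for day, (number, name) in best.items()}


# Hypothetical input: three versions of the same day's data.
candidates = [
    ("some-day", "8984.xml"),
    ("some-day", "8985.xml"),
    ("some-day", "8997.xml"),
]
chosen = latest_file_per_day(candidates)
```

Under those assumptions, `chosen["some-day"]` is `8997.xml`, so the stale 8984 version would be skipped rather than loaded.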

MyfanwyNixon commented 10 years ago

This report from a user may be related: "In the course of doing some research on the Law Lords I spotted what I think is an attribution error in your database. Two speeches in the Lords are attributed to Lord (Brian) Hutton that I think from the context and dates must actually have been made by Lord (John) Hutton of Furness. They are the top two, from 2012 and 2013, that appear under the following search for speeches by Lord (Brian) Hutton.

http://www.theyworkforyou.com/search/?pid=13486&pop=1"

dracos commented 10 years ago

This ticket is to do with Scotland and multiple XML files for a particular day, I'm afraid, so that report is a separate problem. It arises because Hansard have referred to "Lord Hutton of Furness" incorrectly as "Lord Hutton", and we can only take what they say as correct. Note that they link the wrong Lord Hutton on the official site, e.g. http://www.publications.parliament.uk/pa/ld201314/ldhansrd/ldallfiles/peers/lord_hansard_1097_od.html - the mistake should be reported to Hansard.