MansMeg closed this 4 months ago
Do you want to add a unit test well before we can make it pass? Code for this exists in script form...
An update on the MP frequency issue: I revised the code to handle dates in different formats better. In essence:
if start <= day < end: then MP in parliament on day
if startyear <= dayyear <= endyear: then MP in parliament
The results are closer to what we want: 98.03% within the 10% tolerance. In this case, unlike the last time around, those parliament days that fail this test fail because there are too many MPs. I'm not sure yet how to evaluate which of these strategies is closer to the truth of our coverage, but if we trust the manual work that has been done with the bio books and Wikidata, I suppose (and hope) this most recent iteration is better than the last.
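For anyone following along, the two rules above could be sketched in Python roughly like this. This is only an illustration, not the actual script: the function name `mp_in_parliament`, the string formats ("YYYY-MM-DD" for exact dates, "YYYY" for year-only entries), and the choice to fall back to the year comparison whenever either bound is year-only are all assumptions.

```python
from datetime import date


def is_year_only(value: str) -> bool:
    """Assumed convention: a bare 'YYYY' string means only the year is known."""
    return len(value) == 4


def mp_in_parliament(start: str, end: str, day: date) -> bool:
    """Return True if an MP with mandate [start, end) counts as sitting on `day`."""
    if is_year_only(start) or is_year_only(end):
        # Only years are known for at least one bound: compare years, inclusive.
        return int(start[:4]) <= day.year <= int(end[:4])
    # Exact dates known: start inclusive, end exclusive.
    return date.fromisoformat(start) <= day < date.fromisoformat(end)
```

The inclusive year comparison is the looser of the two rules, so it may count an MP for a whole year even if the mandate covered only part of it, which would be consistent with the overages showing up where mandate dates are less specific.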
Failing parliaments (in decreasing severity) are
These earlier years tend to be the ones where we have less specific info on MPs' mandate periods -- @fredrik1984 @Lottabrorson, do either of you know off the top of your heads if there was a lot of turnover between a and b or lagtima and urtima? That might explain some of the overages.
I think we should try this again when we settle on a list of 'normal' start and end dates for parliament. #356
Ok, this looks like an improvement! And I am sure the graph will look even better after we have done the manual work with the bio books and Wikidata! Great work @BobBorges!
Yes. I guess we can wait with this until @fredrik1984 and the others are done with their pass through the bio books?
Yes, let's wait for that. I will probably be done with bio book 2 later this week. It might take another week or so for Mattias (bio book 3–4) and @Lottabrorsson (bio book 5). Going through bio book 1–2, I have added several specific dates, so this work will most likely improve the MPs/protocol graph.
As we go through the bio books, we also look up MPs that have more than one party affiliation and check whether these parties have been added on Wikidata #359. I must say that Sälgö has done a good job in adding parties to MPs in the 19th century! Doing this and fixing the list of MPs with no parties (#349) will improve the MP database a lot!
I think #355 is relevant here as well.
We discussed how to set up a unit test for MP quality that is good enough. We settled on testing the following.
For every date available in the corpus (protocol dates), we check that the database contains at least 90% of the true number of MPs.
This should be implemented as a unit test, and then we will try to focus on where this is not the case.
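A rough sketch of what such a test might look like, under some assumptions: `coverage_failures` is a hypothetical helper, `observed` and `expected` are assumed to be dicts mapping protocol dates to MP counts (from the database and from the "true" seat numbers respectively), and the numbers in the test case are made up for illustration.

```python
import unittest
from datetime import date


def coverage_failures(observed, expected, min_ratio=0.9):
    """Return (date, ratio) pairs where the MP count falls below min_ratio of expected."""
    failures = []
    for day in sorted(observed):
        ratio = observed[day] / expected[day]
        if ratio < min_ratio:
            failures.append((day, ratio))
    return failures


class TestMPCoverage(unittest.TestCase):
    def test_at_least_90_percent_of_mps(self):
        # Toy data only; a real test would load counts from the corpus and database.
        observed = {date(1925, 3, 1): 95, date(1925, 3, 2): 91}
        expected = {date(1925, 3, 1): 100, date(1925, 3, 2): 100}
        self.assertEqual(coverage_failures(observed, expected), [])
```

Returning the failing dates with their ratios, rather than a bare pass/fail, should make it easier to focus on where the 90% threshold is not met. The class can be run with `python -m unittest`.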