welfare-state-analytics / riksdagen-corpus

Swedish parliamentary proceedings - Riksdagens protokoll 1867-today

Adding a metadata file: start-end dates of riksdag sessions #356

Closed BobBorges closed 9 months ago

BobBorges commented 12 months ago

The file was curated from the Wikipedia article Lista över svenska riksdagar.

Columns:

We know this list from Wikipedia isn't perfect, so it's open here as a PR for full scrutiny.

For example, @Lottabrorsson wrote in email:

""" The change for the Riksmöte to being the whole year, autumn to autumn without a break during the summer, took place already in 1995. Attached is a comment regarding Riksdagsordningen, that this concerns.

And for example, year 1998 – the last meeting before the summer was June 25, not June 10, although now it is wrong to say that it ended before the summer.

And I think that is because the last protocol on the web before the summer is from June 10. (And that means that the protocol 1997/98:123 is missing.) """

RO_kap1_kommentar.pdf

I also compared this list to the actual dates indicated in the protocol metadata, which reveals, e.g., that the 201718 parliament year probably has an incorrect end date. I'm attaching a json file with the results of this comparison (again, GitHub doesn't like json files, so just remove the .txt extension). It is sorted by the number of protocols that have a docDate in the metadata outside the range of dates in the proposed metadata list.

oor.json.txt

Note that a date out of range isn't necessarily a problem with the list. For example, in prot-1914-b-ak--27.xml, in addition to the actual date of the document, our find_dates.py script picked up an out-of-range date that was quoted in the protocol and added it to the protocol metadata. In this case, the out-of-range date indicates a potential problem with the protocol rather than with the list of dates. (Do we want to remove such dates?)

riksdag_start-end_quoted-date

riksdag_start-end_quoted-date-prot
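The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the script that produced oor.json: the function names `dates_out_of_range` and `rank_sessions`, the dictionary layout, and all dates below are assumptions made up for the example.

```python
from datetime import date


def dates_out_of_range(doc_dates, start, end):
    """Return the dates that fall outside the inclusive range [start, end]."""
    return [d for d in doc_dates if d < start or d > end]


def rank_sessions(protocol_dates, sessions):
    """Count protocols with at least one out-of-range date, per session.

    protocol_dates: {session_key: {protocol_name: [date, ...]}}
    sessions:       {session_key: (start, end)}
    Returns (session_key, count) pairs, highest count first.
    """
    counts = {}
    for key, protocols in protocol_dates.items():
        start, end = sessions[key]
        counts[key] = sum(
            1 for dates in protocols.values()
            if dates_out_of_range(dates, start, end)
        )
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)


# Made-up values for illustration only; not the curated session dates.
sessions = {
    "A": (date(1900, 1, 15), date(1900, 5, 20)),
    "B": (date(1901, 1, 15), date(1901, 5, 20)),
}
protocol_dates = {
    "A": {
        "prot-a-1": [date(1900, 2, 1)],
        # Second date mimics a date quoted inside the protocol text.
        "prot-a-2": [date(1900, 3, 1), date(1899, 12, 1)],
    },
    "B": {"prot-b-1": [date(1901, 2, 1)]},
}
ranking = rank_sessions(protocol_dates, sessions)
print(ranking)
```

A protocol flagged this way still needs a manual look, since a quoted date (as in the 1914 example) and a wrong session boundary produce the same signal.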

fredrik1984 commented 12 months ago

Ok, great! @Lottabrorsson, if you have any questions about this work, I suggest you ask Bob.

BobBorges commented 12 months ago

#342

MansMeg commented 12 months ago

Nice, I would order the file according to the dates, i.e. the latest Riksdag as the first value. Then it will be easier to check the file as well.

MansMeg commented 11 months ago

We are waiting here for @Lottabrorsson to check that the dates in the file are correct.

fredrik1984 commented 11 months ago

@Lottabrorsson I found this table in Stjernquist 1966 vol. 4 – all start/end dates for riksdag meeting years between 1933 and 1965

[Attached image: table of riksdag start/end dates, Stjernquist 1966 vol. 4]

fredrik1984 commented 10 months ago

@Lottabrorsson has made a great list of start/end dates of all riksmöten since 1867. See the list here:

Riksmöten_def.xlsx

BobBorges commented 10 months ago

Excellent, and thanks for posting the file here @fredrik1984. I'll compare Lotta's work with what we came up with from the protocols and give some signal here when it's time to merge.

fredrik1984 commented 10 months ago

#288 and probably related to #254 too

BobBorges commented 10 months ago

I've been looking carefully at @Lottabrorsson's file, and there are some points to discuss:

Lottabrorsson commented 10 months ago

> I've been looking carefully at @Lottabrorsson's file, and there are some points to discuss:
>
>   • There are differences between the two lists. Most of them are a matter of a day or two, but others are more severe -- e.g. for 2006--2009 the dates are months off, and for 1947 half a year off. Should we double check these? lotta-wiki_discrepancies.csv
>   • In our protocols and the wiki list we have a 1980 urtima session and a 1919 lagtima/urtima, but they are not on Lotta's list.
>   • Just eyeballing it, Lotta's dates seem more restrictive (start later, end earlier), which has consequences if we test all protocols against the lists -- of 18,000 dates, 820 are out of range according to Lotta and 683 out of range according to Wikipedia (N.b.: there are valid reasons for a date to be out of range, but we might want to look into some of them if that 3--5% is concerning). prot-oor_cf-lotta.json prot-oor_cf-wiki.json

Sorry @BobBorges! I am just now double checking some of the dates. I'll get back to you with some changes. Sometimes there are different dates for the two chambers. And sometimes Stjernquist has a different day compared to the protocols. It's not easy :)

BobBorges commented 10 months ago

> Sorry @BobBorges!

No worries! I know it's tricky :)

Lottabrorsson commented 10 months ago

> Sorry @BobBorges!
>
> No worries! I know it's tricky :)

I'll soon send you an email about this and a new list.

fredrik1984 commented 9 months ago

This also relates to #416

BobBorges commented 9 months ago

Change all rows "fk-ak" into two rows, one with "fk" and one with "ak"

fixed in c0f7a4b
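For readers following along, the requested row split could look something like the sketch below. It is not the code from c0f7a4b; `split_joint_rows` is a hypothetical helper, the column names are taken from the CSV header quoted later in this thread, and the sample rows are placeholders rather than values from the file.

```python
import csv
import io


def split_joint_rows(rows, joint="fk-ak"):
    """Yield rows, replacing each joint-chamber row by one row per chamber."""
    for row in rows:
        if row["chamber"] == joint:
            for chamber in joint.split("-"):
                # Copy the row so both chambers keep the same dates.
                yield {**row, "chamber": chamber}
        else:
            yield row


# Illustrative input; the dates are placeholders, not values from the file.
raw = """parliament_year,specifier,chamber,start,end
1900,,fk-ak,1900-01-15,1900-05-20
1971,,ek,1971-01-11,1971-12-15
"""
reader = csv.DictReader(io.StringIO(raw))
result = list(split_joint_rows(reader))
for row in result:
    print(row)
```

Keeping one row per chamber also leaves room for the cases mentioned earlier where the two chambers have different start or end dates.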

MansMeg commented 9 months ago

Looks like unix dates:

parliament_year,specifier,chamber,start,end
1867,,fk,-12037,-11916
1867,,ak,-12037,-11916

BobBorges commented 9 months ago

> Looks like unix dates:
>
> parliament_year,specifier,chamber,start,end
> 1867,,fk,-12037,-11916
> 1867,,ak,-12037,-11916

c725272
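As a side note on the numeric values quoted above, here is a small sketch for decoding them under two candidate interpretations. Both are assumptions: days relative to the Unix epoch (1970-01-01), and days relative to a spreadsheet-style day zero (1899-12-30, the LibreOffice default), which seems plausible given that the list was shared as an .xlsx file. How the values were actually corrected in c725272 is not shown in this thread, and `from_day_number` is a hypothetical helper.

```python
from datetime import date, timedelta

UNIX_EPOCH = date(1970, 1, 1)
# Assumption: spreadsheet-style day zero, as used by LibreOffice by default.
SPREADSHEET_EPOCH = date(1899, 12, 30)


def from_day_number(n, epoch):
    """Interpret n as a signed number of days relative to epoch."""
    return epoch + timedelta(days=n)


for n in (-12037, -11916):
    print(
        n,
        "unix:", from_day_number(n, UNIX_EPOCH).isoformat(),
        "spreadsheet:", from_day_number(n, SPREADSHEET_EPOCH).isoformat(),
    )
```

Read as days from the Unix epoch the values land in 1937, whereas the spreadsheet reading gives 1867-01-15 and 1867-05-16, which at least matches the parliament_year column.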

MansMeg commented 9 months ago

Is there anywhere we can state that this dataset has been created by experts (Lotta)?

BobBorges commented 9 months ago

readme?

MansMeg commented 9 months ago

Maybe add a source column with a reference and then add a ref in the bibtex?

MansMeg commented 9 months ago

I'll add this as a separate issue, so I'm happy now.