welfare-state-analytics / riksdagen-corpus

Swedish parliamentary proceedings - Riksdagens protokoll 1867-today

Wikidata same as #95

Closed salgo60 closed 1 year ago

salgo60 commented 2 years ago

I am trying to match corpus/members_of_parliament.csv with Wikidata.

When I look at the file corpus/members_of_parliament.csv in OpenRefine, my understanding is that you have one item per person and parliamentary period.

Question A: Do you have a file where you have "merged" those items so that there is one id per person? E.g. the ids below look like the same person as Wikidata Q5573455, sv:Wikipedia Karl_Bergström.

Question B: Is your modeling documented? In Wikidata we see challenges when a person changes party, etc.

image
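
To make Question A concrete, here is a minimal sketch of what "merging" per-period rows into one key per person could look like. The column names (`id`, `name`, `born`) and the sample rows are made up for illustration and are not taken from members_of_parliament.csv; grouping on name plus birth year is only a heuristic and would need manual review.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: the real column names and values in
# corpus/members_of_parliament.csv may differ.
SAMPLE = """id,name,born,party,start,end
row-1,Person A,1880,Party X,1921,1924
row-2,Person A,1880,Party X,1925,1928
row-3,Person B,1890,Party Y,1921,1924
"""


def group_rows_by_person(rows, key_fields=("name", "born")):
    """Collect the per-period row ids that share the same person key.

    Name plus birth year is a heuristic only: two namesakes born in the
    same year would be merged wrongly.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[f].strip().lower() for f in key_fields)
        groups[key].append(row["id"])
    return dict(groups)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
merged = group_rows_by_person(rows)
for person_key, row_ids in merged.items():
    print(person_key, row_ids)
```

Each resulting group could then be matched once against Wikidata instead of once per parliamentary period.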

MansMeg commented 2 years ago

Hi!

We have an issue on converting this data to a better (normalized) form, although it has not been the main priority yet. I think we are quite open to finding a good format for this. Right now it is just a CSV file.

That said, I'm no expert on what the best format would be.

What is important about the format (as I see it):

  1. Having an individual id per person
  2. Enabling additional metadata, such as gender, party, etc., while allowing it to change over time
  3. Having different metadata for different time periods
  4. A way to judge the quality of different metadata
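
One possible way to sketch these four requirements is below. It is not an agreed format: the class names, the `source` and `confidence` fields, and the example ids and party names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Attribute:
    """One metadata value, valid for a period, with a quality note."""
    name: str                      # e.g. "party" or "gender"
    value: str
    start: Optional[date] = None   # None means open-ended
    end: Optional[date] = None
    source: str = "unknown"        # where the value came from
    confidence: float = 1.0        # hypothetical 0..1 quality score

    def valid_on(self, day: date) -> bool:
        after_start = self.start is None or self.start <= day
        before_end = self.end is None or day <= self.end
        return after_start and before_end


@dataclass
class Person:
    person_id: str                 # one id per person (requirement 1)
    attributes: List[Attribute] = field(default_factory=list)

    def get(self, name: str, day: date) -> Optional[Attribute]:
        """Return the most trusted value of `name` valid on `day`, if any."""
        candidates = [a for a in self.attributes
                      if a.name == name and a.valid_on(day)]
        return max(candidates, key=lambda a: a.confidence, default=None)


p = Person("person-0001", [
    Attribute("party", "Party X", date(1921, 1, 1), date(1924, 12, 31),
              source="csv", confidence=0.9),
    Attribute("party", "Party Y", date(1925, 1, 1), None,
              source="wikidata", confidence=0.7),
])
print(p.get("party", date(1926, 5, 1)).value)
```

Keeping each value as its own dated record means a party change is just a second `Attribute`, and the quality information travels with the value it describes.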

salgo60 commented 2 years ago

OK, we can talk about that when we do the Wikidata demo. I will try to walk through your list and check that every line in your data is matched to a person in Wikidata, i.e. that it has a Wikidata Q number.

The good thing about getting external identifiers (owl:sameAs) into your data is that it puts your data on level 5 on 5stardata.info.
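
As a rough illustration of such a link, the sketch below writes one owl:sameAs statement in N-Triples form. The corpus base URI and the local id `person-0001` are placeholders I made up; only the owl:sameAs property and the Wikidata entity URI pattern are real.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"
# Hypothetical base URI: the corpus does not define one in this thread.
CORPUS_BASE = "https://example.org/riksdagen-corpus/person/"


def same_as_triple(local_id: str, qid: str) -> str:
    """Return an N-Triples line linking a corpus person to a Wikidata item."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata Q number: {qid!r}")
    return f"<{CORPUS_BASE}{local_id}> <{OWL_SAME_AS}> <{WIKIDATA_ENTITY}{qid}> ."


triple = same_as_triple("person-0001", "Q5573455")
print(triple)
```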

If I run a SPARQL query in Wikidata for people with position held (P39) set to member of the Swedish Riksdag (Q10655178), showing external properties --> SPARQL -->

image

we can see that (can be errors in Wikidata)

I guess you could structure your data the way Litteraturbanken has done in its API

image

My understanding is that they take all the data from Wikidata -->

  1. The Wikidata Q number in "wikidata_id" = Q7724
  2. "sbl_link" is what we call P3217; for Strindberg it is 34518, i.e. the key to SBL, "Riksarkivet Svenskt Biografiskt Lexikon"

Example searches for people in the Swedish parliament on Wikidata:

  1. with SBL P3217: query (same query but linking to sv:Wikipedia)
  2. with P8388 "Swedish Parliament person GUID": query

image
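
For readers without the linked queries, a search of this kind could look roughly like the sketch below. It is not the exact query behind the links above; the variable names, the `LIMIT`, and the User-Agent string are my own choices. It uses the public Wikidata Query Service endpoint and only returns "truthy" P39 statements via `wdt:`.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# People with position held (P39) = member of the Swedish Riksdag (Q10655178),
# plus their SBL id (P3217) and Swedish Parliament person GUID (P8388) if set.
QUERY = """
SELECT ?person ?personLabel ?sbl ?guid WHERE {
  ?person wdt:P39 wd:Q10655178 .
  OPTIONAL { ?person wdt:P3217 ?sbl . }
  OPTIONAL { ?person wdt:P8388 ?guid . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "sv,en". }
}
LIMIT 50
"""


def build_request(query: str) -> urllib.request.Request:
    """Build a GET request for the query, asking for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return urllib.request.Request(
        ENDPOINT + "?" + params,
        headers={
            # Illustrative User-Agent; use one that identifies your own tool.
            "User-Agent": "riksdagen-corpus-matching-sketch/0.1",
            "Accept": "application/sparql-results+json",
        },
    )


def run_query(query: str) -> list:
    """Send the query (network call) and return the result bindings."""
    with urllib.request.urlopen(build_request(query), timeout=60) as resp:
        return json.load(resp)["results"]["bindings"]


# Usage (performs a network call, so it is left commented out):
# for row in run_query(QUERY):
#     print(row["person"]["value"], row.get("sbl", {}).get("value"))
```

Rows where `sbl` or `guid` is missing show which members still lack that external identifier in Wikidata.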

Let's talk more!!!

salgo60 commented 2 years ago

FYI: I created a repository for the matching activity, see salgo60/Wikidata_riksdagen-corpus (will take 1-2 weeks).

salgo60 commented 2 years ago

Wikidata related

Match with Wikidata

Question: What data would you like from Wikidata?

BobBorges commented 1 year ago

Implemented? pls close

MansMeg commented 1 year ago

Yes. This can be closed now.