Wikidata same as - Githubissues

salgo60 commented 2 years ago

I am trying to match corpus/members_of_parliament.csv with Wikidata.

When I open corpus/members_of_parliament.csv in OpenRefine, my understanding is that you have one item per person and parliamentary period.

Question A: Do you have a file where you have "merged" those items so that there is one id per person? For example, the ids below look like the same person as Wikidata Q5573455, sv:Wikipedia Karl_Bergström.

Question B: Is your modeling documented? In Wikidata we see challenges when a person changes party, etc.

The ShEx we have for Riksdagsledamöter (members of the Riksdag) in Wikidata is EntitySchema:E134.
A video from when this schema was created: "Wikipedia Weekly Network - Entity Schemas and Shape Expressions (ShEx)"

MansMeg commented 2 years ago

Hi!

We have an issue on converting this data to a better (normalized) form, although it has not been the main priority yet. I think we are quite open to finding a good format for this. Right now it is just a CSV file.

That said, I'm no expert on what the best format would be.

What is important about the format (as I see it) is:

Having an individual id per person
Enabling additional metadata, such as gender, party, etc., that can change over time
Having different metadata for different time periods
A way to judge the quality of different metadata
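
The requirements listed above could be sketched roughly as follows. This is only an illustration, not the corpus's actual model: the class names, the `source` field used as a simple quality signal, and the sample values ("person-0001", "Party A", "Party B", the dates) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Attribute:
    """One metadata value that holds for a (possibly open-ended) time period."""
    name: str                      # e.g. "party" or "gender"
    value: str
    start: Optional[date] = None   # None means unknown or open-ended
    end: Optional[date] = None     # None means unknown or open-ended
    source: str = "unknown"        # hypothetical quality signal: where the value came from

    def valid_on(self, day: date) -> bool:
        after_start = self.start is None or self.start <= day
        before_end = self.end is None or day <= self.end
        return after_start and before_end


@dataclass
class Person:
    person_id: str                 # one stable id per person
    attributes: List[Attribute] = field(default_factory=list)

    def lookup(self, name: str, day: date) -> List[str]:
        """Return every value of attribute `name` that is valid on `day`."""
        return [a.value for a in self.attributes
                if a.name == name and a.valid_on(day)]


# Illustrative data only: a person who changes party over time.
person = Person(
    person_id="person-0001",
    attributes=[
        Attribute("party", "Party A", date(1920, 1, 1), date(1924, 12, 31), source="csv"),
        Attribute("party", "Party B", date(1925, 1, 1), None, source="wikidata"),
    ],
)
print(person.lookup("party", date(1922, 6, 1)))
print(person.lookup("party", date(1930, 6, 1)))
```

Keeping each value as its own dated record, rather than as a column on the person, is what lets a party change be represented without duplicating the person.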

salgo60 commented 2 years ago

OK, we can talk about that when we do the Wikidata demo. I will try to walk through your list and check that every line in your data is matched to a person in Wikidata, i.e. that it has a Wikidata Q number.

The good thing about getting external identifiers (owl:sameAs) into your data is that your data then reaches level 5 on 5stardata.info.
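
As a minimal sketch of what such a link could look like, the snippet below writes one owl:sameAs statement in N-Triples form. The `LOCAL_BASE` URI and the local id "person-0001" are hypothetical placeholders; only the Q number Q5573455 comes from this thread.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"
# Hypothetical base URI for persons in the corpus; replace with a real one.
LOCAL_BASE = "https://example.org/riksdagen-corpus/person/"


def same_as_triple(local_id: str, qid: str) -> str:
    """Return one N-Triples line linking a local person id to a Wikidata item."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata Q number: {qid!r}")
    return f"<{LOCAL_BASE}{local_id}> <{OWL_SAME_AS}> <{WIKIDATA_ENTITY}{qid}> ."


print(same_as_triple("person-0001", "Q5573455"))
```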

If I run a SPARQL query in Wikidata for people with position held (P39) set to member of the Swedish Riksdag (Q10655178), showing their external identifier properties, we can see the following (there can be errors in Wikidata):

P1214 "Riksdagens person-id" is a good candidate for people who have held a position in the Swedish Government = 1886 people
P8388 "Riksdagen person GUID" is a good candidate = 1878 people
P3217 "Svenskt Biografiskt Lexikon-ID" is a good candidate = 1002 people
P4963 "Svenskt kvinnobiografiskt lexikon" maybe = 79 people
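
The query linked above is not preserved here, so the following is only a sketch of how such counts could be reproduced, not the original query. It builds one SPARQL query per property for the Wikidata Query Service; the helper name `count_query` is made up for this example, and nothing is sent over the network.

```python
# Property ids and labels as listed in this comment.
CANDIDATE_PROPERTIES = {
    "P1214": "Riksdagens person-id",
    "P8388": "Riksdagen person GUID",
    "P3217": "Svenskt Biografiskt Lexikon-ID",
    "P4963": "Svenskt kvinnobiografiskt lexikon",
}


def count_query(prop: str) -> str:
    """Build a SPARQL query counting Riksdag members that have `prop` set."""
    if not (prop.startswith("P") and prop[1:].isdigit()):
        raise ValueError(f"not a Wikidata property id: {prop!r}")
    return (
        "SELECT (COUNT(DISTINCT ?person) AS ?count) WHERE {\n"
        # wdt: matches only best-rank ("truthy") statements.
        "  ?person wdt:P39 wd:Q10655178 .\n"
        f"  ?person wdt:{prop} ?externalId .\n"
        "}"
    )


for prop, label in CANDIDATE_PROPERTIES.items():
    print(f"# {prop} {label}")
    print(count_query(prop))
    print()
```

The printed queries can be pasted into the Wikidata Query Service. Counts will differ from the numbers above as Wikidata changes.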

I guess you could structure your data the way Litteraturbanken has done in its API.

Example: August Strindberg in Litteraturbanken, https://litteraturbanken.se/api/get_author/StrindbergA

My understanding is that they take all the data from Wikidata:

The Wikidata Q number is in "wikidata_id" = Q7724
"sbl_link" is what we call P3217, and for Strindberg it is 34518, i.e. the key to SBL, "Riksarkivet Svenskt Biografiskt Lexikon"

Example searches for people in the Swedish parliament on Wikidata:
1) with SBL P3217: query (same query, but linking to sv:Wikipedia)
2) with P8388 "Swedish Parliament person GUID": query
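
To illustrate the mapping, here is a small sketch that turns such a record into Wikidata-style identifiers. The flat `sample_record` dict is a hypothetical stand-in: only the field names "wikidata_id" and "sbl_link" and the two values are taken from this comment, and the real API response may be nested differently.

```python
# Field name in the author record -> corresponding Wikidata property.
FIELD_TO_PROPERTY = {
    "sbl_link": "P3217",  # Svenskt Biografiskt Lexikon-ID
}

# Hypothetical, simplified record; the real response shape is not shown here.
sample_record = {
    "wikidata_id": "Q7724",
    "sbl_link": "34518",
}


def external_ids(record: dict) -> dict:
    """Collect the Wikidata item URI and known external ids from one record."""
    result = {}
    qid = record.get("wikidata_id")
    if qid:
        result["wikidata"] = f"http://www.wikidata.org/entity/{qid}"
    for field_name, prop in FIELD_TO_PROPERTY.items():
        value = record.get(field_name)
        if value not in (None, ""):
            result[prop] = str(value)  # normalise to text, whatever the input type
    return result


print(external_ids(sample_record))
```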

Let's talk more!

salgo60 commented 2 years ago

FYI: I created a repository for the matching activity, see salgo60/Wikidata_riksdagen-corpus (it will take 1-2 weeks).

salgo60 commented 2 years ago

Wikidata related

See https://github.com/every-politician-scrapers/sweden-riksdag-api-current on GitHub, and its last commit, for how this user scrapes the Swedish Riksdag to find differences. The every-politician-scrapers organization has 152 repositories, so I guess he is doing this for 152 countries. See also the Wikidata project space Wikidata:WikiProject_every_politician.

Match with Wikidata

I did a small test on 3000 lines, reconciling them with Wikidata and adding some fields on export; see news and the test file members_of_parliament WD 20211217.csv

Question: What data would you like from Wikidata?
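
Once rows carry a Q number, grouping on it gives one id per person and also shows which rows are still unmatched. The sketch below assumes hypothetical column names (`name`, `period`, `wikidata_qid`) and made-up sample rows; the actual export file may use different columns.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows only; column names and periods are assumptions.
SAMPLE = """name,period,wikidata_qid
Karl Bergström,period-1,Q5573455
Karl Bergström,period-2,Q5573455
Unmatched Person,period-1,
"""


def group_by_qid(csv_text: str):
    """Split rows into {qid: [rows]} and a list of rows without a Q number."""
    matched = defaultdict(list)
    unmatched = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        qid = (row.get("wikidata_qid") or "").strip()
        if qid:
            matched[qid].append(row)
        else:
            unmatched.append(row)
    return dict(matched), unmatched


matched, unmatched = group_by_qid(SAMPLE)
for qid, rows in matched.items():
    print(qid, [r["period"] for r in rows])
print("unmatched rows:", len(unmatched))
```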

BobBorges commented 1 year ago

Implemented? pls close

MansMeg commented 1 year ago

Yes. This can be closed now.

welfare-state-analytics / riksdagen-corpus

Wikidata same as #95