Remove strange wikidata punctuation on location specifiers

MansMeg commented 1 year ago

See for example: Q5885293,"Kråkered," (now fixed)

We should not include location specifiers with punctuation if the ordinary name exists (like Kråkered in this case).

see: https://github.com/welfare-state-analytics/riksdagen-corpus/blob/main/corpus/metadata/location_specifier.csv

ninpnin commented 1 year ago

This is a problem upstream https://www.wikidata.org/wiki/Q5885293

MansMeg commented 1 year ago

Yes. The question is how to solve this. I guess we would like to remove things from our corpus that people might want to keep in wikidata, so there will not be a perfect alignment with wikidata. Maybe add a csv listing what we exclude from wikidata, and use it in the script that updates from wikidata? Or do you have another solution?
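
A rough sketch of the exclusion-csv idea, in case it helps the discussion. Everything here is an assumption: the function names are made up, the exclusion file is assumed to be a headerless two-column csv (wiki id, location), and the fetched rows are illustrative rather than real query output. It is not how the updating script currently works.

```python
import csv
import io


def load_exclusions(handle):
    """Read (wiki_id, location) pairs to drop from a headerless two-column csv."""
    return {(row[0], row[1]) for row in csv.reader(handle) if len(row) >= 2}


def apply_exclusions(rows, exclusions):
    """Keep only rows whose (wiki_id, location) pair is not excluded."""
    return [row for row in rows if (row[0], row[1]) not in exclusions]


# Hypothetical exclusion file content, using the example from this issue.
exclusion_csv = io.StringIO('Q5885293,"Kråkered,"\n')
exclusions = load_exclusions(exclusion_csv)

# Rows as they might arrive from wikidata (illustrative only).
fetched = [
    ("Q5885293", "Kråkered"),
    ("Q5885293", "Kråkered,"),
]

kept = apply_exclusions(fetched, exclusions)
print(kept)
```

Matching on the exact (id, location) pair means only the listed variant is dropped, so the ordinary name for the same person is kept.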

ninpnin commented 1 year ago

I mean those misspellings could be just fixed on wikidata?

EDIT: AFAIK those additional commas don't introduce any errors to our corpus

MansMeg commented 1 year ago

No, I know. My point is that sooner or later we might end up with differences, but maybe not in the next couple of months. In that case, fixing this in wikidata is probably easiest.

BobBorges commented 1 year ago

They need to be edited on wikidata:

[x] Q116687501,"Gärde, Malcolm"
[ ] Q117011372,"Falla, Ernst"
[ ] Q117039047,"Gärestad,Hans Alfred Petersson"
[ ] Q117280842,"Mora-Noret, Nils J, 5:79"
[ ] Q117288109,"Stockholm, Per Oskar Samuelson"
[ ] Q117288989,"Göteborg, Gustav Harald Svensson"
[ ] Q15956417,"Stockholm, Gustaf, 1:152"
[ ] Q3373681,"Äppelviken, Axel"
[x] Q459226,"Stockholm, Anita I, 1:91"
[ ] Q4730705,"Alversjö, Allan F A, 2:160"
[ ] Q4795536,"Lekåsa, J Aron, 4:321"
[ ] Q4955783,"Sundbyberg, Margo"
[ ] Q5712545,"Stockholm,"
[ ] Q5715803,"Myckelgård, Gustaf"
[ ] Q5721610,"Stockholm, Knut G"
[ ] Q5723194,"Multrå, Johan, 5:219"
[ ] Q5724038,"Jönköping, Carl B N"
[ ] Q5747977,"Stockholm, Thorvald, 1:90"
[ ] Q5768473,"Harads, Johan Erik"
[ ] Q5768483,"Gränna, Johan Gustaf"
[ ] Q5770994,"Skövde, J M Alfred"
[ ] Q5777495,"Sjögesta senare Örebro, Anders P, 4:486"
[ ] Q5780366,"Stockholm,"
[ ] Q5785596,"Härnösand,"
[ ] Q5786130,"Blomberg,"
[ ] Q5789110,"Gårda, Gustav W"
[ ] Q5795578,"Riseberga,"
[ ] Q5795659,"Stjärnebo, F A Hugo, 2:73"
[ ] Q5854947,"Visby, C Suno H"
[ ] Q5885438,"Kalmar, J August"
[ ] Q5928617,"Kyrkdal senare Sollefteå, E Harald, 5:235"
[ ] Q5942265,"Ystad,"
[ ] Q5951819,"Göteborg,"
[ ] Q5961317,"Tjörn, Axel V, 4:118"
[ ] Q6001576,"Stockholm, Carl Göran D"
[ ] Q6011317,"Stävie, Nils"
[ ] Q6026693,"Rögle,"
[ ] Q6026862,"Kullenbergstorp, Gillis O T C , 3:252"
[ ] Q6027237,"Kvarnbrodda, Jöns"
[ ] Q6031148,"Anderstorp, C E Holge, 2:174"
[ ] Q6044908,"Hasselstad, August, 3:71"
[ ] Q6045550,"Ugglekull, Peter"
[ ] Q6062302,"Övedskloster, Otto A P, 3:265"
[ ] Q6139775,"Öckerö,"
[ ] Q6157386,"Hammerdal, Johan"
[ ] Q6161169,"Hofors, H Hjalmar, 5:167"
[ ] Q6195438,"Växjö, S A Gustaf, 2:238"
[ ] Q6199292,"Stockholm, David C, 1:177"
[ ] Q6199894,"Örebro, G Ruben"
[ ] Q6255608,"Stångby, Jöns, 3:293"
[ ] Q6298643,"Gäre, Carl"

MansMeg commented 1 year ago

Ping @salgo60 . Is this something you could take a pass on?

salgo60 commented 1 year ago

[x] Q116687501,"Gärde, Malcolm"
[x] Q117011372,"Falla, Ernst"
[x] Q117039047,"Gärestad,Hans Alfred Petersson"
[x] Q117280842,"Mora-Noret, Nils J, 5:79"
[x] Q117288109,"Stockholm, Per Oskar Samuelson"
[x] Q117288989,"Göteborg, Gustav Harald Svensson"
[x] Q15956417,"Stockholm, Gustaf, 1:152"
[x] Q3373681,"Äppelviken, Axel"
[x] Q459226,"Stockholm, Anita I, 1:91"
[x] Q4730705,"Alversjö, Allan F A, 2:160"
[x] Q4795536,"Lekåsa, J Aron, 4:321"
[x] Q4955783,"Sundbyberg, Margo"
[x] Q5712545,"Stockholm,"
[x] Q5715803,"Myckelgård, Gustaf"
[x] Q5721610,"Stockholm, Knut G"
[x] Q5723194,"Multrå, Johan, 5:219"
[x] Q5724038,"Jönköping, Carl B N"
[x] Q5747977,"Stockholm, Thorvald, 1:90"
[x] Q5768473,"Harads, Johan Erik"
[x] Q5768483,"Gränna, Johan Gustaf"
[x] Q5770994,"Skövde, J M Alfred"
[x] Q5777495,"Sjögesta senare Örebro, Anders P, 4:486"
[x] Q5780366,"Stockholm,"
[x] Q5785596,"Härnösand,"
[x] Q5786130,"Blomberg,"
[x] Q5789110,"Gårda, Gustav W"
[x] Q5795578,"Riseberga,"
[x] Q5795659,"Stjärnebo, F A Hugo, 2:73"
[x] Q5854947,"Visby, C Suno H"
[x] Q5885438,"Kalmar, J August"
[x] Q5928617,"Kyrkdal senare Sollefteå, E Harald, 5:235"
[x] Q5942265,"Ystad,"
[x] Q5951819,"Göteborg,"
[x] Q5961317,"Tjörn, Axel V, 4:118"
[x] Q6001576,"Stockholm, Carl Göran D"
[x] Q6011317,"Stävie, Nils"
[x] Q6026693,"Rögle,"
[x] Q6026862,"Kullenbergstorp, Gillis O T C , 3:252"
[x] Q6027237,"Kvarnbrodda, Jöns"
[x] Q6031148,"Anderstorp, C E Holge, 2:174"
[x] Q6044908,"Hasselstad, August, 3:71"
[x] Q6045550,"Ugglekull, Peter"
[x] Q6062302,"Övedskloster, Otto A P, 3:265"
[x] Q6139775,"Öckerö,"
[x] Q6157386,"Hammerdal, Johan"
[x] Q6161169,"Hofors, H Hjalmar, 5:167"
[x] Q6195438,"Växjö, S A Gustaf, 2:238"
[x] Q6199292,"Stockholm, David C, 1:177"
[x] Q6199894,"Örebro, G Ruben"
[x] Q6255608,"Stångby, Jöns, 3:293"
[x] Q6298643,"Gäre, Carl"

salgo60 commented 1 year ago

@MansMeg what problem did you find with Q117288109?

Q117288109#P2561

My changelog Special:Contributions/Salgo60

MansMeg commented 1 year ago

I think that one is actually a problem with how we grab the data. Here we use the alias, which is incorrect. @BobBorges, right?

salgo60 commented 1 year ago

All checked, but not all changed, as I didn't see a problem...

My changelog Special:Contributions/Salgo60
Over and out; now I will go and sleep in my hammock for some days ;-)

Off topic: I mentioned your project today as a pattern for how other organizations should work with their metadata.

BobBorges commented 10 months ago

Should be fixed now. If we find this as an issue again, we could write a unit test. It was caused by trailing commas (now removed on wikidata) and by alias/i-ort values in the format "surname-iort, firstname". Both are fixed on wikidata.
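
If the unit test mentioned above turns out to be needed, it could look roughly like this. This is only a sketch under assumptions: the helper name is hypothetical, the csv is read as (wiki id, location) columns, and "contains a comma" is used as the heuristic because every entry listed in this issue has one. A real test would open location_specifier.csv instead of the inline sample.

```python
import csv
import io


def find_comma_specifiers(handle):
    """Return (wiki_id, location) rows whose location contains a comma."""
    return [
        (row[0], row[1])
        for row in csv.reader(handle)
        if len(row) >= 2 and "," in row[1]
    ]


# Inline sample standing in for the metadata file (entries taken from this issue).
sample = io.StringIO(
    'Q5885293,Kråkered\n'
    'Q5712545,"Stockholm,"\n'
    'Q6298643,"Gäre, Carl"\n'
)

offenders = find_comma_specifiers(sample)
print(offenders)
```

A test would then assert that the returned list is empty for the real file; the heuristic would need loosening if a legitimate specifier ever contains a comma.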

welfare-state-analytics / riksdagen-corpus

Remove strange wikidata punctuation on location specifiers #220