welfare-state-analytics / riksdagen-corpus

Swedish parliamentary proceedings - Riksdagens protokoll 1867-today

Party existence #478

Open BobBorges opened 4 months ago

BobBorges commented 4 months ago

We need to formalize information on political parties.

This will help us with #375 #235 #234 #231 #191

I've taken a first crack at this based on the info that's already in the corpus metadata and wikidata, but I think it needs expert eyes @fredrik1984 (pls tag Lotta).

https://docs.google.com/spreadsheets/d/1_kFX_PMWkGxexjw2Ios6XH_OqjHWb7ITEFjSeV7bVLc/edit?usp=sharing

In the linked doc, you find

MansMeg commented 4 months ago

Was this the structure you described, @ninpnin? I think you had an idea of two CSVs: one static file with the parties (nodes) and a separate successor file. I think that makes more sense.

BobBorges commented 4 months ago

Per the discussion we just had, this is a merged csv for convenience -- V's structure is the one that will be committed to the corpus.

Should there also be some criteria to decide whether a change is just a name change or whether one party becomes another? I think it's a reasonable distinction from the data perspective, but I have no idea how, or if, that could be determined in practice.

MansMeg commented 4 months ago

I think any real change (e.g. name change) should be a new node. Then, we can add information on the edges (ie the succeeded_by csv file) on whether it is a name change or not (eg a "type" column). Although this might be the next step.
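A minimal sketch of how the node file and the succeeded_by file could fit together, assuming hypothetical column names (`party_id`, `name`, `succeeded_by`, `type`) and an illustrative `name_change` value; the two rows reuse the Bondeförbundet / Centerpartiet QIDs mentioned in this thread and are not taken from the corpus. It also assumes each party has at most one successor, so splits and mergers would need a richer structure.

```python
import csv
import io

# Hypothetical file contents: column names and rows are illustrative only.
PARTY_CSV = """party_id,name
Q110472693,Bondeförbundet
Q110832,Centerpartiet
"""

SUCCEEDED_BY_CSV = """party_id,succeeded_by,type
Q110472693,Q110832,name_change
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def successor_chain(party_id, edges):
    """Follow succeeded_by edges starting at party_id.

    Returns a list of (party_id, edge_type) pairs, where edge_type is the
    type of the edge that led to that node (None for the starting node).
    Assumes at most one outgoing edge per party.
    """
    next_edge = {edge["party_id"]: edge for edge in edges}
    chain = [(party_id, None)]
    seen = {party_id}
    while party_id in next_edge:
        edge = next_edge[party_id]
        party_id = edge["succeeded_by"]
        if party_id in seen:  # guard against accidental cycles in the data
            break
        seen.add(party_id)
        chain.append((party_id, edge["type"]))
    return chain


parties = {row["party_id"]: row["name"] for row in load_rows(PARTY_CSV)}
edges = load_rows(SUCCEEDED_BY_CSV)
chain = successor_chain("Q110472693", edges)
names = [parties[pid] for pid, _ in chain]
```

Keeping the `type` on the edge rather than on the node means a later decision about what counts as "just a name change" only touches the successor file.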

salgo60 commented 4 months ago

> I think any real change (e.g. name change) should be a new node. Then, we can add information on the edges (ie the succeeded_by csv file) on whether it is a name change or not (eg a "type" column). Although this might be the next step.

In Wikidata we have used two approaches for name changes. The major problem is that Wikidata doesn't have core functionality for name changes in its kernel.

  1. Add qualifiers for start and end, e.g. Q10512441#P2561.
  2. Create new objects for the new names and use "said to be the same as" (P460), for example Bondeförbundet Q110472693 / Centerpartiet Q110832. This works better when we use Wikipedia <-> Wikidata templates, since we get the correct names in the article, but writing SPARQL is a nightmare.

ljo commented 4 months ago

Just a quick note on the "vilde" discussion, today and last week, and its presence in the spreadsheet. We need to have the vilde information connected to each MP somewhere, since we do not want to end up with unknowns, lumped-together vildes, or holes in affiliations. But I will check the spreadsheet and the actual contents before I say anything more on this. Maybe it is sufficient already. We should also add the QA and sign-off columns to some sheets, just to warm up for the addition of the true/false was_parliament_party column.
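One way to look for holes in affiliations, sketched under assumptions: the `(start, end, party)` tuple layout, the function name `affiliation_gaps`, and the example dates are all hypothetical and not the corpus format. Dates are treated as inclusive, and every affiliation period is assumed to lie within the MP's mandate.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def affiliation_gaps(periods, mandate_start, mandate_end):
    """Return inclusive (gap_start, gap_end) ranges of the mandate that no
    affiliation period covers.

    periods: list of (start, end, party) tuples with inclusive dates,
    assumed to lie within [mandate_start, mandate_end].
    """
    gaps = []
    cursor = mandate_start  # first day not yet known to be covered
    for start, end, _party in sorted(periods):
        if start > cursor:
            gaps.append((cursor, start - ONE_DAY))
        cursor = max(cursor, end + ONE_DAY)
    if cursor <= mandate_end:
        gaps.append((cursor, mandate_end))
    return gaps


# Illustrative MP: a party period, one uncovered year, then a "vilde" period.
example_periods = [
    (date(1900, 1, 1), date(1902, 12, 31), "party_a"),
    (date(1904, 1, 1), date(1905, 12, 31), "vilde"),
]
example_gaps = affiliation_gaps(
    example_periods, date(1900, 1, 1), date(1905, 12, 31)
)
```

Recording "vilde" as an explicit period, as in the second tuple, is what keeps it from showing up as a hole.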

fredrik1984 commented 4 months ago

Absolutely, "vilde" will be a party metadata tag. However, exactly what we will name it in the MP database is not yet decided.

MansMeg commented 4 months ago

I think we would use the standard by the Riksdag, at least in Swedish (partilös?).

salgo60 commented 4 months ago

"Vilde" from a historical perspective

You can check how "vilde" was used in the early 1900s. I wrote a SPARQL query that compares the data in Wikidata, which is based on the book Tvåkammar-riksdagen 1867–1970, with the pictures we have scanned into Wikimedia Commons from SPA, like the book "Porträttbok: Riksdagsmän 1894". See the GitHub issue "Porträttbook about "vilde"" #139.

#title: First/Sec chamber members pictures "vilde" how they are presented in "Porträttbooks"
#defaultView:ImageGrid
SELECT DISTINCT ?SWERIK ?file ?wd ?name ?image (CONCAT(?party," ", COALESCE(?timevilde, "")) AS ?vilde) (concat("Book published ",str(year(?booktime))) AS ?bookPublished)
WITH 
{ SELECT distinct ?wd ?name ?itemDescription ?party ?timevilde ?startvilde ?endvilde ?SWERIK  WHERE
  { 
    SERVICE <https://query.wikidata.org/sparql> 
    {
      VALUES ?position { wd:Q81531912 wd:Q33071890 }
      ?wd wdt:P31 wd:Q5;
          wdt:P39 ?position.
      ?wd rdfs:label ?name. FILTER(lang(?name)="sv")
      OPTIONAL {?wd wdt:P12192 ?SWERIK}
      {
       ?wd p:P102 ?PartyWD. 
       ?PartyWD ps:P102 ?p
       OPTIONAL {?PartyWD pq:P580 ?startvilde}
       OPTIONAL {?PartyWD pq:P582 ?endvilde}
       BIND (concat(str(year(?startvilde))," - ", str(year(?endvilde))) AS ?timevilde)
       ?p rdfs:label ?party.
       FILTER(LANG(?party) ="sv").
       FILTER(CONTAINS(?party, 'vilde'))
        SERVICE wikibase:label { bd:serviceParam wikibase:language "sv,en". }
       #FILTER (?wd = wd:Q5555629)
      }
    }
  }
} AS %Wikidataitems

WHERE 
{  INCLUDE %Wikidataitems .
  ?file wdt:P180 ?wd.
  VALUES ?booksP1433 { 
                       wd:Q116445396 # 1894
                       wd:Q110380539 # 1897 
                       wd:Q110380456 # 1900
                       wd:Q110375618 # 1903
                       wd:Q110376088 # 1906 
                       wd:Q116313186 # 1909
                     }

   SERVICE <https://query.wikidata.org/sparql> 
    {
      ?booksP1433 wdt:P585 ?booktime
    }
  FILTER (!BOUND(?startvilde) || ?startvilde <= ?booktime)
  #FILTER (?startvilde <= ?booktime)
  FILTER (!BOUND(?endvilde) || ?endvilde >= ?booktime)
  #FILTER (?endvilde >= ?booktime)
  ?file wdt:P1433 ?booksP1433.
  ?file schema:contentUrl ?url. 
  bind(iri(concat("http://commons.wikimedia.org/wiki/Special:FilePath/", wikibase:decodeUri(substr(str(?url),53)))) AS ?image)
} 
order by ?name ?startvilde
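The two active `FILTER` lines in the query keep a "vilde" affiliation only if the book's publication date falls inside the affiliation interval, treating a missing start or end as open. The same check as a small Python sketch, where the function name `active_at` and the use of `None` for an unbound value are my own choices for illustration:

```python
from datetime import date


def active_at(start, end, when):
    """Mirror of the query's filters:
    (!BOUND(start) || start <= when) && (!BOUND(end) || end >= when).
    None stands in for an unbound SPARQL variable.
    """
    return (start is None or start <= when) and (end is None or end >= when)


# Illustrative dates only, not taken from Wikidata.
book_date = date(1900, 1, 1)
inside = active_at(date(1897, 1, 1), date(1902, 12, 31), book_date)
open_ended = active_at(None, None, book_date)
too_late = active_at(date(1903, 1, 1), None, book_date)
```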

The book Tvåkammar-riksdagen 1867–1970

Uses, for example:

image image

.....


I marked problematic parties with P5008 = Q120143028 ("Välfärden analyserad - parti").

image image
salgo60 commented 4 months ago

I guess you have "party unknown" and "opolitisk" (non-political) cases, like Q5973510 = SWERIK i-NF7MCicvARd9vxeawxGy59 = SBL 10191.

image

"opolitisk" is modelled in Wikidata at Q4961261#P102.

image image

I think a bad pattern is used in Wikidata for Elsa Widding, Q98545439#P102 - sv

image image image

Einzelbewerber - Q1309957

image