clarin-eric / ParlaMint

ParlaMint: Comparable Parliamentary Corpora
https://clarin-eric.github.io/ParlaMint/

Party orientation encoding? #184

Closed TomazErjavec closed 11 months ago

TomazErjavec commented 2 years ago

In ParlaMint II we will also need to add the left/centre/right orientation of political parties. This issue is to discuss how to encode this information. I suggest that it is encoded within all the org elements with role="politicalParty" or role="politicalGroup" using the state element, which would look something like this:

<state type="orientation" key="centre">
 <label>Orientation</label>
 <desc>Centre</desc>
</state>
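For anyone generating this programmatically, here is a minimal Python sketch of the proposed encoding. It assumes the standard library `xml.etree.ElementTree`, leaves out the TEI namespace for brevity, and uses a hypothetical helper name (`make_orientation_state`); it is an illustration of the proposal, not the agreed ParlaMint encoding.

```python
import xml.etree.ElementTree as ET

def make_orientation_state(key: str, description: str) -> ET.Element:
    """Build a <state type="orientation"> element like the one proposed above."""
    state = ET.Element("state", {"type": "orientation", "key": key})
    ET.SubElement(state, "label").text = "Orientation"
    ET.SubElement(state, "desc").text = description
    return state

# Hypothetical party entry; a real corpus would use the TEI namespace.
org = ET.Element("org", {"role": "politicalParty"})
org.append(make_orientation_state("centre", "Centre"))
ET.indent(org)  # pretty-printing, available from Python 3.9
print(ET.tostring(org, encoding="unicode"))
```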

@matyaskopp, would you agree?

matyaskopp commented 2 years ago

Agree? I am not sure. Definitely, I am not happy about implementing orientation - it is another (together with coalitions and positions) vague element that we(you :-)) want to implement.

E.g., in the case of populist parties, it is impossible to determine their orientation. We have a political party that started as "we are right-oriented with social awareness" (claim of the party "owner" in 2014) and now it looks like a left-oriented party (it has taken the Social Democrat and Communist voters; these parties are no longer in our Chamber of Deputies).

So I am not sure about making this obligatory. I have a very limited Czech perspective but I guess there will be similar issues in other parliaments.

TomazErjavec commented 2 years ago

Yes, it is a tricky one, I agree. But, yes, we said we would do it, so we have to now. And maybe this could be an opportunity for @katjameden, as it would be possible to have several people annotate the orientation, compute interannotator agreement, discuss the problems etc., and voila, a PhD chapter is born. Katja, what do you think?

As for "right-oriented with social awareness (claim of the party "owner" in 2014) and now it looks like a left-oriented party": this means that temporal information would be useful.

But, yes, there will be many fuzzy cases. Then again, the same holds for sentiment annotation, and that doesn't stop people from doing it.
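The agreement computation mentioned above could be sketched as follows. This assumes two annotators and Cohen's kappa as the measure (the thread does not name a specific one), and the labels are invented for illustration.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Share of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

# Invented labels for six hypothetical parties (L = left, C = centre, R = right).
annotator_1 = ["L", "L", "C", "R", "R", "R"]
annotator_2 = ["L", "C", "C", "R", "R", "L"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

With more than two annotators a different measure (for example Fleiss' kappa) would be needed.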

matyaskopp commented 2 years ago

I have been thinking and studying a bit, so now I believe that I am able to categorize our political parties somehow. I have two suggestions:

  1. use a more precise scale: left, left-center, center, right-center, right
  2. use the other dimension of the political compass, the authority–liberty dimension, but I am not sure about the scaling on the authority–liberty dimension.

katjameden commented 2 years ago

> Yes, it is a tricky one, I agree. But, yes, we said we would do it, so we have to now. And maybe this could be an opportunity for @katjameden, as it would be possible to have several people annotate the orientation, compute interannotator agreement, discuss the problems etc., and voila, a PhD chapter is born. Katja, what do you think?

Hi! Yes, I think this would be great!

coltekin commented 2 years ago

Joining a bit late, but in case you are not aware of them, there are a few large-scale "expert surveys" that we can rely on for annotating the political views of most of the parties in ParlaMint. Two recent ones I am aware of are:

Although "populism" is in the focus in both cases, both include questions regarding the basic ideologies of the parties. Both data sets are available. Wikipedia also includes statements of ideologies for parties, but may not be reliable in all cases.

If we take the values from these sources, it may make sense to rethink the structure - including multiple values from multiple sources, and perhaps an aggregate value. The surveys above also got answers from multiple experts, sometimes expressed with a number in a (Likert) scale.
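A rough Python sketch of what "multiple values from multiple sources, and perhaps an aggregate value" could look like. The source names and scores are invented, and the unweighted mean of per-source means is only one possible aggregate, chosen here as an assumption.

```python
from statistics import mean

# Invented expert scores on a 0-10 left-right scale; source names are placeholders.
scores_by_source = {
    "survey_a": [2.0, 3.0, 2.5],
    "survey_b": [3.5, 3.0],
}

def summarise(scores):
    """Return the per-source mean and an unweighted mean of those means."""
    per_source = {src: mean(vals) for src, vals in scores.items() if vals}
    return {"per_source": per_source, "aggregate": mean(list(per_source.values()))}

summary = summarise(scores_by_source)
print(summary)
```

Averaging the per-source means keeps a source with many experts from dominating the aggregate; pooling all expert answers would be the alternative.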

TomazErjavec commented 2 years ago

Thank you @coltekin, these are very valuable data sources and references, much better than trying to annotate this ourselves. I add @katjameden to this thread, as hopefully the encoding of party orientation will be a topic of her PhD.

TomazErjavec commented 2 years ago

To determine left/centre/right orientation of a political party, we have decided to use (where possible) the orientation of the party according to their membership in the European parliament, failing that the Wikipedia:

vaidasmo commented 2 years ago

Classification according to European Parliament groups might be problematic. Not all parties from the national parliament are present in the European one, and vice versa. We should use multiple sources. One additional and very authoritative source is here: https://www.chesdata.eu/ches-europe.

TomazErjavec commented 2 years ago

@vaidasmo you are right that EU groups do not cover every party - however, we did think of falling back to Wikipedia for those cases. But thanks for the additional source, we will look into it.

katjameden commented 2 years ago

I looked into the resource - the dataset provides us with a pretty solid estimation of political orientation for the political parties included in the survey on a scale of 0 - 10 (0 = Extreme left ... 5 = Center ... 10 = Extreme right), positioning it on a spectrum. The fact that the surveys were completed by experts (political scientists), makes the data more reliable.

I agree with @vaidasmo that we should use multiple resources and would suggest using this resource instead of Wikipedia, as it offers a more concrete reasoning/methodology for determining the political orientation of a party.

jureskubic commented 2 years ago

I checked the resource as well. I think that if we decide to use this one then we need to come to an agreement as to how we will determine ideological position of the party. In the codebook (https://static1.squarespace.com/static/5975c9bfdb29d6a05c65209b/t/5fa04ec05d3c8218b7c91450/1604341440585/2019_CHES_codebook.pdf) the ideology is numerically annotated (0 = extreme left, 5 = center, 10 = extreme right) and then in .csv file the figures are very precise (e.g. 2,083333). So we need to determine the parameters that we'll use for ideology assignment. Will we for example say that between 0-5 = left, 5 = centre and between 5-10 = right?
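To make the question concrete, the cut-offs asked about above could be written like this in Python. The function names are hypothetical, and the decimal-comma handling is an assumption based on the "2,083333" example.

```python
def parse_score(text: str) -> float:
    """Parse a score written with a decimal comma or a decimal point."""
    return float(text.replace(",", "."))

def orientation_class(score: float) -> str:
    """Map a 0-10 left-right score to left/centre/right with the cut-offs asked about above."""
    if not 0 <= score <= 10:
        raise ValueError(f"score outside 0-10: {score}")
    if score < 5:
        return "left"
    if score > 5:
        return "right"
    return "centre"

label = orientation_class(parse_score("2,083333"))  # "left"
print(label)
```

With these cut-offs a party is "centre" only if its score is exactly 5, which an averaged expert score will rarely be, so in practice some interval around 5 would be needed.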

This codebook is extremely useful for party_id check according to the country. The .csv file is not very user friendly especially with only abbreviations of parties included. So it will take more work, but it could be useful; but I also think that Wikipedia could provide a useful fall-back option. Or perhaps start with Wikipedia which is more user-friendly and use this file as a fall-back.

katjameden commented 2 years ago

> I checked the resource as well. I think that if we decide to use this one then we need to come to an agreement as to how we will determine ideological position of the party. In the codebook (https://static1.squarespace.com/static/5975c9bfdb29d6a05c65209b/t/5fa04ec05d3c8218b7c91450/1604341440585/2019_CHES_codebook.pdf) the ideology is numerically annotated (0 = extreme left, 5 = center, 10 = extreme right) and then in .csv file the figures are very precise (e.g. 2,083333).

This is exactly why this resource would help us position the political orientation of the party more precisely than others - rather than having fixed categories, we have them positioned on a left-right spectrum (e.g., score of 4.356 would tell us that the political party is left or even centre-left).

> So we need to determine the parameters that we'll use for ideology assignment. Will we for example say that between 0-5 = left, 5 = centre and between 5-10 = right?

If we assume the categories to be left, right and centre, then yes. But since this dataset can give us a better estimation of a political orientation, we could also change the granularity of the scale, adding centre-left and centre-right categories. I would suggest keeping left, right and centre for now.

> This codebook is extremely useful for party_id check according to the country. The .csv file is not very user friendly especially with only abbreviations of parties included. So it will take more work, but it could be useful; but I also think that Wikipedia could provide a useful fall-back option. Or perhaps start with Wikipedia which is more user-friendly and use this file as a fall-back.

Yes, I support Wikipedia being the fall-back option rather than the other way around. I do understand that this also means more work, but it will add to the accuracy of data and methodology.

jureskubic commented 2 years ago

> > I checked the resource as well. I think that if we decide to use this one then we need to come to an agreement as to how we will determine ideological position of the party. In the codebook (https://static1.squarespace.com/static/5975c9bfdb29d6a05c65209b/t/5fa04ec05d3c8218b7c91450/1604341440585/2019_CHES_codebook.pdf) the ideology is numerically annotated (0 = extreme left, 5 = center, 10 = extreme right) and then in .csv file the figures are very precise (e.g. 2,083333).

> This is exactly why this resource would help us position the political orientation of the party more precisely than others - rather than having fixed categories, we have them positioned on a left-right spectrum (e.g., score of 4.356 would tell us that the political party is left or even centre-left).

> > So we need to determine the parameters that we'll use for ideology assignment. Will we for example say that between 0-5 = left, 5 = centre and between 5-10 = right?

> If we assume the categories to be left, right and centre, then yes. But since this dataset can give us a better estimation of a political orientation, we could also change the granularity of the scale, adding centre-left and centre-right categories. I would suggest keeping left, right and centre for now.

> > This codebook is extremely useful for party_id check according to the country. The .csv file is not very user friendly especially with only abbreviations of parties included. So it will take more work, but it could be useful; but I also think that Wikipedia could provide a useful fall-back option. Or perhaps start with Wikipedia which is more user-friendly and use this file as a fall-back.

> Yes, I support Wikipedia being the fall-back option rather than the other way around. I do understand that this also means more work, but it will add to the accuracy of data and methodology.

OK, good. I was thinking about how thoroughly we want to determine ideological positioning, since I think Tomaž mentioned that we're just assigning left/center/right. I can start with the .csv for Slovenia and check whether the information in Wikipedia agrees with the values in the CSV.

TomazErjavec commented 2 years ago

I agree that we should stick to CHES as much as possible, as this does indeed give us a very simple method. However, it covers only parties till end of 2019, so not all parties will be found there, so we do need the Wikipedia backup. This is also why I would stick to L/C/R categories.
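The "CHES first, Wikipedia as backup" rule could be sketched as below. The party identifiers and labels are invented, and recording the source next to the label is my own assumption about what would be useful, not something decided in this thread.

```python
from typing import Optional, Tuple

def resolve_orientation(party: str, ches: dict, wikipedia: dict) -> Optional[Tuple[str, str]]:
    """Return (label, source), preferring CHES and falling back to Wikipedia."""
    if party in ches:
        return ches[party], "CHES"
    if party in wikipedia:
        return wikipedia[party], "Wikipedia"
    return None  # no source covers this party

# Invented party identifiers and labels, for illustration only.
ches_labels = {"party.A": "L", "party.B": "R"}
wikipedia_labels = {"party.B": "C", "party.C": "C"}

for party in ("party.A", "party.B", "party.C", "party.D"):
    print(party, resolve_orientation(party, ches_labels, wikipedia_labels))
```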

I had a look at the CSV, and there are some complications (apart from the obvious ones with numbers for countries etc.), esp. that this is a longitudinal study, so one party is listed several times with different values, something we will need to take into account, most likely by having several orientations for one party, at least if they ever change their LRC class.
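One way to get "several orientations for one party" out of the longitudinal data is to collapse consecutive survey years with the same class into periods. A minimal sketch with invented observations and a hypothetical function name:

```python
def orientation_periods(yearly_labels):
    """Collapse (year, label) pairs into (first_year, last_year, label) runs."""
    periods = []
    for year, label in sorted(yearly_labels):
        if periods and periods[-1][2] == label:
            # Same class as the previous survey year: extend the current run.
            periods[-1] = (periods[-1][0], year, label)
        else:
            periods.append((year, year, label))
    return periods

# Invented survey observations for one hypothetical party.
observations = [(2002, "C"), (2006, "R"), (2010, "R"), (2014, "R"), (2019, "R")]
periods = orientation_periods(observations)
print(periods)  # [(2002, 2002, 'C'), (2006, 2019, 'R')]
```

Note that the period boundaries are survey years, not the dates on which a party actually changed its position, so they would only be approximate when turned into from/to dates.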

I had a closer look at SI parties, and the results are somewhat unexpected, e.g. SDS having a max score of 6.40 and min of only 5.13, even though it is perceived as being pretty far right wing here. Also, SNS is 3 and lower, even though I would consider it extreme right rather than extreme left. But so be it, it does mean that the center label should really cover only a small interval, maybe 4.90 - 5.10.

Also, looking at the CSV, it looks like it will be rather a nightmare to input things by hand, and we'd be going only from one table into another. But it does occur to me that a bit of Python could transform the table into something that we could then directly merge into the corpus. What would be needed is to change the CHES country and period codes into ISO country and date formats, get rid of all the columns we are not interested in, and transform the number into the class. Maybe also merge parties that don't vary too much in their score (i.e. they would always get the same label). @katjameden, would you be up to it?
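A rough sketch of such a script, under several assumptions: the column names `country`, `party` and `year` are guesses about the CSV layout, the country-code mapping uses invented placeholder values that would have to be filled in from the codebook, and the 4.90 - 5.10 centre band is just the interval floated above. Conversion of survey years into ISO dates is left out.

```python
import csv
import io

# Placeholder mapping from CHES numeric country codes to ISO 3166-1 alpha-2;
# the numbers and codes below are invented and must be checked against the codebook.
COUNTRY_TO_ISO = {"901": "XA", "902": "XB"}

def classify(score, centre=(4.90, 5.10)):
    """Map a 0-10 score to L/C/R using a narrow centre band."""
    low, high = centre
    if score < low:
        return "L"
    if score > high:
        return "R"
    return "C"

def transform(csv_text):
    """Keep country, party, year and class; drop every other column."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "country": COUNTRY_TO_ISO[rec["country"]],
            "party": rec["party"],
            "year": int(rec["year"]),
            "orientation": classify(float(rec["lrgen"])),
        })
    return rows

def constant_parties(rows):
    """Parties whose class never changes, so one label could cover all years."""
    seen = {}
    for row in rows:
        seen.setdefault((row["country"], row["party"]), set()).add(row["orientation"])
    return {key: labels.pop() for key, labels in seen.items() if len(labels) == 1}

# Invented sample data for two hypothetical parties.
SAMPLE = """country,party,year,lrgen,other
901,AAA,2014,2.5,x
901,AAA,2019,3.1,y
902,BBB,2014,4.95,z
902,BBB,2019,6.2,w
"""
rows = transform(SAMPLE)
stable = constant_parties(rows)
print(stable)
```

Parties that do not appear in `stable` changed class at least once and would need one orientation per period.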

katjameden commented 2 years ago

> I agree that we should stick to CHES as much as possible, as this does indeed give us a very simple method. However, it covers only parties till end of 2019, so not all parties will be found there, so we do need the Wikipedia backup. This is also why I would stick to L/C/R categories.

Agreed.

> I had a look at the CSV, and there are some complications (apart from the obvious ones with numbers for countries etc.), esp. that this is a longitudinal study, so one party is listed several times with different values, something we will need to take into account, most likely by having several orientations for one party, at least if they ever change their LRC class.

> I had a closer look at SI parties, and the results are somewhat unexpected, e.g. SDS having a max score of 6.40 and min of only 5.13, even though it is perceived as being pretty far right wing here. Also, SNS is 3 and lower, even though I would consider it extreme right rather than extreme left. But so be it, it does mean that the center label should really cover only a small interval, maybe 4.90 - 5.10.

Is this the correct variable? The value of the lrgen variable (position of the party in terms of its overall ideological stance) for SDS, for example, is 8.64 in 2019 (making it pretty right-wing), 8.69 in 2014, 6.9 in 2010 ... with the lowest score of 6.4 in 2002. Link to the CHES stats: https://www.chesdata.eu/ches-stats

> Also, looking at the CSV, it looks like it will be rather a nightmare to input things by hand, and we'd be going only from one table into another. But it does occur to me that a bit of Python could transform the table into something that we could then directly merge into the corpus. What would be needed is to change the CHES country and period codes into ISO country and date formats, get rid of all the columns we are not interested in, and transform the number into the class. Maybe also merge parties that don't vary too much in their score (i.e. they would always get the same label). @katjameden, would you be up to it?

Yes, I can try :)

TomazErjavec commented 2 years ago

> I had a closer look at SI parties, and the results are somewhat unexpected, e.g. SDS having a max score of 6.40 and min of only 5.13

Oops, looks like I messed something up in importing the CSV, my bad. Yes, the scores make a lot more sense now, thanks!

> Yes, I can try :)

Great!

TomazErjavec commented 1 year ago

We have now more or less completed the documentation and infrastructure for this task:

Accepting comments to any of the above!

What still remains to be done is documenting the new PoliticalOrientation taxonomy, adding translations to its terms, and, of course, actually adding the political orientation once the TSV files are prepared from the corpora (once they are delivered).

starkadur commented 1 year ago

I was going to fetch Orientation-IS.tsv and edit it but then I noticed it was edited 22 hours ago. The file is just like I would have it except I was going to write BT for the Pirate Party (which has syncretic as a political position - maybe BT is not accurate in that case). 'U' in ParlaMint-IS stands for "outside political party" and is used in those cases where an MP does not belong to any party. Should I leave the file as it is?

TomazErjavec commented 1 year ago

> I was going to fetch Orientation-IS.tsv and edit it but then I noticed it was edited 22 hours ago. The file is just like I would have it except I was going to write BT for the Pirate Party (which has syncretic as a political position - maybe BT is not accurate in that case).

@starkadur, @jureskubic is editing the V2 country files - I imported them to GitHub, but did notice a few errors, so he will fix those countries. But I think IS was not one of them, i.e. it is OK as far as we are concerned.

> 'U' in ParlaMint-IS stands for "outside political party" and is used in those cases where a MP does not belong to any party.

Ah. The people that are not members of any party should simply be not affiliated with any party. So, maybe you could delete the "U" party from your listOrg, and remove the affiliation to it from the people that have it.
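If it helps, the clean-up could be scripted along these lines. This is only a sketch on a simplified, namespace-free stand-in for the header: the sample XML, the `#U` reference form and the function name are assumptions, and a real ParlaMint file would need the TEI namespace handled.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free stand-in for a corpus header fragment (invented data).
SAMPLE = """<particDesc>
  <listOrg>
    <org xml:id="party.A" role="politicalParty"/>
    <org xml:id="U" role="politicalParty"/>
  </listOrg>
  <listPerson>
    <person xml:id="p1"><affiliation ref="#U" role="member"/></person>
    <person xml:id="p2"><affiliation ref="#party.A" role="member"/></person>
  </listPerson>
</particDesc>"""

# ElementTree exposes xml:id under the XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def drop_pseudo_party(root, party_id):
    """Remove the org with the given xml:id and every affiliation pointing to it."""
    # Snapshot all elements first, because children are removed while looping.
    for parent in list(root.iter()):
        for child in list(parent):
            is_org = child.tag == "org" and child.get(XML_ID) == party_id
            is_aff = child.tag == "affiliation" and child.get("ref") == "#" + party_id
            if is_org or is_aff:
                parent.remove(child)

root = ET.fromstring(SAMPLE)
drop_pseudo_party(root, "U")
print(ET.tostring(root, encoding="unicode"))
```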

> Should I leave the file as it is?

For now, yes. When the new corpora are submitted, I am sure there will be a few parties more (given the longer time span), and that would be a good time for you to edit it, including BT. Will let you - and everybody else - know when the time comes!

TomazErjavec commented 11 months ago

The party orientation encoding is now fixed and explained in the Guidelines, so I am closing this issue.