filipvanlaenen / eopaod

European Opinion Polls as Open Data
GNU General Public License v3.0

Issues with Bulgarian polls #1077

Closed roshavagarga closed 3 years ago

roshavagarga commented 3 years ago

Based on the csv file here:

  1. Missing the latest Barometer poll.
  2. Some of the lines are out of order and you have the data from an Alpha Research poll from September 2020 twice.
  3. Some of the sample sizes are wrong. One example is the Market Links poll from 11-19 June 2019 - the actual number of pollees for the 'Who would you vote for' question is 429, as can be seen at the bottom of this presentation. Other rows probably have the same issue.
  4. You're missing instances of double polls for the same date. Example here, where you can see that some polls offered results from 'voters only' and from larger groups too.
  5. This is a general question - are you rating trustworthiness when you aggregate? Specter, CAM and Barometer lack any online presence and are often quoted by yellow journalism hubs. The latter especially seems to offer wildly inaccurate poll results compared to the others. You can check their previous polls, compare them to the election results that followed, and see the pattern quite easily.
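
To illustrate why the sample size in point 3 matters, here is a minimal sketch of the usual margin-of-error approximation for a poll share. It assumes simple random sampling and a 95% confidence level; the function name `margin_of_error` and the comparison figure of 1000 respondents are made up for illustration, and only the 429 comes from the presentation mentioned above.

```python
import math


def margin_of_error(sample_size: int, share: float = 0.5, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion (simple random sampling assumed)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(share * (1.0 - share) / sample_size)


# 429 is the respondent count from the issue; 1000 is a hypothetical comparison value.
for n in (429, 1000):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} percentage points")
```

The smaller the group that actually answered the voting question, the wider the uncertainty, so recording the full sample instead of that group overstates how precise the poll is.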

filipvanlaenen commented 3 years ago

  1. Added the Barometer poll for the end of November. Thanks for notifying us!

filipvanlaenen commented 3 years ago

  1. Corrected, thanks for notifying us!

roshavagarga commented 3 years ago

@filipvanlaenen Thanks for the quick reply, a few more things to note:

  1. The current numbers for elected parliament members here are off. The latest data is as follows:

     * GERB - 95
     * BSP - 70 (= Socialist party, which campaigned as Coalition for Bulgaria and should be written as such)
     * DPS - 25 (= Rights and Freedoms)
     * OP - 21 (12 - VMRO / 9 - NFSB)
     * Volya - 12
     * Independent - 17

     I used the abbreviations as they are directly transliterated from Bulgarian, which should be the typical way to do so, instead of translating the names themselves. GERB and OP are the ruling majority, with Volya as on-and-off support, while the rest are in opposition.

I'm not sure what that page means by 'Bulgarian National Movement' - I'm guessing that should be OP, which denotes United Patriots, the coalition which still holds 21 seats. One of the coalition members, Ataka (Attack), left that parliamentary group and is now part of the Independents. The rules are such that unless a certain number of Independents form their own group under a new name, they will continue to be classified as 'Independent', even if the members were elected from a certain party or coalition member's ballot.

  1. You've noted Изправи се.БГ as IS.B, which is slightly wrong. I'd say IS.BG is better, since that's the transliteration of the typical shortening for that coalition.

  2. Some polls feature results for a theoretical GERB & SDS coalition, namely the CAM one from 1-5 Aug 2020 and two Sova Harris polls from 19-25 Aug 2020 and 27 Oct-3 Nov 2020. Not sure how you'd account for that, but GERB and SDS were indeed in a coalition for the previous mayoral elections and might form one again at the upcoming 2021 election.

  3. If you have a native speaker, there's more data in the tables in this Wikipedia page. If not, the two buttons denote 'original data' and 'recalculated data', the default being the former. The recalculated data takes answers such as 'I am unsure' and 'I will not vote' out of the equation, but not 'I do not support any of the above', which is a valid option on the ballot. I'd also suggest taking a look at the code for that segment and using the appropriate color hex codes for the parties - they are derived from official sources and should be the best representation for said parties/coalitions.

  4. I'm guesstimating that the DP mentioned in the graphs here should be DB - Democratic Bulgaria. That's a coalition between 3 parties, easy to find out more through Wikipedia.

  5. There might be results for other parties you are missing - Republicans for Bulgaria and Bulgarian Summer, which can be shortened as RB/RzB and BL. Both parties have polled for under a percent so far.

  6. I'm not sure how you decide which parties should have a dotted line. If it is based on representation - Democratic Bulgaria have 1 representative in the European Parliament, while ITN have none, so those two should be flipped as far as which one gets to be the dotted line. OP should also not be a dotted line under said criteria.

  7. This one's just some explanations - as I said earlier, the United Patriots (OP) used to be a coalition of VMRO, NFSB and Ataka. Ataka left it around mid-2019, and polls have continued to use OP, OP (VMRO+NFSB) or, rarely, listed both parties separately. Out of the three, only VMRO seems to have the pull to reach the minimum 4% mark to get into parliament, so depending on future coalition plans, you might have to rename it or change the graphics accordingly.

  8. Bonus note - European Affiliation might vary when you're dealing with coalitions. A good example is Democratic Bulgaria - they're composed of 3 parties, but their only representative in the EP is from one of said parties. The other 2 parties might have aligned differently, had one of their members been elected. Might be good to add that as a note somewhere?
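
For point 3 above, the 'recalculated data' step can be sketched roughly as follows. This is only my reading of what the Wikipedia tables do, not their actual code: the answer labels, the function name `recalculate` and the numbers are all hypothetical.

```python
# Answers dropped before recalculating; 'I do not support any of the above'
# stays in because it is a valid option on the ballot.
EXCLUDED_ANSWERS = {"I am unsure", "I will not vote"}


def recalculate(raw_shares: dict[str, float]) -> dict[str, float]:
    """Drop the excluded answers and rescale the remaining shares to sum to 100."""
    kept = {answer: share for answer, share in raw_shares.items()
            if answer not in EXCLUDED_ANSWERS}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("no remaining answers to rescale")
    return {answer: 100.0 * share / total for answer, share in kept.items()}


# Hypothetical original data, in percent of all respondents.
original = {
    "Party A": 20.0,
    "Party B": 15.0,
    "I do not support any of the above": 5.0,
    "I am unsure": 10.0,
    "I will not vote": 50.0,
}
recalculated = recalculate(original)
for answer, share in recalculated.items():
    print(f"{answer}: {share:.1f}")
```

With these made-up numbers only 40% of respondents remain after the exclusion, so every remaining share is scaled up by the same factor.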

roshavagarga commented 3 years ago

@filipvanlaenen I'm still waiting on an official answer. If you want some food for thought, even a quick machine translation of this RFE article will give you a basic idea of how weird the Barometer pollster is. According to said article, that company has 1 representative, who is the only person on the payroll, which makes it dubious at the very least whether they could operate as a pollster at all. I really hope I'll get an official response on this, because certain yellow media outlets in Bulgaria are making use of your poll aggregation to boost their favored parties' numbers. I think that you should definitely give more weight to the more trustworthy pollsters' results, especially since some of the others have been way off in previous years - something you can easily calculate from their poll results and the official voting results.

roshavagarga commented 3 years ago

@saimontato Can I get some information on #1096? If they were removed for a lack of online presence and/or trustworthiness, the same should be done for the Barometer ones, but while I personally dislike both of these pollsters, as far as I'm aware they aren't doing anything illegal and they're conforming to the legal requirements - and trust me, it pains me to say that.

At most you should just weight their results based on how close they were to the official election results and keep them there, something I asked for and @filipvanlaenen decided to ignore, among other things in this issue thread.
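
To make the suggestion concrete, here is a rough sketch of one way such a weighting could work. It is not how this project aggregates polls. Using the inverse of the mean absolute error as a weight is just one possible scheme I'm assuming for illustration, and the pollster names, party names and all figures are invented.

```python
def mean_absolute_error(poll: dict[str, float], result: dict[str, float]) -> float:
    """Average absolute gap, in percentage points, between a poll and the official result."""
    return sum(abs(poll.get(party, 0.0) - share) for party, share in result.items()) / len(result)


def accuracy_weights(errors: dict[str, float]) -> dict[str, float]:
    """Turn past errors into weights that sum to 1; a smaller error gives a larger weight."""
    inverse = {pollster: 1.0 / max(error, 1e-9) for pollster, error in errors.items()}
    total = sum(inverse.values())
    return {pollster: value / total for pollster, value in inverse.items()}


def weighted_average(polls: dict[str, dict[str, float]], weights: dict[str, float]) -> dict[str, float]:
    """Combine the current polls party by party using the pollster weights."""
    parties = {party for poll in polls.values() for party in poll}
    return {party: sum(weights[name] * poll.get(party, 0.0) for name, poll in polls.items())
            for party in sorted(parties)}


# All numbers below are hypothetical.
official_result = {"Party A": 30.0, "Party B": 20.0}
last_polls_before_election = {
    "Pollster X": {"Party A": 31.0, "Party B": 19.0},
    "Pollster Y": {"Party A": 34.0, "Party B": 16.0},
}
errors = {name: mean_absolute_error(poll, official_result)
          for name, poll in last_polls_before_election.items()}
weights = accuracy_weights(errors)

current_polls = {
    "Pollster X": {"Party A": 28.0, "Party B": 22.0},
    "Pollster Y": {"Party A": 38.0, "Party B": 12.0},
}
print(weights)
print(weighted_average(current_polls, weights))
```

This keeps every pollster in the data set, as argued above, and only reduces the influence of those that missed the official results by a wide margin.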