UBC-MDS / data-analysis-review-2022

Submission: Group 21: Eurovision Rank Analysis #8

Open DanielCairns opened 1 year ago

DanielCairns commented 1 year ago

Submitting authors: @DanielCairns, @mrnabiz, @THF-d8, @Hawknum

Repository: https://github.com/UBC-MDS/eurovision_contest_rank_analysis

Report link: https://htmlpreview.github.io/?https://github.com/UBC-MDS/eurovision_contest_rank_analysis/blob/main/doc/report.html

Abstract/executive summary:

Eurovision is an annual song contest held in Europe in which each participating country is represented by a contestant performing a song of their choice. The country that receives the most votes in the final, and is therefore ranked highest, is declared the winner.

In this project, we explore whether there is an association between the running order and the rank of a contestant in Eurovision. Does the country that performs last rank higher than the country that performs first? We are interested in this question because performance order can have a large effect on the outcome of a competition. For instance, Glejser and Heyndels (2001) showed that contestants who perform later tend to receive a higher rank from the jury in the Queen Elisabeth Contest. This question matters because it bears on voting bias and the fairness of competitions.

Editor: @flor14

Reviewers: Lisha Guo, Eyre Hong, Sarah Abdelazim, Nikita Susan Easow

eyrexh commented 1 year ago

Data analysis review checklist

Reviewer: @eyrexh

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing: 1.5 hours

Review Comments:

Highlight points:

  1. An interesting research question and a great background introduction.
  2. The visualizations in the EDA are nice and easy to follow.
  3. Considering the different contest situations and running six t-tests is thoughtful.
  4. Well-written results and good storytelling in the final report.

Need to improve:

  1. Move the YAML file installation instructions before the usage section so that readers can follow the steps in order.
  2. The usage examples do not work from the root directory; they only work from the src folder.
  3. Move the EDA file to the doc folder rather than keeping it in the src folder.
  4. Although using six t-tests is thoughtful, I suggest clearly highlighting, in both the EDA and the final report, how the data are divided into two groups ("first vs. rest" and "last vs. rest") at each of the three stages. Bullet points, for example, would help.
  5. Adding more plots to the final report, such as the t-test distribution for the mean rank with the p-value and thresholds marked, would make it more persuasive.
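
The grouping described in point 4 could be spelled out roughly as follows. This is a minimal sketch on made-up rows: the field names `year`, `stage`, `running_order`, and `rank`, and the helper `split_first_last`, are assumptions for illustration and not the project's actual schema or code. The three stages are taken to be the overall, semi-final, and final data mentioned later in this thread.

```python
from collections import defaultdict

# Hypothetical rows; the field names are assumptions, not the project's schema.
# Real data with several semi-finals per year would also need a show identifier.
rows = [
    {"year": 2019, "stage": "semi-final", "running_order": 1, "rank": 5},
    {"year": 2019, "stage": "semi-final", "running_order": 2, "rank": 3},
    {"year": 2019, "stage": "semi-final", "running_order": 3, "rank": 1},
    {"year": 2019, "stage": "final", "running_order": 1, "rank": 4},
    {"year": 2019, "stage": "final", "running_order": 2, "rank": 2},
    {"year": 2019, "stage": "final", "running_order": 3, "rank": 1},
]


def split_first_last(rows):
    """Return {"first": (group, rest), "last": (group, rest)} as lists of ranks."""
    # The last slot differs per show, so find the maximum running order per show.
    last_slot = defaultdict(int)
    for r in rows:
        key = (r["year"], r["stage"])
        last_slot[key] = max(last_slot[key], r["running_order"])

    first, not_first, last, not_last = [], [], [], []
    for r in rows:
        is_first = r["running_order"] == 1
        is_last = r["running_order"] == last_slot[(r["year"], r["stage"])]
        (first if is_first else not_first).append(r["rank"])
        (last if is_last else not_last).append(r["rank"])
    return {"first": (first, not_first), "last": (last, not_last)}


# Three subsets of the data times two comparisons gives the six tests.
subsets = {
    "overall": rows,
    "semi-final": [r for r in rows if r["stage"] == "semi-final"],
    "final": [r for r in rows if r["stage"] == "final"],
}
comparisons = {
    (name, which): groups
    for name, subset in subsets.items()
    for which, groups in split_first_last(subset).items()
}

for (name, which), (group, rest) in comparisons.items():
    print(f"{name}: {which} vs. rest -> n={len(group)} vs. n={len(rest)}")
```

Listing the six (subset, comparison) pairs explicitly like this is one way to make the structure of the tests visible to a reader before any results are shown.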

Overall, it is a great project, but the folder organization and the usage section of the README need improvement.
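
Besides fixing the README, one possible way to make scripts insensitive to the working directory is to resolve paths against the repository root. The sketch below assumes the scripts sit in a `src` folder one level below the root; the helper names and the file names `src/preprocess.py` and `data/processed/example.csv` are hypothetical and not taken from the repository.

```python
from pathlib import Path


def project_root(script_path, levels_up=1):
    """Return the directory `levels_up` levels above the script's own folder."""
    return Path(script_path).resolve().parents[levels_up]


def resolve_from_root(script_path, relative_path):
    """Build an absolute path from a path written relative to the repository root."""
    return project_root(script_path) / relative_path


if __name__ == "__main__":
    # Inside a real script, __file__ would be passed instead of this hypothetical path.
    example_script = "src/preprocess.py"
    print(resolve_from_root(example_script, "data/processed/example.csv"))
```

With this approach, a command run from the root and the same command run from inside `src` would read and write the same files.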

Attribution

This was derived from the JOSE review checklist and the ROpenSci review checklist.

nik11susan commented 1 year ago

Data analysis review checklist

Reviewer: @nik11susan

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing: 1 hour

Review Comments:

  1. This project has a very simple and interesting question - one that I was certainly interested to know the answer to!
  2. The preprocess file could be updated to include the author names and date.
  3. Choosing R for the inference and Python for the rest of the scripts was a good decision that plays to the strengths of each programming language.
  4. I agree with @eyrexh regarding the usage file: the paths to the scripts should be relative to the main directory.
  5. Since you had an intuition that rank differs depending on whether the running order is first or last, bringing that up in the EDA could add a lot of value. Consider adding a plot for each of the following relations: first_to_perform vs. rank and last_to_perform vs. rank. I realize that these are categorical columns, so two boxplots (0 and 1) for each relation, showing the mean rank, would do the trick.
  6. Since the question is very precise, its scope is somewhat limited. You could branch out and look at other factors that could affect the ranking. I would really like to know whether is_host_country plays any role in the ranking!
  7. The report could include the assumptions and limitations of the t-test.
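
To make point 7 concrete, here is a minimal sketch of a two-sample t statistic. It uses Welch's unequal-variance variant, which is an assumption: this thread does not say which form of the t-test the project ran (the project's inference is in R). The ranks are made up, and no p-value is computed because the Python standard library has no t-distribution CDF.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance uses n - 1).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a**2 / (n_a - 1) + se2_b**2 / (n_b - 1))
    return t, df


# Made-up ranks, purely illustrative; not results from the project.
ranks_first = [12, 18, 9, 20, 15]
ranks_rest = [10, 7, 14, 3, 11, 8, 16, 5]

t_stat, df = welch_t(ranks_first, ranks_rest)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

The formula makes the assumptions easier to discuss: it treats observations as independent and relies on the sample means being approximately normal. Ranks are bounded and discrete, and ranks within one show depend on each other, so these are natural limitations for the report to acknowledge.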

Attribution

This was derived from the JOSE review checklist and the ROpenSci review checklist.

missarah96 commented 1 year ago

Data analysis review checklist

Reviewer: @missarah96

Conflict of interest

Code of Conduct

General checks

Documentation

Code quality

Reproducibility

Analysis report

Estimated hours spent reviewing: 2 hours

Review Comments:

Attribution

This was derived from the JOSE review checklist and the ROpenSci review checklist.

DanielCairns commented 1 year ago

5 pieces of feedback to address

  1. (Milestone 1) Team work contract was pushed to GitHub
  2. (Milestone 2) Proposal.md is missing
  3. (Milestone 3) [in Makefile] Target all could be simplified by writing something like all: doc/report.html
  4. (@eyrexh) The usage examples do not work in the root directory. They can only work in the src folder.
  5. (@missarah96) Motivation for why the analysis was done on the overall data, semi-final data, and final data is missing. If highlighting differences between semi-final and final data is important, why also do the analysis on the overall data?

RenzoWijn commented 1 year ago

Addressed feedback and corresponding commits:

  1. Removed team work contract: https://github.com/UBC-MDS/eurovision_contest_rank_analysis/commit/5d9ef61b7438de5fbbfcdeae6df49b7adcab7057#diff-7ce61eeb0bca7bb207668a43a634db11739613fb164f06b3db55172a29adeba7
  2. Added proposal.md: https://github.com/UBC-MDS/eurovision_contest_rank_analysis/commit/9b12d633b9fd88a90ea70212c53b43e6ad5d5077
  3. Target all simplified: doc/report.html: https://github.com/UBC-MDS/eurovision_contest_rank_analysis/commit/db972e52bf52278a7de192ce486a96dabfd44c2f
  4. Fixed usage section of readme: https://github.com/UBC-MDS/eurovision_contest_rank_analysis/commit/20eed0d4a8ccb9f7c9e1ca9e269c81d6dedb4ce0
  5. Added motivation to report: https://github.com/UBC-MDS/eurovision_contest_rank_analysis/commit/19613c332acd29054db3c11f1eb339515047203c