alperyilmaz / dav-exercises

Exercise questions submitted by Data Analysis and Visualization with R course students at YTU
GNU General Public License v3.0

Titanic #22

Open baslamisli opened 6 years ago

baslamisli commented 6 years ago

Question

RMS Titanic was a British passenger liner that sank in the North Atlantic Ocean in the early morning hours of 15 April 1912, after it collided with an iceberg during its maiden voyage from Southampton to New York City. There were an estimated 2,224 passengers and crew aboard the ship, and more than 1,500 died, making it one of the deadliest commercial peacetime maritime disasters in modern history. The RMS Titanic was the largest ship afloat at the time it entered service and was the second of three Olympic-class ocean liners operated by the White Star Line. The Titanic was built by the Harland and Wolff shipyard in Belfast. Thomas Andrews, her architect, died in the disaster. (Source: Wikipedia) The data is a grouped version of the 1912 Titanic passenger survival data.

Accordingly, please use the given data to find the ticket class with the maximum number of surviving passengers. Create a chart like this:

image

Please import the file from: https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/COUNT/titanicgrp.csv

HINT:

Your answer should be like this:
      survive      class
1    .........     ........

Please open a new text file on your Desktop, paste the contents of the data file into it, and save it as titanic.csv on the Desktop.
Please use setwd() to set the working directory to the Desktop before reading the file.
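As an alternative to copying the file by hand, the data can probably be read straight from the URL given in the question. This is a minimal sketch, assuming an internet connection is available; the tryCatch() fallback is my own addition so that the script does not stop when the download fails.

```r
# URL copied from the question text.
url <- "https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/COUNT/titanicgrp.csv"

# read.csv() accepts a URL as well as a local path; this needs an internet
# connection, so fall back to NULL if the download fails.
titanic <- tryCatch(
  read.csv(url),
  error = function(e) NULL
)

# Inspect the column names and types when the download worked.
if (!is.null(titanic)) str(titanic)
```

With this approach no setwd() call is needed, because nothing is read from the local disk.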

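library(ggplot2)
library(readr)
library(dplyr)
# Read the file saved as titanic.csv in the working directory
titanic <- read.csv("titanic.csv")
# Keep survive and class, then show the row with the highest survivor count
titanic %>%
  select(survive, class) %>%
  arrange(desc(survive)) %>%
  top_n(1, survive)
# Bar chart of the number of survivors per ticket class
titanic %>%
  ggplot(aes(x = class, y = survive, fill = class)) +
  geom_col() +
  labs(x = "Ticket class", y = "Number of passengers who survived")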

Additional information

survive: Number of passengers who survived
cases: Number of passengers with same pattern of covariates
age: 1=adult; 0=child
sex: 1=male; 0=female
class: Ticket class
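The 0/1 coding described here can be made explicit by recoding the indicator columns into labelled factors. The sketch below uses a small made-up data frame (the name toy and all of its values are invented for illustration; only the column names and the coding come from the description above).

```r
# Toy rows using the documented column names; the numbers are invented.
toy <- data.frame(
  survive = c(3, 10, 7, 2),
  cases   = c(5, 20, 14, 8),
  age     = c(0, 1, 1, 0),
  sex     = c(0, 1, 0, 1),
  class   = c(1, 2, 3, 3)
)

# Recode the 0/1 indicators into labelled factors, following the coding above.
toy$age   <- factor(toy$age, levels = c(0, 1), labels = c("child", "adult"))
toy$sex   <- factor(toy$sex, levels = c(0, 1), labels = c("female", "male"))

# Ticket class is a category, not a quantity, so store it as a factor too.
toy$class <- factor(toy$class)

str(toy)
```

Treating class as a factor also makes ggplot2 use a discrete axis and discrete fill colours rather than a continuous gradient.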

Originality

Is this question

Difficulty Level

Tags

import, dplyr, ggplot2

alperyilmaz commented 6 years ago
baslamisli commented 6 years ago

I tried to fix it. Is there any problem?

alperyilmaz commented 6 years ago

This can be accepted as a "very easy" question, worth 0.5 or 0.25 points.

In many of your questions, there are two parts: "find the top" and "draw a chart". If you really want the students to do both, then there's no problem, but a single question that forces students to combine select, filter, group_by and summarise would be better.

If you are asking for the highest or lowest value, please try to require some combination of dplyr verbs. A single top_n won't be beneficial to students.
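To illustrate the kind of combination I mean, here is a rough sketch of a question variant: "among adults, which ticket class had the highest survival rate?" The data frame toy and its values are made up for this example and only reuse the column names of the Titanic file; the variant itself is just a suggestion.

```r
library(dplyr)

# Toy grouped data with the same column names as the Titanic file; values are invented.
toy <- data.frame(
  survive = c(3, 12, 7, 2, 6, 4),
  cases   = c(5, 20, 14, 8, 12, 16),
  age     = c(0, 1, 1, 0, 1, 1),
  sex     = c(0, 0, 0, 1, 1, 1),
  class   = c(1, 1, 2, 2, 3, 3)
)

# Adults only: total survivors and passengers per class, then the survival
# rate, keeping the class with the highest rate.
best_class <- toy %>%
  filter(age == 1) %>%
  group_by(class) %>%
  summarise(survived = sum(survive), passengers = sum(cases)) %>%
  mutate(rate = survived / passengers) %>%
  arrange(desc(rate)) %>%
  slice(1)

best_class
```

A question like this cannot be answered with a single top_n, because the rows must first be filtered and aggregated per class.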