Open theaiuel opened 4 years ago
Hi @theaiuel,
the reason your plot looks odd is that the survived variable is a dummy variable: it only takes the values 0 and 1.
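As you can see in the screenshot below, the descriptive statistics for the survived variable, conditioned on class and sex, are not that useful: for a 0/1 variable the quartiles are almost always 0 or 1. The boxplot basically visualizes these measures, so the plot is not that informative. When more than three quarters of a group share the same value, the box collapses into a single line, which is probably why 'second class' seems to be missing.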
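To see this without the screenshot, here is a minimal sketch that prints the per-group quartiles. The `toy` frame is made-up data standing in for the Titanic dataset (not the real figures), and the names `toy` and `stats` are only for illustration.

```python
import pandas as pd

# Hypothetical toy data, NOT the real Titanic numbers.
toy = pd.DataFrame({
    "class": ["First"] * 5 + ["Second"] * 5,
    "sex": ["female"] * 5 + ["male"] * 5,
    "survived": [1, 1, 1, 1, 0, 0, 0, 0, 0, 1],
})

# describe() reports the quartiles that a box plot draws for each group.
stats = toy.groupby(["class", "sex"])["survived"].describe()

# For a 0/1 variable the quartiles pile up on 0 or 1, while the mean
# still tells us the share of survivors in each group.
print(stats[["mean", "25%", "50%", "75%"]])
```

In both toy groups the 25%, 50% and 75% values coincide, so a box plot would draw a flat line for each of them; only the mean column separates the groups.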
The barplot is more useful in this case.
"A bar plot represents an estimate of central tendency for a numeric variable with the height of each rectangle and provides some indication of the uncertainty around that estimate using error bars." https://seaborn.pydata.org/generated/seaborn.barplot.html
Since the mean of a dummy variable is simply the share of ones (here, the share of passengers who survived), it is quite informative, so this is the plot type to use here :)
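A minimal sketch of the bar plot version, keeping the same x, y and hue arguments as the box plot call in the question. It runs on a small made-up frame (`toy`, hypothetical data, not the real Titanic numbers) so that it is self-contained; with the real data you would pass your own titanic DataFrame and axes instead.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data, NOT the real Titanic numbers.
toy = pd.DataFrame({
    "class": ["First"] * 4 + ["Second"] * 4 + ["Third"] * 4,
    "sex": ["female", "female", "male", "male"] * 3,
    "survived": [1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0],
})

# barplot uses the mean as its default estimator, so each bar height is
# the survival share of one class/sex group.
means = toy.groupby(["class", "sex"])["survived"].mean()
print(means)

fig, ax = plt.subplots()
sns.barplot(x="class", y="survived", hue="sex", data=toy, ax=ax)
ax.set_ylabel("share survived")
# Call plt.show() here when running this as a script.
```

Every class/sex group gets a visible bar whose height is the group mean, so no class disappears from the plot the way it can in the box plot.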
When we make a box plot of the probability of survival for men and women within each passenger class, it does not turn out well. When we try, there is no ‘second class’ and the plot is not informative. Our code is:
sns.boxplot(x='class', y='survived', hue='sex', data=titanic, ax=ax[1])