oldoc63 / learningDS

Learning DS with Codecademy and Books

Different Plot Types #485

Open oldoc63 opened 1 year ago

oldoc63 commented 1 year ago

In this lesson, you'll learn how to create and when to use different types of plots.

oldoc63 commented 1 year ago

Simple Bar Chart

The plt.bar function allows you to create simple bar charts to compare multiple categories of data. Some possible data that would be displayed with a bar chart:

You call plt.bar with two arguments: the x-values (a list of x-positions, one per bar) and the y-values (a list of bar heights).

In most cases, we will want our x-values to be a list that looks like [0, 1, 2, 3, ...] and has the same number of elements as our y-values list. We can create that list manually, but we can also generate it with range(len(heights)), where heights is our list of y-values.

oldoc63 commented 1 year ago

The range function creates a sequence of consecutive integers (i.e., 0, 1, 2, 3, ...). It needs an argument to tell it how many numbers the sequence should contain. For instance, range(5) produces the 5 numbers 0 through 4. We want our x-values to have as many elements as our list of bar heights. In our example, len(heights) tells us how many elements are in the list heights.
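As a minimal sketch of the idea above (the values in heights are hypothetical and only serve as an illustration):

```python
# Hypothetical bar heights, used only for illustration
heights = [88, 225, 365, 687, 4333]

# range(len(heights)) yields one integer per bar, starting at 0
x_values = range(len(heights))

# In Python 3, range is lazy, so wrap it in list() to see the values
print(list(x_values))  # [0, 1, 2, 3, 4]
```

plt.bar accepts the range object directly, so converting it to a list is only needed for printing.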

Here is an example of how to make a bar chart using plt.bar to compare the number of days in a year on the different planets:
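The original code for this example is not reproduced in the thread. A minimal sketch of what it could look like, assuming approximate orbital periods in Earth days (the exact numbers are an assumption and are rounded):

```python
from matplotlib import pyplot as plt

# Approximate length of a year on each planet, in Earth days
# (rounded values, assumed for illustration), from Mercury to Pluto
days_in_year = [88, 225, 365, 687, 4333, 10759, 30687, 60190, 90560]

# One x-position per bar: 0, 1, ..., 8
x_values = range(len(days_in_year))

bars = plt.bar(x_values, days_in_year)
plt.show()
```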

oldoc63 commented 1 year ago
  1. We are going to help the cafe MatplotSip analyze some of the sales data they have been collecting. We have included a list of drink categories and a list of numbers representing the sales of each drink over the past month. Use plt.bar to plot the number of drinks sold on the y-axis. The x-values of the graph should just be the list [0, 1, ..., n-1], where n is the number of categories (drinks) we are plotting. So at x=0, we'll have the number of cappuccinos sold.
  2. Show the plot and examine it. At this point, we can't tell which bar corresponds to which drink, so this chart is not very helpful.
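
One possible solution to the two steps above is sketched here. The exercise's data is not included in the thread, so the contents of drinks and sales are hypothetical placeholders; only the fact that the first drink is a cappuccino comes from the exercise text.

```python
from matplotlib import pyplot as plt

# Hypothetical data; the real lists are provided by the exercise
drinks = ["cappuccino", "latte", "chai", "americano", "mocha", "espresso"]
sales = [91, 76, 56, 66, 52, 27]

# Step 1: x-values 0..n-1, one per drink, with sales as the bar heights
bars = plt.bar(range(len(drinks)), sales)

# Step 2: show the plot (the bars are not labeled yet)
plt.show()
```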
oldoc63 commented 1 year ago

When we create a bar chart, we want each bar to be meaningful and correspond to a category of data. In the drinks chart from the last exercise, we could see that sales were different for different drink items, but this wasn't very helpful to us, since we didn't know which bar corresponded to which drink. We can customize the tick marks on the x-axis in three steps, with an optional fourth step for long labels:

  1. Create an axes object: ax = plt.subplot()
  2. Set the x-tick positions using a list of numbers: ax.set_xticks([0, 1, 2, 3, 4, 5, 6, 7, 8])
  3. Set the x-tick labels using a list of strings: ax.set_xticklabels(['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune', 'Pluto'])
  4. Optionally, if your labels are particularly long, use the rotation keyword to rotate them by a specified number of degrees: ax.set_xticklabels(['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune', 'Pluto'], rotation=30)
oldoc63 commented 1 year ago

We have to set the x-ticks before we set the x-tick labels because the default ticks won't necessarily be one tick per bar, especially if we are plotting a lot of bars. If we skip setting the x-ticks before the x-tick labels, we may end up with labels in the wrong place. Remember that we can label the x-axis (plt.xlabel) and the y-axis (plt.ylabel) as well, making our graph much easier to understand.
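
Putting the tick steps and the axis labels together, a minimal sketch might look like this. The orbital periods are approximate values assumed for illustration, and the axis label strings are also assumptions.

```python
from matplotlib import pyplot as plt

planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter',
           'Saturn', 'Uranus', 'Neptune', 'Pluto']
# Approximate year lengths in Earth days (assumed, rounded values)
days_in_year = [88, 225, 365, 687, 4333, 10759, 30687, 60190, 90560]

# Create the axes first so the bars are drawn on it
ax = plt.subplot()
plt.bar(range(len(planets)), days_in_year)

# Tick positions first (one per bar), then the labels for those positions
ax.set_xticks(range(len(planets)))
ax.set_xticklabels(planets, rotation=30)

# Axis labels (assumed wording)
plt.xlabel("Planet")
plt.ylabel("Days in a year")
plt.show()
```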

oldoc63 commented 1 year ago
  1. The list drinks represents the drinks sold at MatplotSip. We are going to set x-tick labels on the chart you made with plt.bar in the last exercise. First, create the axes object for the plot and store it in a variable called ax.
  2. Set the x-ticks to be the numbers from 0 up to (but not including) the length of drinks.
  3. Use the strings in the drinks list as the x-tick labels of the plot you made with plt.bar.
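
A possible solution to these three steps, again assuming hypothetical contents for drinks and sales since the exercise data is not shown in the thread:

```python
from matplotlib import pyplot as plt

# Hypothetical data; the real lists are provided by the exercise
drinks = ["cappuccino", "latte", "chai", "americano", "mocha", "espresso"]
sales = [91, 76, 56, 66, 52, 27]

# Step 1: create the axes object
ax = plt.subplot()
plt.bar(range(len(drinks)), sales)

# Step 2: one tick per bar, at positions 0 .. len(drinks) - 1
ax.set_xticks(range(len(drinks)))

# Step 3: label each tick with the matching drink name
ax.set_xticklabels(drinks)

plt.show()
```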
oldoc63 commented 1 year ago

Side-By-Side Bars

We can use a bar chart to compare two sets of data with the same types of axis values. To do this, we plot two sets of bars next to each other, so that the values of each category can be compared. For example, here is a chart with side-by-side bars for the populations of the United States and China over the age of 65 (in percentages):

oldoc63 commented 1 year ago

Some examples of data that side-by-side bars could be useful for include:

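In the graph above, there are 7 sets of bars, with 2 bars in each set. Each bar has a width of 0.8 (the default width for all bars in Matplotlib). To keep the bars from overlapping, each dataset needs its own x-values: every bar in the second dataset must sit one bar width (0.8) to the right of its partner in the first dataset, and consecutive sets must be spaced far enough apart (2 units here) to fit both bars.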

This is a lot of math, but we can make Python do it for us by copying and pasting this code:

# China Data (blue bars)
n = 1  # This is our first dataset (out of 2)
t = 2 # Number of datasets
d = 7 # Number of sets of bars
w = 0.8 # Width of each bar
x_values1 = [t*element + w*n for element
             in range(d)]

That just generated the first set of x-values. To generate the second set, paste the code again, but change n to 2, because this is the second dataset:

# US Data (orange bars)
n = 2  # This is our second dataset (out of 2)
t = 2 # Number of datasets
d = 7 # Number of sets of bars
w = 0.8 # Width of each bar
x_values2 = [t*element + w*n for element
             in range(d)]

Let's examine our special code:

[t*element + w*n for element in range(d)]

This is called a list comprehension. It's a special way of generating a list from a formula. For making side-by-side bar graphs, you'll never need to change this line; just paste it into your code and make sure to define n, t, d, and w correctly.
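
To see what the formula produces, here is a small worked sketch using the same values of t, d, and w as above. The values are rounded only for display, since floating-point sums can carry tiny errors.

```python
t = 2    # Number of datasets
d = 7    # Number of sets of bars
w = 0.8  # Width of each bar

# n = 1 for the first dataset, n = 2 for the second
x_values1 = [t*element + w*1 for element in range(d)]
x_values2 = [t*element + w*2 for element in range(d)]

print([round(x, 1) for x in x_values1])  # [0.8, 2.8, 4.8, 6.8, 8.8, 10.8, 12.8]
print([round(x, 1) for x in x_values2])  # [1.6, 3.6, 5.6, 7.6, 9.6, 11.6, 13.6]
```

Each bar in the second dataset sits exactly one bar width (0.8) to the right of its partner, and consecutive sets are t = 2 units apart, which leaves a gap of 2 - 2*0.8 = 0.4 between sets.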

oldoc63 commented 1 year ago
  1. The second location of MatplotSip recently opened up, and the owners want to compare the drink choices of the clientele at the two different locations. To do this, it will be helpful to have the sales of each drink plotted on the same axes. We have provided sales2, a list of values representing the sales of the same drinks at the second MatplotSip location.
oldoc63 commented 1 year ago
  1. Use the list comprehension to generate the two sets of x-values: store1_x (n = 1) and store2_x (n = 2).
oldoc63 commented 1 year ago
  1. Use plt.bar to position the bars corresponding to sales1 on the plot. The x-values for plt.bar should be the store1_x list that you just created.
oldoc63 commented 1 year ago
  1. Use plt.bar to position the bars corresponding to sales2 on the plot. The x-values for plt.bar should be the store2_x list that you just created.
oldoc63 commented 1 year ago
  1. Show the plot using plt.show().
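
The side-by-side exercise steps can be sketched as follows. The names sales1, sales2, store1_x, and store2_x come from the exercise; the numbers in the lists are hypothetical placeholders.

```python
from matplotlib import pyplot as plt

# Hypothetical data; the real lists are provided by the exercise
drinks = ["cappuccino", "latte", "chai", "americano", "mocha", "espresso"]
sales1 = [91, 76, 56, 66, 52, 27]
sales2 = [65, 82, 36, 68, 38, 40]

t = 2            # Number of datasets
d = len(drinks)  # Number of sets of bars (one per drink)
w = 0.8          # Width of each bar

# x-values for the first (n = 1) and second (n = 2) location
store1_x = [t*element + w*1 for element in range(d)]
store2_x = [t*element + w*2 for element in range(d)]

bars1 = plt.bar(store1_x, sales1)
bars2 = plt.bar(store2_x, sales2)
plt.show()
```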
oldoc63 commented 1 year ago

Stacked Bars

If we want to compare two sets of data while preserving knowledge of the total between them, we can also stack the bars instead of putting them side by side. For instance, if someone were plotting the hours they spent entertaining themselves with video games and books in the past week, and also wanted to get a feel for the total hours spent on entertainment, they could create a stacked bar chart:

oldoc63 commented 1 year ago

We do this by using the keyword bottom. The top set of bars will have bottom set to the heights of the other set of bars. So the first set of bars is plotted normally and the second set of bars has bottom specified:
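
The original code is not reproduced in the thread. A minimal sketch, using the variable names video_game_hours and book_hours from the surrounding text and hypothetical hour counts (only Monday's 1 hour of video games is stated in the lesson):

```python
from matplotlib import pyplot as plt

# Hours per day, Monday first; values are hypothetical except Monday's
# video game hour, which the lesson text mentions
video_game_hours = [1, 2, 2, 1, 2]
book_hours = [2, 3, 4, 2, 1]

days = range(len(video_game_hours))

# First set of bars is plotted normally, starting at 0
bottom_bars = plt.bar(days, video_game_hours)

# Second set starts where the first set ends
top_bars = plt.bar(days, book_hours, bottom=video_game_hours)

plt.show()
```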

oldoc63 commented 1 year ago

This starts the book_hours bars at the heights of the video_game_hours bars. So, for example, on Monday the orange bar representing hours spent reading will start at a value of 1 instead of 0, because 1 hour was spent playing video games.

oldoc63 commented 1 year ago
  1. You just made a chart with two sets of sales data plotted side by side. Let's instead make a stacked bar chart by using the keyword bottom. Put the sales1 bars on the bottom and set the sales2 bars to start where the sales1 bars end.
oldoc63 commented 1 year ago
  1. We should add a legend to make sure we know which set of bars corresponds to which location. Label the bottom set of bars as "Location 1" and the top set of bars as "Location 2" and add a legend to the chart.
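
A possible solution to the stacked-bar exercise, including the legend. The sales figures are hypothetical placeholders for the data provided by the exercise.

```python
from matplotlib import pyplot as plt

# Hypothetical data; the real lists are provided by the exercise
drinks = ["cappuccino", "latte", "chai", "americano", "mocha", "espresso"]
sales1 = [91, 76, 56, 66, 52, 27]
sales2 = [65, 82, 36, 68, 38, 40]

x_values = range(len(drinks))

# sales1 on the bottom; sales2 stacked on top via the bottom keyword
plt.bar(x_values, sales1, label="Location 1")
plt.bar(x_values, sales2, bottom=sales1, label="Location 2")

# The legend picks up the label given to each plt.bar call
legend = plt.legend()
plt.show()
```

Passing label to each plt.bar call is one way to name the legend entries; plt.legend also accepts a list of labels directly.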