Open oldoc63 opened 1 year ago
The plt.bar function allows you to create simple bar charts to compare multiple categories of data. Some possible data that would be displayed with a bar chart:
You call plt.bar with two arguments: the x-values (the position of each bar) and the y-values (the height of each bar).
In most cases, we will want our x-values to be a list that looks like [0, 1, 2, 3, ...] and has the same number of elements as our y-values list. We can create that list manually, but we can also generate it with range(len(heights)), where heights is our list of bar heights.
The range function creates a sequence of consecutive integers (i.e., 0, 1, 2, 3, ...). In Python 3 this is a range object rather than a list, but plt.bar accepts it in the same way. It needs an argument to tell it how many numbers should be in the sequence. For instance, range(5) produces 5 numbers: 0, 1, 2, 3, 4. We want our sequence to be as long as the list of bar heights in our example, and len(heights) tells us how many elements are in the list heights.
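As a quick illustration of the paragraph above, here is a minimal sketch. The values in heights are placeholders chosen for this example, not data from the lesson.

```python
# Hypothetical bar heights, used only to illustrate range(len(...)).
heights = [88, 225, 365, 687]

# One x position per bar, starting at 0.
x_values = range(len(heights))

# Convert to a list to inspect the positions.
print(list(x_values))  # → [0, 1, 2, 3]
```

Wrapping the range in list() is only needed for printing; the range object itself can be passed directly to plt.bar.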
Here is an example of how to make a bar chart using plt.bar to compare the number of days in a year on the different planets:
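A minimal sketch of such a chart might look like the following. The year lengths are approximate values in Earth days and are included here as illustrative data; the variable names are assumptions of this sketch.

```python
import matplotlib.pyplot as plt

# Approximate length of each planet's year in Earth days, Mercury through Pluto
# (illustrative values, rounded).
days_in_year = [88, 225, 365, 687, 4333, 10759, 30687, 60190, 90553]

# One x position per bar: 0, 1, ..., 8.
x_values = range(len(days_in_year))

# plt.bar takes the x positions first and the bar heights second.
bars = plt.bar(x_values, days_in_year)

# plt.show()  # uncomment to display the chart
```

At this point the bars have no names attached, so the x-axis only shows numeric positions.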
When we create a bar chart, we want each bar to be meaningful and correspond to a category of data. In the drinks chart from the last exercise, we could see that sales were different for different drink items, but this wasn't very helpful to us, since we didn't know which bar corresponded to which drink. We can customize the tick marks on the x-axis in three steps: create an axes object, set the x-tick positions, and set the x-tick labels (optionally rotating them):
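# Step 1: create an axes object
ax = plt.subplot()
# Step 2: set the x-tick positions, one per bar
ax.set_xticks([0, 1, 2, 3, 4, 5, 6, 7, 8])
# Step 3: set the x-tick labels
ax.set_xticklabels(['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune', 'Pluto'])
# Optionally, rotate the labels by 30 degrees so they do not overlap
ax.set_xticklabels(['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune', 'Pluto'], rotation=30)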
We have to set the x-ticks before we set the x-labels because the default ticks won't necessarily be one tick per bar, especially if we are plotting a lot of bars. If we skip setting the x-ticks before the x-labels, we may end up with labels in the wrong place. Remember that we can label the x-axis (plt.xlabel) and the y-axis (plt.ylabel) as well, making our graph much easier to understand.
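Putting the three steps and the axis labels together, a complete chart could be sketched as follows. The year lengths are approximate illustrative values, and the axis label texts are assumptions of this sketch.

```python
import matplotlib.pyplot as plt

planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter',
           'Saturn', 'Uranus', 'Neptune', 'Pluto']
# Approximate year lengths in Earth days (illustrative values).
days_in_year = [88, 225, 365, 687, 4333, 10759, 30687, 60190, 90553]

# Step 1: create the axes object, then draw the bars on it.
ax = plt.subplot()
plt.bar(range(len(days_in_year)), days_in_year)

# Step 2: one tick per bar, set before the labels.
ax.set_xticks(range(len(planets)))

# Step 3: label each tick, rotated so the names do not overlap.
ax.set_xticklabels(planets, rotation=30)

# Axis labels (wording chosen for this example).
plt.xlabel('Planet')
plt.ylabel('Days in a year (Earth days)')

# plt.show()  # uncomment to display the chart
```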
We can use a bar chart to compare two sets of data with the same types of axis values. To do this, we plot two sets of bars next to each other, so that the values of each category can be compared. For example, here is a chart with side-by-side bars for the populations of the United States and China over the age of 65 (in percentages):
Some examples of data for which side-by-side bars could be useful include:
In the graph above, there are 7 sets of bars, with 2 bars in each set. Each bar has a width of 0.8 (the default width for all bars in Matplotlib).
This is a lot of math, but we can make Python do it for us by copying and pasting this code:
# China Data (blue bars)
n = 1 # This is our first dataset (out of 2)
t = 2 # Number of datasets
d = 7 # Number of sets of bars
w = 0.8 # Width of each bar
x_values1 = [t*element + w*n for element in range(d)]
That just generated the first set of x-values. To generate the second set, paste the code again, but change n to 2, because this is the second dataset:
# US Data (orange bars)
n = 2 # This is our second dataset (out of 2)
t = 2 # Number of datasets
d = 7 # Number of sets of bars
w = 0.8 # Width of each bar
x_values2 = [t*element + w*n for element in range(d)]
Let's examine our special code:
[t*element + w*n for element in range(d)]
This is called a list comprehension. It's a special way of generating a list from a formula. For making side-by-side bar graphs, you'll never need to change this line; just paste it into your code and make sure to define n, t, d, and w correctly.
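To see what the formula produces, here is a hedged sketch that wraps the list comprehension in a helper and plots two datasets. The helper name create_x, the placeholder percentages, and the tick placement at the middle of each pair are assumptions of this example, not part of the lesson's data.

```python
import matplotlib.pyplot as plt

def create_x(t, w, n, d):
    """x positions for dataset n (1-based) of t datasets, d groups, bar width w."""
    return [t * element + w * n for element in range(d)]

# Placeholder values, invented for illustration only.
dataset1 = [3, 5, 7, 9, 11, 13, 15]
dataset2 = [4, 6, 8, 10, 12, 14, 16]

x_values1 = create_x(t=2, w=0.8, n=1, d=7)  # about 0.8, 2.8, 4.8, ...
x_values2 = create_x(t=2, w=0.8, n=2, d=7)  # about 1.6, 3.6, 5.6, ...

ax = plt.subplot()
plt.bar(x_values1, dataset1)
plt.bar(x_values2, dataset2)

# Place one tick in the middle of each pair of bars.
middle_x = [(a + b) / 2 for a, b in zip(x_values1, x_values2)]
ax.set_xticks(middle_x)

# plt.show()  # uncomment to display the chart
```

Each group starts t = 2 units after the previous one, and within a group the second bar sits exactly one bar width (0.8) to the right of the first, so the two bars touch and a gap of 0.4 separates neighboring groups.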
If we want to compare two sets of data while preserving knowledge of the total between them, we can also stack the bars instead of putting them side by side. For instance, if someone was plotting the hours they've spent on entertaining themselves with video games and books in the past week, and wanted to also get a feel for total hours spent on entertainment, they could create a stacked bar chart:
We do this by using the keyword bottom. The top set of bars will have bottom set to the heights of the other set of bars. So the first set of bars is plotted normally and the second set of bars has bottom specified:
This starts the book_hours bars at the heights of the video_game_hours bars. So, for example, on Monday the orange bar representing hours spent reading will start at a value of 1 instead of 0, because 1 hour was spent playing video games.
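A minimal sketch of the stacked chart described above could look like this. Only the Monday value of 1 hour of video games comes from the text; every other number is a placeholder invented for the example.

```python
import matplotlib.pyplot as plt

days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
# Placeholder hours; only Monday's 1 hour of video games is taken from the text.
video_game_hours = [1, 2, 2, 1, 2, 3, 4]
book_hours = [2, 3, 4, 2, 1, 2, 3]

x_values = range(len(days))

# The first set of bars is plotted normally, starting at 0.
bottom_bars = plt.bar(x_values, video_game_hours)

# The second set starts where the first set ends.
top_bars = plt.bar(x_values, book_hours, bottom=video_game_hours)

# plt.show()  # uncomment to display the chart
```

The top of each stacked bar then shows the total hours for that day, while the two colors show how the total splits between the activities.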
Exercise: make a stacked bar chart using the keyword bottom. Put the sales1 bars on the bottom and set the sales2 bars to start where the sales1 bars end.
In this lesson, you'll learn how to create and when to use different types of plots.