Haha, so ggplot is 'fancier' :) In the last exercise, interpreting the difference between the minimum and maximum from the scatterplot is a bit difficult, as they are in fact on the same scale: e.g. the higher the maximum and the lower the minimum, the larger the gap is, and the other way around. It would be more insightful to show the change in the gap over the years, with the range (max - min) on the y axis and years on the x axis.
Or, for a better understanding of the differences, some of the other stats such as the mean or median (less sensitive to outliers than the mean) or the standard deviation (showing the distribution of the data, also sensitive to outliers), could be displayed on the y axis. Continents can be shown as a third variable with colours on the same plot.
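A minimal sketch of the range-over-years idea suggested above. The small table is made-up illustrative data (not real Gapminder values), and the column names only assume the `gdpPercap_<year>` pattern; with the real file you would load it with `pd.read_csv(..., index_col='country')` as in the exercise.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up illustrative numbers, NOT real Gapminder data.
data = pd.DataFrame(
    {
        "gdpPercap_1952": [1000.0, 4000.0, 2500.0],
        "gdpPercap_1957": [1200.0, 5000.0, 3000.0],
        "gdpPercap_1962": [1500.0, 7000.0, 3200.0],
    },
    index=pd.Index(["A", "B", "C"], name="country"),
)

# Use the year as the column label so the x axis shows years.
data.columns = [int(col.split("_")[1]) for col in data.columns]

# Range per year: maximum over countries minus minimum over countries.
gdp_range = data.max() - data.min()

# The other summaries mentioned above, one value per year.
summary = pd.DataFrame(
    {"range": gdp_range, "median": data.median(), "std": data.std()}
)

# Years on the x axis, range on the y axis.
ax = summary["range"].plot(marker="o")
ax.set_xlabel("year")
ax.set_ylabel("max - min GDP per capita")
```

In a notebook the figure renders inline; in a plain script you would add `plt.show()` or `plt.savefig(...)`. Plotting `summary["median"]` or `summary["std"]` the same way gives the other views.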
It's nice how useful plots are for answering and raising questions in exploratory data analysis.
@cforgaci Nice eye for visualisation and making plots that are readily understandable. For some really "fancy" plotting libraries, check out https://plotly.com/python/ and https://docs.bokeh.org/en/latest/index.html :D
@aecryan nice!
The Bokeh library allows you to make interactive plots. Matplotlib, on the other hand, typically creates static images.
For exercises like this, I get stuck because I do not know which kind of things I can fill in. I understand it is part of the learning process, but I do not even know if it is a function, number etc. as my Python vocabulary is still very small.
data_europe = pd.read_csv('data/gapminder_gdp_europe.csv', index_col='country')
data_europe.__.plot(label='min')
data_europe.__
plt.legend(loc='best')
plt.xticks(rotation=90)
Does / or \ in paths (i.e., file locations) matter?
Where in the code is the statement that it should read from the dataset and plot "per capita among the countries in Asia for each year in the data set" (as instructed)? What if you want to show it for one year or several years?
data_asia = pd.read_csv('data/gapminder_gdp_asia.csv', index_col='country')
data_asia.max().plot()
print(data_asia.idxmax())
print(data_asia.idxmin())
@ThoTUM86 you are right, it is all part of the learning process :) Here, min() and max() are methods that you can call on a DataFrame or a Series. Calling one of them calculates the minimum (or maximum) GDP per capita over all countries for each year, and chaining .plot() onto the result plots that information in one step. .plot() is likewise a method that can be applied to a data object. The terminology can definitely be confusing. Here's an article with more explanation if you are interested: https://realpython.com/python-matplotlib-guide/
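To make that concrete, here is a small sketch on made-up numbers (not the real Europe file; the country labels and values are hypothetical). It shows what the calls return, and how a single year or a few years could be picked out by column name.

```python
import pandas as pd

# Made-up illustrative numbers, NOT real Gapminder data.
data = pd.DataFrame(
    {"gdpPercap_1952": [1000.0, 4000.0], "gdpPercap_1957": [1200.0, 5000.0]},
    index=pd.Index(["A", "B"], name="country"),
)

# Each call returns a Series with one entry per column (i.e. per year).
yearly_min = data.min()   # smallest value in each column
yearly_max = data.max()   # largest value in each column
poorest = data.idxmin()   # index label (country) holding each column's minimum

# Selecting by column name restricts the data to one or several years.
one_year = data["gdpPercap_1952"]                        # Series indexed by country
two_years = data[["gdpPercap_1952", "gdpPercap_1957"]]   # smaller DataFrame

print(yearly_min)
print(poorest)
```

Because `yearly_min` and `yearly_max` are Series, `.plot()` can be called on either of them directly, which is why the calculation and the plot fit on one line.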
As for forward and back slashes, the short answer is that they can matter. If you are using Windows, file paths on your system are written with backslashes \, while most other operating systems use forward slashes /. If you want your code to work across operating systems, you'll need to deal with this issue. For code that only runs on your machine, you can leave it as is. Here's a good explanation of the issue: https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f
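One way to sidestep the separator question is the standard-library `pathlib` module, sketched below. The file name is the one from the exercise; whether it exists on your machine is an assumption, so the snippet only builds paths and does not open anything.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Build the path from its parts; pathlib picks the separator for the current OS.
csv_path = Path("data") / "gapminder_gdp_europe.csv"
print(csv_path)

# "Pure" paths show how each flavour would be written, whatever OS you are on.
windows_style = PureWindowsPath("data", "gapminder_gdp_europe.csv")
posix_style = PurePosixPath("data", "gapminder_gdp_europe.csv")
print(windows_style)  # data\gapminder_gdp_europe.csv
print(posix_style)    # data/gapminder_gdp_europe.csv
```

`pd.read_csv` accepts a `Path` object in place of a string, so `csv_path` could be passed to it directly.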
In R they also call ggplot2 fancier; anything would be, actually, compared to the basic plots :) In R, though, the syntax of ggplot is totally different: it is a chained command, not stacked like here. @aecryan and @jurra nice to know about the other visualization libraries
Plotting — Python essentials for GIS learners
https://the-magnificents.github.io/04-02-2021-Carpentry-for-HGIS/02_Day_2_Python_GIS/exercise/B3_Exercise.html