Open biodfrl89 opened 4 years ago
Just as additional information, this is related to #115.
You can find the changes in output from 3.6 to 4.0 here: https://github.com/zkamvar/postmaul/blob/master/analysis.md#lesson-r-novice-gapminder-es
Here are the changes in the specific episodes:
https://github.com/zkamvar/postmaul/blob/master/data/diffs/r-novice-gapminder-es--05-data-structures-part2.diff
https://github.com/zkamvar/postmaul/blob/master/data/diffs/r-novice-gapminder-es--13-dplyr.diff
I hope this helps.
I have noticed in Lesson 5 "Explorando data frames" that, when the gapminder dataset is loaded:
gapminder <- read.csv("data/gapminder-FiveYearData.csv")
the parameter stringsAsFactors is never used. That is fine for the main lesson, because when str(gapminder) is called the output shows that country and continent are character vectors. But the Solution to Challenge 4, where the student must analyze the output of str(gapminder), states that country and continent are factors. They are not; they are character vectors.
To solve this, either read.csv( ) must be called with stringsAsFactors = TRUE, or the Solution to Challenge 4 must be changed to say that country and continent are character vectors.
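To illustrate the two options, here is a minimal sketch. It assumes a tiny inline stand-in for data/gapminder-FiveYearData.csv (the rows and the names csv_text, as_factors and as_chars are made up for the example), so it runs without the lesson data:

```r
# Hypothetical two-row stand-in for data/gapminder-FiveYearData.csv
csv_text <- "country,continent,year,pop
Afghanistan,Asia,1952,8425333
Albania,Europe,1952,1282697"

# Option 1: ask for factors explicitly, matching the current Solution text
as_factors <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# Option 2: keep strings as characters, which would require updating the Solution
as_chars <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Compare the column types reported for country and continent
str(as_factors)
str(as_chars)
```

Setting the argument explicitly in the lesson would make the output independent of whatever default the reader's R version uses.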
Something similar happens in Lesson 13 "Manipulación de data frames con dplyr". The gapminder dataset, used in previous lessons, is processed using:
gdp_bycontinents <- gapminder %>% group_by(continent) %>% summarize(mean_gdpPercap=mean(gdpPercap))
However, when the gdp_bycontinents variable is printed, the lesson output says that continent is \<fctr>, but it should say \<chr>. Again, it is not clear whether the original dataset was loaded via read.csv( ) with stringsAsFactors = TRUE or not.
Curiously, in Lesson 14 "Manipulación de data frames usando tidyr", a wide version of the gapminder dataset is loaded, and only in this lesson is the parameter stringsAsFactors = FALSE used and its purpose explained.
gap_wide <- read.csv("data/gapminder_wide.csv", stringsAsFactors = FALSE)
In general, these kinds of situations suggest to me that the read.csv( ) function may have been updated and the default value of the stringsAsFactors parameter changed from TRUE to FALSE. That would make it unnecessary to specify it in Lesson 14, but it would also alter some outputs in Lessons 5 and 13.
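For anyone who wants to check this locally, here is a small sketch that reads a made-up inline CSV without setting stringsAsFactors and compares the resulting column class with the running R version. As far as I know, the default changed in R 4.0.0, which would match the 3.6 to 4.0 output differences linked above. The names csv_text, default_read and expected are only for illustration, and the check assumes the stringsAsFactors option has not been overridden in the session:

```r
# Hypothetical inline data; no lesson files are needed
csv_text <- "country,continent
Afghanistan,Asia
Albania,Europe"

# Rely on the default value of stringsAsFactors
default_read <- read.csv(text = csv_text)

# Class we would expect from the default, given the running R version
expected <- if (getRversion() >= "4.0.0") "character" else "factor"

print(getRversion())
print(class(default_read$continent))
print(expected)
```

If the last two printed values agree, the mismatch in Lessons 5 and 13 comes from outputs rendered with an older R version rather than from the lesson code itself.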