Closed by rkmeade 3 years ago
Hi @rkmeade, thank you for raising this issue. The code from the lesson runs as expected for me; see the reproducible example below. However, I can replicate your issue by using read.csv() (base R) instead of read_csv() (tidyverse, used in the lesson). This is an easy mistake to make and has come up a few times (e.g., #710). We could probably do a better job of preventing it.
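For anyone landing here later, a minimal sketch of the difference, using base R only and a tiny made-up CSV (csv_text and its values are assumptions for illustration, not the surveys file). By default read.csv() keeps empty fields in a character column as "", whereas read_csv() treats them as NA. Passing na.strings = c("", "NA") to read.csv() is one way to get comparable behaviour.

```r
# Tiny made-up data set standing in for the surveys file (illustration only)
csv_text <- "record_id,sex,weight
1,M,40
2,,
3,F,48
4,,35"

# Base R default: empty fields in a character column are kept as ""
default_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)
levels(factor(default_read$sex))
#> [1] ""  "F" "M"

# Telling read.csv() to treat empty fields as missing values
na_read <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                    na.strings = c("", "NA"))
levels(factor(na_read$sex))
#> [1] "F" "M"
sum(is.na(na_read$sex))
#> [1] 2
```

With the empty strings read as NA, the factor has only the "F" and "M" levels, which is what the lesson's later steps assume.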
read_csv()
## Loading the survey data
# modified slightly, for reprex to work
library(tidyverse)
surveys <- read_csv("https://ndownloader.figshare.com/files/2292169")
#>
#> -- Column specification --------------------------------------------------------
#> cols(
#> record_id = col_double(),
#> month = col_double(),
#> day = col_double(),
#> year = col_double(),
#> plot_id = col_double(),
#> species_id = col_character(),
#> sex = col_character(),
#> hindfoot_length = col_double(),
#> weight = col_double(),
#> genus = col_character(),
#> species = col_character(),
#> taxa = col_character(),
#> plot_type = col_character()
#> )
# ...
## Factors
surveys$sex <- factor(surveys$sex)
# ...
### Renaming factors
plot(surveys$sex)
sex <- surveys$sex
levels(sex)
#> [1] "F" "M"
sex <- addNA(sex)
levels(sex)
#> [1] "F" "M" NA
head(sex)
#> [1] M M <NA> <NA> <NA> <NA>
#> Levels: F M <NA>
levels(sex)[3] <- "undetermined"
levels(sex)
#> [1] "F" "M" "undetermined"
head(sex)
#> [1] M M undetermined undetermined undetermined
#> [6] undetermined
#> Levels: F M undetermined
plot(sex)
read.csv()
## Loading the survey data
# modified slightly, for reprex to work
library(tidyverse)
surveys <- read.csv("https://ndownloader.figshare.com/files/2292169")
# ...
## Factors
surveys$sex <- factor(surveys$sex)
# ...
### Renaming factors
plot(surveys$sex)
sex <- surveys$sex
levels(sex)
#> [1] "" "F" "M"
sex <- addNA(sex)
levels(sex)
#> [1] "" "F" "M" NA
head(sex)
#> [1] M M
#> Levels: F M <NA>
levels(sex)[3] <- "undetermined"
levels(sex)
#> [1] "" "F" "undetermined"
head(sex)
#> [1] undetermined undetermined
#> [6]
#> Levels: F undetermined
plot(sex)
Created on 2021-07-05 by the reprex package (v2.0.0)
Hi Maintainers!
A quick comment on the "Renaming factors" section, which does not work on my console the same way the episode says that it should.
Beginning with the first command, plot(surveys$sex), my console plots the ~1700 missing values as their own column (which is not reflected on the plot in the episode). I believe this is because instead of NA, it recognizes a third category of values, designated "".
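A small sketch of that behaviour with a made-up vector (the values are assumptions for illustration, not the surveys data): when missing entries arrive as empty strings, the factor treats "" as an ordinary level, so it gets its own count and, presumably, its own bar in plot().

```r
# Made-up values; "" stands in for entries that should have been missing
sex <- factor(c("M", "M", "", "", "F"))

# No value is recognised as NA
anyNA(sex)
#> [1] FALSE

# The empty string is counted as its own category
counts <- table(sex)
names(counts)
#> [1] ""  "F" "M"
unname(counts[1])
#> [1] 2
```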
In the next set of commands: sex <- surveys$sex; levels(sex)
The lesson says the output should be: [1] "F" "M"
This is what I get: [1] "" "F" "M"
In the next code block, a new category for missing values is added: sex <- addNA(sex); levels(sex)
The lesson says the output should be: [1] "F" "M" NA
My output now has two equivalents of missing values: [1] "" "F" "M" NA
I believe all downstream errors can be remediated by running this before the initial plot command: levels(sex)[1] <- NA
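Here is a sketch of that fix on a small made-up factor (the values are assumptions for illustration). It is applied after sex has been created, and it selects the "" level by name rather than by position, which is a slightly more defensive variant of levels(sex)[1] <- NA.

```r
# Made-up values; "" stands in for the missing entries
sex <- factor(c("M", "", "F", "", "M"))
levels(sex)
#> [1] ""  "F" "M"

# Turn the "" level into NA (same effect as levels(sex)[1] <- NA here)
levels(sex)[levels(sex) == ""] <- NA
levels(sex)
#> [1] "F" "M"
sum(is.na(sex))
#> [1] 2

# The lesson's renaming steps now behave as described in the episode
sex <- addNA(sex)
levels(sex)[3] <- "undetermined"
levels(sex)
#> [1] "F"            "M"            "undetermined"
```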
I hope this is helpful!
-- Rachel