swcarpentry / r-novice-inflammation

Programming with R
http://swcarpentry.github.io/r-novice-inflammation/

Lesson Contribution: Data Types and Structures and Understanding Factors #508

Open Talishask opened 3 years ago

Talishask commented 3 years ago

I'm a member of The Carpentries staff and I'm submitting this issue on behalf of another member of the community. In most cases, I won't be able to follow up or provide more details other than what I'm providing below.


As part of my checkout process, I would like to contribute to Programming with R. Below are suggestions about the narrative, wording, and examples in two lessons: Data Types and Structures and Understanding Factors.

Lesson: Data Types and Structures

  1. Under “Understanding Basic Data Types and Data Structures in R”:

R has many data structures. These include
• atomic vector
• list
• matrix
• data frame
• factors

Suggested changes:

R has five commonly used data structures:
• Atomic vector
• List
• Matrix
• Data frame
• Factor

Vectors

A vector is the most common and basic data structure in R and is pretty much the workhorse of R. Technically, vectors can be one of two types:

• atomic vectors
• lists

Although the term “vector” most commonly refers to the atomic types not to lists.

Suggested changes:

The vector is the most commonly used data structure in R because it is intuitive to access and operate on. There are two types of vector:

• Atomic vector
• List

In most scenarios, people refer to the atomic vector, not the list, when discussing vectors.
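The distinction between the two vector types could be illustrated with a short sketch like the one below. The variable names (`temps`, `record`) are hypothetical and not taken from the lesson.

```r
# An atomic vector holds elements of a single type.
temps <- c(36.6, 37.2, 38.1)

# A list can hold elements of different types, including other vectors.
record <- list(id = 1L, name = "patient_a", temps = temps)

is.atomic(temps)    # TRUE
is.list(record)     # TRUE
is.vector(record)   # TRUE: a list also counts as a vector
typeof(temps)       # "double"
typeof(record)      # "list"
```

This may help learners see why "vector" usually means the atomic kind, even though `is.vector()` accepts both.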

Other Special Values

Suggested changes: Add an example of negative infinity.

R

-1/0

Output

[1] -Inf
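If the lesson wants to go one step further, a sketch along these lines could show how the special values behave when tested. The variable names are illustrative only.

```r
pos_inf <- 1 / 0    # Inf
neg_inf <- -1 / 0   # -Inf
not_num <- 0 / 0    # NaN

is.finite(c(pos_inf, neg_inf, 1))   # FALSE FALSE TRUE
is.infinite(neg_inf)                # TRUE
is.nan(not_num)                     # TRUE
neg_inf < 0                         # TRUE: -Inf is below every finite number
```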

Lesson: Understanding Factors

Sometimes the order of the factor levels does not matter; other times you might want to specify the order because it is meaningful (e.g., “low”, “medium”, “high”) or because it is required by a particular type of analysis. Additionally, specifying the order of the levels allows us to compare levels:

R

food <- factor(c("low", "high", "medium", "high", "low", "medium", "high"))
levels(food)

Output

[1] "high"   "low"    "medium"

Suggested changes:

When a factor is built from character values such as “low”, “medium”, and “high”, R sorts its levels alphabetically by default.

R

food <- factor(c("low", "high", "medium", "high", "low", "medium", "high"))
levels(food)

Output

[1] "high"   "low"    "medium"

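If we want the levels to appear in a specific order, we can list them in the levels argument. Adding ordered = TRUE also marks the factor as ordered, so its levels can be compared.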

R

food <- factor(c("low", "high", "medium", "high", "low", "medium", "high"))
food <- factor(food, levels = c("low", "medium", "high"), ordered = TRUE)
levels(food)

Output

[1] "low"    "medium" "high"
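To show what specifying the order buys us, the lesson could follow with a comparison between levels. This is a minimal sketch that rebuilds the same `food` factor in one call.

```r
food <- factor(
  c("low", "high", "medium", "high", "low", "medium", "high"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)

is.ordered(food)     # TRUE
food[1] < food[2]    # TRUE: "low" ranks below "high"
max(food)            # the highest level present, here "high"
```

With an unordered factor, the same `<` comparison returns NA with a warning, which is why the ordered = TRUE argument matters here.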

Converting Factors

This section is a bit confusing. In the example provided here, f <- factor(c(3.4, 1.2, 5)), f is stored as an integer-coded factor, so the following command, as.numeric(f), does not add much. In fact, it confuses the audience and makes indexing harder.

Suggested changes:

R

f <- factor(c(3.4, 1.2, 5))
typeof(f)

Output

[1] "integer"

Remember, R lists the levels in alphabetical order if the factor contains characters, and in ascending order if the factor contains numbers.

If we want to know the second value in f, we can index the factor with [ ].

R

f[2]

Output

[1] 1.2
Levels: 1.2 3.4 5

We see 1.2 in the output. Notice that R also prints the levels in ascending order. If we want to keep the original order, we can convert f to a character vector. All numbers will then be treated as characters and will not be sorted by R.

R

f <- as.character(f)
f

Output

[1] "3.4" "1.2" "5"
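For context on why as.numeric(f) can confuse learners, a sketch like the following contrasts the integer codes with the original values. The name `values` is hypothetical; the functions are base R.

```r
f <- factor(c(3.4, 1.2, 5))

# as.numeric() on a factor returns the integer codes, i.e. each
# element's position within the sorted levels, not the original values.
as.numeric(f)    # 2 1 3
levels(f)        # "1.2" "3.4" "5"

# Going through character first recovers the original numbers in order.
values <- as.numeric(as.character(f))
values           # 3.4 1.2 5.0
```

Either ordering of the explanation could work in the lesson; the point is only that the codes and the values differ.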

HaoZeke commented 3 years ago

Hi @Talishask, these changes look great to me. Congratulations on completing the instructor training! Would you be willing to open a PR with these changes?

Talishask commented 3 years ago

Hi @HaoZeke - I've reached out to the contributor and let them know. Thanks