swcarpentry / r-novice-gapminder

R for Reproducible Scientific Analysis
http://swcarpentry.github.io/r-novice-gapminder/

Lesson Contribution: R for Reproducible Scientific Analysis #725

Closed FarahZahir closed 2 years ago

FarahZahir commented 3 years ago

As part of my checkout process, I would like to contribute to "R for Reproducible Scientific Analysis". I have suggestions for three episodes: Introduction to R and RStudio, Creating Publication-Quality Graphics with ggplot2, and Producing Reports with knitr.

Episode 1: Introduction to R and RStudio

  1. Under "Work flow within RStudio" -> "Tips for running segments of your code", a few more shortcuts for running and formatting code could be added, such as:

     Ctrl + Alt + R to run the whole script

     Ctrl + Shift + A to reformat the selected code

     Ctrl + Shift + C to comment out the selected lines

  2. The concept of the working directory would be useful for novices, either somewhere in this episode or at the beginning of the next one.

  3. Under the "R packages" heading of this episode, pointing learners to CRAN and GitHub for additional packages would be helpful.
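For the working-directory point above, a minimal sketch of what such an explanation might demonstrate. It assumes a temporary folder stands in for a project directory, and the file name `example.csv` is hypothetical:

```r
# Show the current working directory: relative paths are resolved against it.
original_wd <- getwd()
print(original_wd)

# Use a temporary folder as a stand-in for a project directory (assumption).
project_dir <- tempdir()
setwd(project_dir)

# A relative path now points inside project_dir.
write.csv(data.frame(x = 1:3), "example.csv", row.names = FALSE)
file_found <- file.exists(file.path(project_dir, "example.csv"))
print(file_found)

# Restore the original working directory so later code is unaffected.
setwd(original_wd)
```

Restoring the original directory at the end is a deliberate choice here, so that the sketch does not change where later code reads and writes files.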

Episode 8: Creating Publication-Quality Graphics with ggplot2

In this episode, under "Modifying text" section, the title of the ggplot can be centred by using this command before running the graph:

theme_update(plot.title = element_text(hjust = 0.5))

so the updated code will be like this:

theme_update(plot.title = element_text(hjust = 0.5)) # to centrally align the title
ggplot(data = americas, mapping = aes(x = year, y = lifeExp, color = continent)) +
  geom_line() +
  facet_wrap(~ country) +
  labs(
    x = "Year",              # x axis title
    y = "Life expectancy",   # y axis title
    title = "Figure 1",      # main title of figure
    color = "Continent"      # title of legend
  ) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Episode 15: Producing Reports With knitr

In this episode, where "Basic components of R Markdown" is explained, only instructions for creating an HTML document are given. This could be improved by adding instructions for knitting to a Word document. Additionally, the current date can be inserted automatically at knit time, instead of being typed by hand every time, in the following way:


---
title: "Updated Initial R Markdown document"
author: "Farah Zahir"
date: '`r format(Sys.time(), "%d %B, %Y")`'
output:
  word_document:
    toc: yes
  html_document:
    toc: yes
---

Thanks and Regards

jcoliver commented 3 years ago

Thanks for these suggestions, @FarahZahir . We appreciate you taking the time to provide potential improvements to the lessons.

  1. Adding shortcuts in a tip would be a way to include these in a lesson. I am wary of asking novice learners to remember a lot of keyboard shortcuts, so making the material optional is the way to go.
  2. Indeed, a more explicit discussion of the working directory is warranted.
  3. A tip acknowledging packages available on GitHub would be useful.
  4. Horizontal justification of titles is useful information for a tip or an exercise solution.
  5. Adding a dynamic date to the YAML header makes sense, although it might be a good opportunity to encourage the ISO date format, YYYY-MM-DD. :wink:
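
To illustrate the date-format point, here is a small sketch comparing the day-month-year pattern from the suggested header with an ISO-style pattern. The fixed date is an arbitrary example chosen so the output is reproducible; a report would typically use `Sys.Date()` or `Sys.time()` instead:

```r
# A fixed date keeps the example reproducible (arbitrary, for illustration only).
example_date <- as.Date("2021-03-05")

# Day-month-year style, as in the suggested header.
# The month name depends on the locale R is running in.
long_date <- format(example_date, "%d %B, %Y")

# ISO-style year-month-day: unambiguous, and sorts chronologically as plain text.
iso_date <- format(example_date, "%Y-%m-%d")

print(long_date)
print(iso_date)
```

In a YAML header, the ISO variant would presumably be written as `` date: '`r format(Sys.Date(), "%Y-%m-%d")`' ``.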

If you're keen to take on any of these suggestions, please submit them as separate pull requests, so they can be evaluated individually. Thanks again. This only works because of community contributions like this.

crschul commented 2 years ago

Using this issue to make my own suggestion for the checkout process:

  1. In Challenge 5 of Episode 8 (Creating Publication-Quality Graphics with ggplot2), students use theme() to edit axis information. This is often an important step when making figures for publication. Along the same lines, I suggest we also ask students to turn the background from grey to white.

ggplot(data = gapminder, mapping = aes(x = continent, y = lifeExp, fill = continent)) +
  geom_boxplot() +
  facet_wrap(~ year) +
  ylab("Life Expectancy") +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        panel.background = element_rect(fill = "White", colour = "White"),
        panel.grid.major = element_line(colour = "lightgrey"),
        panel.grid.minor = element_line(colour = "lightgrey"))

Manipulating the plot background and grid lines is useful and does not overload students with customization options.

  2. I mentioned it in another thread, but I think https://www.rstudio.com/resources/cheatsheets/ should have its own tip box at the end of Episode 8 or the start of Episode 12. When I started to code, I had several of them printed out as quick reference guides.

jcoliver commented 2 years ago

Thank you for this contribution, @crschul. This is probably a bit much to add to Challenge 5, which already asks learners to do two new things. It might be useful, though, to introduce the built-in themes, such as theme_bw() or theme_minimal(), which create plots with a white background. This would probably need to be a separate challenge.
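
As a rough sketch of what such a separate challenge might look like, the following applies the two built-in themes to a boxplot. The `toy` data frame is a made-up stand-in for the gapminder data, and its values are illustrative only:

```r
library(ggplot2)

# Tiny made-up data set standing in for gapminder (values are illustrative only).
toy <- data.frame(
  continent = rep(c("Africa", "Americas", "Asia"), each = 4),
  lifeExp   = c(48, 52, 55, 58, 65, 68, 71, 74, 60, 64, 69, 72)
)

base_plot <- ggplot(data = toy,
                    mapping = aes(x = continent, y = lifeExp, fill = continent)) +
  geom_boxplot() +
  ylab("Life Expectancy")

# theme_bw(): white panel background with grey grid lines and a panel border.
plot_bw <- base_plot + theme_bw()

# theme_minimal(): a minimal theme with no background annotations.
plot_minimal <- base_plot + theme_minimal()
```

Printing `plot_bw` or `plot_minimal` draws the figure. Because a complete theme is a single function call, it adds less for learners to remember than setting several `theme()` arguments by hand.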

The link to the cheat sheet resources would be great to include in the ggplot episode.

tiagojp commented 2 years ago

As part of my checkout process, I would like to contribute to "R for Reproducible Scientific Analysis". I have suggestions for Creating Publication-Quality Graphics with ggplot2, specifically changing the font face and size of the axis titles to distinguish them better from the axis text.

In addition, I suggest plotting country abbreviations (i.e., iso3c codes) instead of country names, so that the facet labels all have the same length and the plot does not need to be so wide. I also think it is a great way to show that labels can be changed during the plotting process.

Plot using country abbreviation (i.e., iso3c format)

ggplot(data = americas, mapping = aes(x = year, y = lifeExp, color = continent)) +
  geom_line() +
  facet_wrap(~ factor(country,
                      levels = c("Argentina", "Bolivia", "Brazil", "Canada", "Chile",
                                 "Colombia", "Costa Rica", "Cuba", "Dominican Republic",
                                 "Ecuador", "El Salvador", "Guatemala", "Haiti", "Honduras",
                                 "Jamaica", "Mexico", "Nicaragua", "Panama", "Paraguay",
                                 "Peru", "Puerto Rico", "Trinidad and Tobago",
                                 "United States", "Uruguay", "Venezuela"),
                      labels = c("ARG", "BOL", "BRA", "CAN", "CHL",
                                 "COL", "CRI", "CUB", "DOM",
                                 "ECU", "SLV", "GTM", "HTI", "HND",
                                 "JAM", "MEX", "NIC", "PAN", "PRY",
                                 "PER", "PRI", "TTO",
                                 "USA", "URY", "VEN"))) +
  theme(axis.title.x = element_text(size = 12, face = "bold")) + # change font face and size of x-axis title
  theme(axis.title.y = element_text(size = 12, face = "bold")) + # change font face and size of y-axis title
  theme(strip.text.x = element_text(size = 12, face = "bold")) + # change font face and size of facet titles
  theme(legend.title = element_text(size = 12, face = "bold")) + # change font face and size of legend title
  labs(
    x = "Year",              # x axis title
    y = "Life expectancy",   # y axis title
    title = "Figure 1",      # main title of figure
    color = "Continent"      # title of legend
  ) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
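
Because parallel `levels` and `labels` vectors have to stay in exactly the same order, one alternative is a named lookup vector passed to ggplot2's `as_labeller()`, which keeps each country name next to its code. This is only a sketch: the `toy` data frame and its values are made up, and only three countries are included:

```r
library(ggplot2)

# Named lookup: each country name sits next to its ISO 3166-1 alpha-3 code.
iso_codes <- c(
  "Canada"        = "CAN",
  "United States" = "USA",
  "Uruguay"       = "URY"
)

# Made-up values for illustration only; not taken from the gapminder data.
toy <- data.frame(
  country = rep(names(iso_codes), each = 3),
  year    = rep(c(1997, 2002, 2007), times = 3),
  lifeExp = c(78, 79, 80, 76, 77, 78, 74, 75, 76)
)

# as_labeller() turns the named vector into a labeller for the facet strips,
# so the underlying country column is left unchanged.
plot_iso <- ggplot(data = toy, mapping = aes(x = year, y = lifeExp)) +
  geom_line() +
  facet_wrap(~ country, labeller = as_labeller(iso_codes))
```

With this approach the pairing of name and code is visible on a single line, so reordering the vector cannot silently attach a code to the wrong country.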

jcoliver commented 2 years ago

Thank you, @tiagojp, for this contribution. While the addition of the ISO country code labels is a useful means of accommodating long country names, it would be a bit much to add to the lesson at this point. We appreciate you taking the time to provide this example.