Rephrase these sentences (delete text in brackets): "We already know that when we use a function, we need to know what the arguments are. We can learn about the arguments from the documentation, which we can access by typing, [for example, if we want to use] for the mean() function as an example, [we look at the documentation by typing]"
Add the phrase in bold: "Inside the curly braces we've found how many numbers are in the x vector, we've added up all the values in that vector, and we've found the average. In a function, the last line will always be returned; in this case, the function will print (output) the average_of_values."
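For reference, the function this passage describes presumably looks something like the sketch below. Only average_of_values comes from the training text; the function name my_mean and the other variable names are placeholders I made up.

```r
# Hypothetical reconstruction; only average_of_values is from the training text
my_mean <- function(x) {
  number_of_values <- length(x)   # how many numbers are in the x vector
  sum_of_values <- sum(x)         # add up all the values in that vector
  average_of_values <- sum_of_values / number_of_values
  average_of_values               # the value of the last line is returned
}

result <- my_mean(c(2, 4, 6))     # assigned, so nothing is printed here
result                            # typing the name prints the value
```

Strictly speaking, the function returns the value of its last line; it only prints automatically when the call is typed at the console without being assigned to anything. It may be worth making that distinction in the text.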
Is this statement true: "The ifelse() function is a simplified version of if() and can be used when you are returning simple values based on a conditional test."? I've run some fairly complicated code within the yes and no options, so I don't think it is limited to returning simple values. I'd be interested to hear for my own purposes whether you think if() is better set up for complicated situations. ifelse() has always seemed to work well for me, but I'm wondering if I should be using if() more.
At the end of the Conditionals section, add some text to the effect of: "if() and ifelse() can be particularly powerful when applied to vectors. In this case, they'll return a new vector showing the results for each row of the data frame." And then give an example of applying this to a vector in a data frame to make a new column. This is how I most frequently apply these functions, and it's very powerful.
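An example along these lines would do. The data, the column names, and the threshold of 35 are all made up for illustration. One caveat for the wording: it is ifelse() that works element by element on a vector; if() expects a single TRUE or FALSE, so the new-column pattern really belongs to ifelse().

```r
# Made-up data; column names and the threshold of 35 are illustrative only
air <- data.frame(
  site = c("A", "B", "C", "D"),
  concentration = c(12, 48, 35, 60)
)

# ifelse() tests every row and returns a vector of the same length,
# which can be stored directly as a new column
air$category <- ifelse(air$concentration > 35, "high", "low")

air
```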
You could also show how you can apply mathematical functions to the conditional responses. (For example, I've done this based on a units column: converted ppm to ppb, for example, based on the units indicator.)
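A sketch of the units conversion I have in mind, assuming a data frame with a value column and a units column (both names and all of the numbers are invented here):

```r
# Made-up measurements reported in a mix of ppm and ppb
obs <- data.frame(
  value = c(0.5, 120, 2),
  units = c("ppm", "ppb", "ppm")
)

# 1 ppm = 1000 ppb, so multiply only the rows reported in ppm
# and leave the ppb rows unchanged
obs$value_ppb <- ifelse(obs$units == "ppm", obs$value * 1000, obs$value)

obs
```

The yes and no arguments can be any expression that produces a vector of the right length, which is what makes this pattern useful beyond returning fixed labels.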
Something is missing from the my_averages example under for loops: you're only saving the most recent average, so you lose the first two. Shouldn't you have steps like: temp_average <- mean(i), and then: my_averages <- rbind(my_averages, temp_average)? This would add the output of each iteration on to the overall vector, so you'd get all three outputs.
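Roughly what I mean, as a sketch. The input list my_values is a stand-in I made up for whatever the training example loops over, and I've used c() rather than rbind() here because c() keeps my_averages as a plain vector, whereas rbind() would build a one-column matrix. Either approach keeps all three results.

```r
# Hypothetical inputs standing in for the training's three sets of values
my_values <- list(c(1, 2, 3), c(10, 20, 30), c(5, 5, 5))

my_averages <- c()                 # start with an empty vector
for (i in my_values) {
  temp_average <- mean(i)          # average for this iteration only
  # append to the running vector instead of overwriting it
  my_averages <- c(my_averages, temp_average)
}

my_averages
```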
I've found "for" loops to be incredibly helpful in producing a set of related plots, in which i iterates through different site names or pollutants, or something like that. You should present this somewhere in the training, maybe under plotting. Or maybe just mention it here and then return to it later. It's a really powerful tool.
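A minimal sketch of the plotting pattern, using base graphics and invented data (the site names, column names, and file names are all placeholders). Each pass through the loop subsets the data to one site and writes one PDF to a temporary folder.

```r
# Made-up data; site names and column names are illustrative only
monitoring <- data.frame(
  site = rep(c("North", "South"), each = 3),
  day = rep(1:3, times = 2),
  ozone = c(30, 42, 38, 25, 31, 29)
)

out_dir <- tempdir()               # temporary folder for the example output
plot_files <- c()
for (i in unique(monitoring$site)) {
  site_data <- monitoring[monitoring$site == i, ]   # rows for this site only
  plot_file <- file.path(out_dir, paste0("ozone_", i, ".pdf"))
  pdf(plot_file)                   # open a PDF file for this site
  plot(site_data$day, site_data$ozone, type = "b",
       main = i, xlab = "Day", ylab = "Ozone")
  dev.off()                        # close the file before the next iteration
  plot_files <- c(plot_files, plot_file)
}
```

The same loop works for pollutants instead of sites by changing the column that i iterates over.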
In the example under apply(), explain what na.rm is.
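Something like the following would cover it. The values are made up. The point to get across is that na.rm = TRUE tells functions such as mean() to drop missing values (NA) before calculating, and that apply() passes any extra arguments along to the function it is calling.

```r
readings <- c(4, 8, NA, 6)

mean_with_na <- mean(readings)                    # NA: one missing value spoils the result
mean_without_na <- mean(readings, na.rm = TRUE)   # NA removed first, so (4 + 8 + 6) / 3

# Arguments listed after the function name are handed on by apply()
m <- matrix(c(1, NA, 3, 4, 5, 6), nrow = 2)
row_means <- apply(m, 1, mean, na.rm = TRUE)      # one mean per row, ignoring NA
```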
Could you expand the apply() discussion? This is something I've never really understood, and it comes up a lot in answers to questions on Stack Overflow. It seems worth going into in greater detail and showing some examples with data frames as inputs, more in line with the way we're likely to use it. Using examples with data frames as input, where you're adding another vector to the data frame as the output, for example. My recollection is that the topic didn't come up later in the trainings. It would be worth spending more time on.
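The kind of example I'd find helpful is sketched below, with an invented data frame of monthly values by site. The second argument of apply() picks the direction: 1 runs the function across each row, 2 runs it down each column. I've selected only the numeric columns first, because apply() converts a data frame to a matrix and a text column would turn everything into text.

```r
# Made-up data; site and month column names are illustrative only
pollutants <- data.frame(
  site = c("A", "B", "C"),
  jan = c(10, 20, NA),
  feb = c(14, 22, 30),
  mar = c(12, NA, 34)
)

month_cols <- c("jan", "feb", "mar")

# MARGIN = 1: one mean per row, added to the data frame as a new column
pollutants$row_mean <- apply(pollutants[, month_cols], 1, mean, na.rm = TRUE)

# MARGIN = 2: one maximum per column, returned as a named vector
col_max <- apply(pollutants[, month_cols], 2, max, na.rm = TRUE)

pollutants
col_max
```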