beanumber closed this issue 10 years ago
For example, Lab 4A contains three for() loops. Are these necessary? Or can they be replaced with do() loops?
Lab 4B and Lab 6 also contain for() loops.
Regarding introducing do loops in Lab 4A:
I think we should stick with either for loops or do loops; too many options will just confuse the student. Since we're doing a mosaic version of the labs, do loops are probably the way to go.
For loops are nice because everything is laid out for you. A do loop packs a lot of code into one statement. First, students tend to struggle with function chaining, and you can't break things up into separate variables in a do loop. Second, the idea that the do loop is smart enough to put each result into the next slot in a data frame threw me off: with a for loop, you have to specify the index with a counter, but where is do's counter?
It might be good to restructure the introduction to do loops like so:
for loop"), introduce the problem of iteration by showing how tedious it would be to type 5000 lines of sampling.
head(sample_means50) so students can see the results for themselves.
A more specific suggestion:
Lab 4A shows what the for loop looks like "unrolled" with this code:
samp <- sample(area, 50)
sample_means50[1] <- mean(samp)
# ...repeat this 5000 times
Because this code isn't in a do loop, it has the luxury of splitting the work into two lines: one to take the sample and one to store its mean. However, that makes it look less like the do loop. I'd recommend changing the "unrolled" code to this:
sample_means50[1] <- mean(sample(area, 50)) # consolidated
# ...repeat this 5000 times
I think that makes the jump to the do loop clearer.
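To make the comparison concrete, here is a minimal sketch that puts the for loop and the do() version side by side. It assumes mosaic is installed, and it uses a simulated `area` vector as a stand-in for the lab's data (the real values are not part of this thread). The names `sample_means_for` and `sample_means_do` are made up for illustration.

```r
library(mosaic)  # provides do()

set.seed(1)
# Assumption: simulated stand-in for the lab's `area` vector, not the real data
area <- rexp(2000, rate = 1 / 1500)

# for() version: preallocate a vector, then fill slot i on each pass
sample_means_for <- rep(NA, 5000)
for (i in 1:5000) {
  sample_means_for[i] <- mean(sample(area, 50))
}

# do() version: the same consolidated expression, repeated 5000 times;
# each repetition becomes one row of the resulting data frame
sample_means_do <- do(5000) * mean(sample(area, 50))

head(sample_means_for)
head(sample_means_do)
```

The expression inside the for loop body and the expression after `do(5000) *` are identical; the only thing that disappears is the explicit counter. The name of the column that do() produces may differ between mosaic versions, so the sketch avoids referring to it by name.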
I feel like the comments above are trying to force do() to be a for loop instead of appreciating do() for what it is. I do agree with using head() to take a look. When I teach with do(), I typically use an outline like this:
> mean(~age, data=HELPrct)
[1] 35.65342
> mean(~age, data=resample(HELPrct))
[1] 35.93377
> mean(~age, data=resample(HELPrct))
[1] 35.9404
> Bootstrap <- do(1000) * mean(~age, data=resample(HELPrct))
> head(Bootstrap,3)
    result
1 35.55188
2 35.96026
3 35.85872
If you want to see many more examples, see Lock5 with R.
Another way of doing this (which has advantages in some settings) is to use do() with a smaller number of iterations first. For example:
> do(3) * lm( age ~ shuffle(sex), data=HELPrct)
  Intercept    sexmale    sigma   r.squared
1  35.19626  0.5985360 7.714603 0.001089595
2  36.13084 -0.6250608 7.714222 0.001188308
3  36.12150 -0.6128248 7.714400 0.001142240
PS. My students are never confused about each row corresponding to an iteration since that's what I tell them do() does.
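One way to let students check the row-per-iteration claim for themselves is to count rows. This is only a sketch, assuming mosaic is installed (HELPrct comes along with it via mosaicData); the name `small` is made up for illustration.

```r
library(mosaic)  # provides do() and resample(); HELPrct comes from mosaicData

set.seed(123)

# Three iterations first, so the whole result fits on screen
small <- do(3) * mean(~age, data = resample(HELPrct))
small
nrow(small)      # one row per iteration: 3

# Same expression, scaled up
Bootstrap <- do(1000) * mean(~age, data = resample(HELPrct))
nrow(Bootstrap)  # 1000
head(Bootstrap, 3)
```

The number passed to do() and the number of rows in the result always match, which answers the "where's the counter?" question without introducing an index variable.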
I agree with Randy -- but one question for us to ponder is how far from the "official" OI labs do we want to allow these to drift? This particular lab may be the most difficult, since the fundamental approach to iteration is different.
I'm not in close communication with the OI folks, and my opinion might change if I were. But I would say make the labs as good as they can be and not fetter them unnecessarily. If there are things in the labs that don't make sense to cover when you use mosaic, let them go. If there are other things that could or should be added, add them in. Students are not going to see both sets of labs, so they won't be distracted by any differences.
Agreed. This lab may be one that is ripe for outright replacement.
In any case, the latest commits have removed all for() loops, so I am closing this issue.
Most of these conversions should be straightforward, but a few of them may be tricky.