I love the intentionally “sketchy” illustrations in the Plan phase that show how to reason about your data and output. This is so critical. To this day, I do a lot of pen-and-paper work for analysis. I wonder (hope?) whether this will reappear across chapters
I love the proposal for good code documentation with the standardized header
The part about asking for help is important but felt misplaced in the ‘Simulate’ section. I was very engaged in the narrative and this knocked me out of the flow. Some ideas:
Move the section to the end of the chapter?
Put information on making a GitHub account in an appendix?
Should ‘Simulate’ also include simulating the output (the plot) to check that the assumptions made in ‘Plan’ are correct? Students may not fully understand how they benefited just by simulating in the Australian elections example
Should ‘Share’ discuss who the audience is for an analysis? The narrative for Australian elections does a good job talking about what is done, but depending on the audience I might order things differently. For example, if I’m a data journalist for The Economist, my readers care more about the outcome than the approach, so I would move the part about how I got and processed the data to the end. (This may be wrong for the academic world, though! I notice that in the Toronto homelessness example you mention this is the order an abstract goes in)
Similar to the part about “asking for help”, the part about citation() feels important but again knocks me out of the reading flow. I wonder whether, when the book is typeset, these asides could be in a box or a different color font or something to acknowledge that they are apart from the core ‘Simulate’ concepts?
I love to see data validation illustrated in the third example
Generally, it’s excellent to see real-world data examples and hear an author’s “thought process” in working with such data. I think this is a very useful chapter.
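To make the suggestion about simulating the output concrete, here is a minimal sketch of what I have in mind. It is not the chapter's code: the division labels, party names, seed, and the count of 151 divisions are my own placeholder assumptions, and I use base R only so the sketch has no dependencies.

```r
# Hypothetical sketch: simulate the data AND the planned plot.
# All names and values here are placeholders, not the chapter's code.
set.seed(853)

number_of_divisions <- 151  # assumed number of electoral divisions

simulated_data <- data.frame(
  division = paste("Division", seq_len(number_of_divisions)),
  party = sample(
    c("Liberal", "Labor", "National", "Green", "Other"),
    size = number_of_divisions,
    replace = TRUE
  )
)

# Count seats per party, mirroring the table sketched in 'Plan'
seats_by_party <- table(simulated_data$party)

# Draw the planned plot from simulated data to check that the sketch
# from 'Plan' can actually be produced from data of this shape
barplot(
  seats_by_party,
  xlab = "Party",
  ylab = "Number of seats"
)
```

If the simulated plot cannot be made to look like the sketch, that is a signal the plan or the assumed data structure needs revisiting before any real data are gathered.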
Technology
What are the pros and cons of doing “Simulate” in Quarto versus an R script? Will the simulation code be part of the final story? If not, there could be some merit in helping students avoid becoming too notebook-dependent and be comfortable in either format
I’m personally against “loading the whole tidyverse”. I think it’s helpful for students to build intuition for which types of functions live where
Again, just personal preference, but I’d tend against ever having an install.packages() call in an R script or Quarto doc since that’s likely never the best time/place to do system set-up
Students might be confused about saving a Gist as an R script when they are working in a Qmd or how much of the body of the Qmd they are supposed to include
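As an illustration of the two package-related preferences above, here is one possible pattern, offered as a sketch rather than a prescription. The package names are examples I chose, and the variable names are hypothetical. The script checks for packages and tells the reader what to install from the console instead of installing anything itself, then attaches only the individual packages it needs.

```r
# Sketch: check for packages rather than installing them inside the script.
# The package list is illustrative only.
needed <- c("dplyr", "ggplot2")

is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_packages <- needed[!is_installed]

if (length(missing_packages) > 0) {
  # Tell the reader what to run once, in the console, outside this script
  message(
    "Install once from the console: install.packages(c(",
    paste0('"', missing_packages, '"', collapse = ", "),
    "))"
  )
} else {
  # Attach individual packages so it stays clear where functions live
  library(dplyr)    # data manipulation verbs such as filter() and mutate()
  library(ggplot2)  # plotting
}
```

Calling functions with an explicit namespace (for example `dplyr::filter()`) is another way to build the same intuition without attaching anything.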
Platform Specific
There’s always some risk to mentioning specific software/platforms that may not exist in the future (e.g. RStudio Cloud free tier). Would you want to hedge your bets by also mentioning options to download RStudio IDE, use GitHub Codespaces, etc.?
I think the examples do a great job showing a good workflow along with good analysis. In the workflow piece, I wonder if a few smaller tips and tricks merit a footnote?
Adding ---- (or ####, e.g. ### Preface ####) at the end of a code comment to create an outline in RStudio and easily navigate the doc
Shortcut for multiline comments (e.g. when commenting out package installs)
The chapter mentions running code with the green arrow. Mention the keyboard shortcut as well?
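A footnote covering these tips could be as short as the sketch below. The section names and the simulated values are placeholders of mine. The shortcuts in the comments are the RStudio defaults as I know them, so they may differ if a reader has customized their keybindings.

```r
#### Workspace set-up ####
# A comment ending in four or more #, - or = becomes a section in
# RStudio's document outline (Ctrl+Shift+O, or Cmd+Shift+O on macOS).

# Select several lines and press Ctrl+Shift+C (Cmd+Shift+C on macOS)
# to comment or uncomment them together, e.g. one-off installs:
# install.packages("dplyr")
# install.packages("ggplot2")

# Simulate ----
# Ctrl+Enter (Cmd+Return on macOS) runs the current line or selection,
# as an alternative to clicking the green arrow.
simulated_votes <- sample(c("Labor", "Liberal"), size = 10, replace = TRUE)

# Explore ====
table(simulated_votes)
```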
Chapter 2
Theory