Open vincenzocoia opened 4 years ago
Definitely important to go over the notion of running code interactively vs. from source. Too hard to explain this at the same time as trying to explain why here::here() sometimes gives different things between the two.
source(). EDIT: I should avoid this topic, because it is discussed later in 545B in the "Automation" topic.
I think there might be too much emphasis on having functions print() things as opposed to just outputting a character vector.
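A minimal sketch of the distinction, assuming a hypothetical describe_vector() helper that is not part of the course materials. The function returns a character vector and leaves the decision to print with the caller:

```r
# Hypothetical helper (not from the course materials): builds a one-line
# summary and returns it as a character vector instead of printing it.
describe_vector <- function(x) {
  paste0("n = ", length(x), ", mean = ", round(mean(x), 2))
}

# The caller decides what to do with the result.
summary_text <- describe_vector(c(1, 2, 3, 4))
print(summary_text)   # print it ...
nchar(summary_text)   # ... or keep computing with it
```

Because nothing is printed inside the function, the result can be stored, tested, or pasted into a report without capturing console output.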
Instead of allowing them to work with any dataset, choose 3 that will definitely work nicely with the questions posed in the project.
If students are invested in a dataset of their own, encourage them to use it, but also have them describe it so that the teaching team can understand it. Perhaps even get them to put it in a data folder with a README describing each column -- or at least, the most important ones.
Questions ended up being too specific -- would be more effective to generalise. This milestone was also too demanding for students.
People were copying their knitted md files into the output folder. Indicate not to do this (it's not reproducible anyway).
Feedback for Assignment 1-B in the private instructor repo: https://github.com/UBC-STAT/stat-545-instructor/issues/24
verbose option should be stricter. Shouldn't be square(): "calculating square..." "...done calculating square!". Should be a multi-calculation function.
install_github(): install_github(build_vignettes = TRUE, ref = "0.1.0")
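On the verbose point above, one possible shape for a multi-calculation function is sketched below. The name summarise_steps() and the three steps are assumptions made for illustration, not the assignment's intended solution. It uses message() rather than print() so the progress notes can be suppressed:

```r
# Hypothetical multi-step function (an illustration only): `verbose`
# reports each calculation as it runs, instead of wrapping a single step.
summarise_steps <- function(x, verbose = FALSE) {
  if (verbose) message("Computing mean...")
  m <- mean(x)
  if (verbose) message("Computing standard deviation...")
  s <- sd(x)
  if (verbose) message("Standardising...")
  z <- (x - m) / s
  list(mean = m, sd = s, z = z)
}

res <- summarise_steps(c(2, 4, 6), verbose = TRUE)
```

With verbose = FALSE the function returns the same list without emitting any messages.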
At least for 545B, where topics span 2 days, it may be useful to cut out some instruction on more details, and end with a 10-20 minute presentation on expanding the topic. This would allow me to at least plant the seed for topics that I wish we could cover in this course, like Rmd presentations, bookdown, GitHub pages, etc.
Using /usr/share/dict/words as a dependency for words.txt is causing issues for Windows users. Looks like this has consistently been a problem in the past years. I advised the current students to adapt this issue into code, and download words.txt from here:
words.txt:
	Rscript -e "code to download words.txt here"
It probably doesn't even need to be R code -- I'd accept using wget or curl for the students savvy with the command-line too, although it is good exposure to RCurl.
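A rough sketch of what the download code might look like, assuming base R's download.file() rather than RCurl. The helper name fetch_words() is hypothetical, and the url argument is a placeholder because the actual link is not given here:

```r
# Hypothetical helper: download words.txt only if it is not already present.
# `url` is a placeholder argument; the real download link is not shown here.
fetch_words <- function(url, destfile = "words.txt") {
  if (file.exists(destfile)) {
    return(invisible(destfile))
  }
  # Binary mode keeps the file byte-for-byte identical across platforms.
  utils::download.file(url, destfile = destfile, mode = "wb", quiet = TRUE)
  invisible(destfile)
}
```

Since this avoids /usr/share/dict/words entirely, the same rule should behave the same way on Windows, macOS, and Linux.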
UPDATE: I just checked this year's Makefile, and it looks like there was a commented-out line for Windows users. Maybe we could write extra comments in the Makefile to let the students know to use that line instead if they're on a Windows machine.
This assignment is not Windows-friendly. Should teach remake instead of make, and just allude to make at the end.
strings assignment: there was an article that many people drew on for removing stopwords. It's OK to draw from it, but many people didn't cite it. I should use this as an example of how and when to cite code.
List Project Gutenberg as a source of freely available books: https://dev.gutenberg.org/
Extensions:
Assignment feedback: there should be a document outlining procedures:
Improving the stat545 content:
Others:
Team leading:
Clarification on Assignment 4:
rm phony targets in their clean, so this may also need some more clarification
Overall clarifications:
/var/temp/private/etc or http://127.0.0.1:XXXX
Overall, I think there was quite a bit of 'hand-holding' in 545A, with the checklists etc. It shouldn't be our responsibility if the students miss certain criteria for not reading carefully. I also think that the students should be able to just download our .Rmd files for assignments, and not need us to create a pull request for each assignment. Although if this is the direction we want to continue on, and if the course coordinator is git-savvy, this could be one of their consistent tasks.
Here's a fun dataset that would be useful to introduce dplyr and ggplot2 -- an analysis of pride parade entries across time: https://github.com/GaytaScience/PrideParades
Yulia recommends ProjectTemplate as an alternative topic to Makefiles.
Starting a thread to put ideas for next year.
Tibble Joins lecture
And, perhaps follow with a pivot_longer() -- as an alternative to the clunky bind_cols() then bind_rows() method.
General Idea
For the tibble joins lecture today, I did a little segue into a higher-level discussion about where tibble joins often show up in practice. I think it was useful for framing the topic, but it also brought up other aspects of writing a data analysis that are hard to treat as their own topic, such as using mass[1] vs unique(mass) to retrieve island land mass.
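To make the mass[1] vs unique(mass) point concrete, here is a small sketch with made-up data (not the lecture's dataset). Base R's merge() stands in for a dplyr join so the example has no package dependencies:

```r
# Made-up example data (not the lecture's dataset).
islands   <- data.frame(island = c("A", "B"), mass = c(10, 25))
sightings <- data.frame(island = c("A", "A", "B"), count = c(3, 5, 2))

# Base R merge() stands in for a dplyr join here.
joined <- merge(sightings, islands, by = "island")

# After the join, `mass` repeats once per matching row.
mass_a <- joined$mass[joined$island == "A"]

mass_a[1]        # works, but silently assumes every value is the same
unique(mass_a)   # states the assumption; length > 1 would reveal a problem

# Making the assumption explicit:
stopifnot(length(unique(mass_a)) == 1)
```

The explicit check fails loudly if the join ever produces conflicting land masses for one island, whereas mass[1] would quietly return the first one.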