Closed RhoBott closed 3 years ago
As for "probably break out these examples by commands???" re: `stringr`, I usually don't think of that package as having "core verbs". I think the subsections with specific verbs are helpful for dplyr because knowing those 6 gets you 90% of the way there in terms of using dplyr well, but our goal with stringr should maybe be more along the lines of teaching folks to navigate the 30 `str_*` functions that could be helpful. maybe that has nothing to do with headers, though ¯\_(ツ)_/¯
Some discussion of dropping the `data.frame` vs `tbl_df` section in b7e75f3. I'm thumbs up on this if you agree, @shokatl!
I think I'm generally cozy with structure, things-belonging, completeness (save kbotts notes here), but i'm usually partial to longer webpages, so I understand if others feel breaking things up would make this more navigable!
Related-ish thought, after looking at this--probably worth a pass through section header levels to make sure we're using a number of `###` that makes sense. :-)
re: `stringr`, point taken + sounds good. It may be useful (esp if these are longer pages) to have some sort of subheading that helps explain what `stringr` does -- since for the naive audience, they may/not know to go look at "string" when they want to know "how do i work with character/word data"
another question: order for the `dplyr` verbs? it may/not matter - i wasn't sure if that was an intentional order (most commonly used to least, most complex to least, etc) or not.
I am good with taking out the `data.frame` versus `tbl_df` bit if you both are.
Yeah, the only slight change I might make to the dplyr verb order is putting `arrange()` after `group_by()` and `summarize()`, because that, to me, is when arrange usually comes in handy anyways. But other than that I think they are in the order that I learned them in... don't remember filter/select order.
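A minimal sketch of the ordering point above, where `arrange()` is applied to a grouped summary. The `penguins_mini` data and its values are made up for illustration; they are not the real Palmer Penguins data.

```r
library(dplyr)

# Tiny hypothetical data set (values invented for illustration only)
penguins_mini <- tibble(
  species     = c("Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap"),
  body_mass_g = c(3700, 3800, 5000, 5200, 3750)
)

# Summarize first, then sort the summary: arrange() is most useful here,
# once there is one row per group to put in order
mass_by_species <- penguins_mini %>%
  group_by(species) %>%
  summarize(mean_mass = mean(body_mass_g)) %>%
  arrange(desc(mean_mass))

mass_by_species
```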
I was unsure about where to put summary tables. It could go with frequency tables I guess, but to me `kable` and `kableExtra` are like "I need a pretty table to show people that's not just R's normal format" and that feels ~sort of~ like data visualization (at least presentation). But if there's another place that's better I'm not attached to it being here.
This is more of a thing possibly worth discussion than a thing I have an issue with, but is web-scraping/`rvest` too advanced for these tutorials? I love the way it's presented right now, but I personally come across scraping pretty rarely.
I'm also unsure of how I feel about the example section at the beginning of Wrangling Data. I imagine it being somewhat overwhelming, especially with only some chunks having set `echo = FALSE`. If the goal of this section is to clarify what we mean by tidy data, I'd recommend something more abstract... possibly making use of Allison Horst's and Julie Lowndes' resources about tidy data principles. The raw `.jpg` files for that post are available on Allison's `stats-illustrations` repo. :-)
If we decide to keep that section, we should probably clarify "you can move from tidy to untidy and back again using `tidyr`."--this isn't always the case if any summarization has happened. (But maybe it's not "tidying" if it's not reversible? ¯\_(ツ)_/¯)
Also, I'm on board for all of @shokatl's recommendations!
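A small sketch of the reversibility caveat raised above: a `pivot_wider()` / `pivot_longer()` round trip gets the original rows back, while a summarized table cannot be pivoted back into them. The `long` table and its values are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data (values invented for illustration only)
long <- tibble(
  id      = c(1, 1, 2, 2),
  measure = c("bill", "flipper", "bill", "flipper"),
  value   = c(39, 181, 46, 210)
)

# Reshaping alone is reversible: wide and back to long again
wide <- long %>%
  pivot_wider(names_from = measure, values_from = value)

back <- wide %>%
  pivot_longer(cols = c(bill, flipper), names_to = "measure", values_to = "value")

# Summarizing collapses rows; no pivot can recover the four original values
summarized <- long %>%
  group_by(measure) %>%
  summarize(mean_value = mean(value))
```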
a few small things:
the end of "getting started" and the beginning of "loading data" are very similar. not necessarily a bad thing, just something to note. maybe we can differentiate them more? or maybe the repetition is good.
I agree with @simonpcouch about the example in wrangling data- I think another example of "what is tidy data" "how do I know if my data is tidy or untidy" would be good, and I love the stats illustrations.
a thought I had re: visualizations-- the linegraphs are not very intuitive re: what they are showing. would it be worth it to add a sentence or two after each visualization saying something along the lines of "this is what this graph is showing/the relationship it's exploring/what it means"? or maybe, that's not the point of this, we're just here to help people make the graphs, not tell them what it all means.
overall, I really like this!! I like the structure, I like how it flows, I think it's easy to navigate and find what you're looking for. I think it would be super helpful if I were starting to learn R.
Yes, I agree^ about the tidy data example, it would be good to have a simpler example here.
one note to add to @zolli22 's comment about the graphs in the visualization section: maybe we could pick one graph (or add a small section) to show people how to customize their graphs (`labs()`, `theme()`, how to pick your own colors)?
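One possible shape for such a customization example, as a rough sketch only. The data frame, the label text, and the color choices are all made up for illustration; `labs()`, `scale_color_manual()`, `theme_minimal()`, and `theme()` are standard ggplot2 functions.

```r
library(ggplot2)

# Tiny hypothetical data set (values invented for illustration only)
penguins_mini <- data.frame(
  species           = c("Adelie", "Adelie", "Gentoo", "Gentoo"),
  flipper_length_mm = c(181, 190, 210, 220),
  body_mass_g       = c(3700, 3800, 5000, 5200)
)

p <- ggplot(penguins_mini,
            aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point() +
  # labs() sets the title, axis labels, and legend title
  labs(
    title = "Body mass vs. flipper length",
    x     = "Flipper length (mm)",
    y     = "Body mass (g)",
    color = "Species"
  ) +
  # scale_color_manual() is one way to pick your own colors
  scale_color_manual(values = c(Adelie = "darkorange", Gentoo = "steelblue")) +
  # theme_minimal() swaps the overall look; theme() tweaks individual pieces
  theme_minimal() +
  theme(legend.position = "bottom")

p
```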
@simonpcouch I have similar wonderings in regards to the `rvest` section. I love the idea of having some documentation for it on D@R, but I keep going back and forth between whether we should include a small tutorial for it or just include a link to one of the (many) blog posts that have been written about it.
I really like @shokatl 's idea of including an example showing off `labs()` and `theme()`!
After reading some of @RhoBott 's comments on the "writing up your results" I'm realizing that some of my writing feels a little bit cult-ish(?) as if this is how writing and sharing code has to be done. I think it could be useful to rewrite/reword the sections so that they explain what the advantages could be with using .rmd and .R, but I want to avoid making it sound so "my way or the highway"-ey. I'd love to see another mlabbies pass at this :).
One last thing that I think could be useful is a section on the Data-wrangling page that talks about joins. I feel like I get questions regarding joining data sets frequently, so it could be nice to include some documentation on that!
crew! this is excellent; thank you. i've made some notes + will pick this up tomorrow morning (first on shift = first to bounce ideas: @simonpcouch ). Some quick responses below; again, watch for direction tomorrow a.m.
@joshyam-k , I think the RMD / R discussion is good -- what I was really looking for is more of the "why" (why is it useful, why is it ideal), along with my own belief-system-statement that R scripts can be chock-full-o-text (which is maybe my longstanding love of scripts in the face of the prettier RMDs)
@shokatl I like the idea of adding some demonstrations of how to customize common aspects of a graph (colors, `labs()`, `theme()`, etc); @zolli22 I'm on the fence on explaining graphs -- it can be useful, and also - we do usually try to leave that interpretation aspect of things to students/faculty (we are the methods folks / we are the mechanics)
I am also up for: a simpler discussion of wrangling/tidying, addition of useful visuals to same, and putting summary tables wherever y'all think is a good proposal of location.
wheeee okay, let's do this! kbott sent me a summary of changes for us to make yesterday evening, so i'll drop that list as a checklist in this comment, and you all should have access to check things off as you work. feel free to pick a few to work through as you so please. :-) feel free to commit directly to `main` and make sure to tag this issue by adding `(#42)` to your commit message. the process could look something like:
[...] `main` under "REMOTE: ORIGIN"

changes to make:
[...] `data.frame` vs `tbl_df` section

feel free to ping me on slack if there's anything i should add/remove from this list.
I added a small section on graph customization, but someone should look over it/edit it/add to it, so I'm not gonna check it off the list.
[...] "`ggmap`" section after mosaic plot section, felt appropriate and felt weird to have just the mosaic plot section and nothing else. This section with mosaic plots and ggmap feels like an "extension packages of ggplot2" section to me, which I think is cool and might hint to people that there's even more out there

Still to do (I'll be working on this today):
Things currently being worked on:
closing, since D@R R3.0 is up!
Team!
Right now the below is what we have for the structure of the new-and-improved R help pages.
Questions: Is this structure effective in grouping like content together? Is everything where it belongs? Do we have all of the content we need? ... and then we can worry about more structural issues like "long pages w/ anchors" vs "landing pages w/ menus/indexes"
Getting Started with R at Reed
Reed RStudio Server
Installing Libraries
Uploading Data to the Reed RStudio Server
Jointly Editing Documents
The server is broken / down / having problems
Error occurred during transmission (broken sessions)
Desktop R
Downloading + installing for MacOSX
Downloading + installing for Windows
Writing Your Thesis in `thesisdown`
Meet the Palmer Penguins
Loading Data
From a Package
From a .csv (with `readr`)
From Excel (with `readxl`)
From Google Sheets (with `googlesheets4`)
From the Internet (with `rvest`)
Wrangling Data
Tidy Data Principles, Reshaping Data, and `tidyr`
Example: penguin body mass and `group_by()`
Restructuring with pivot_wider()
Restructuring with pivot_longer()
Transforming Data With `dplyr`
1. Filtering Rows with `filter()`
2. Arranging Rows with `arrange()`
3. Selecting Columns with `select()`
4. Creating New Columns with `mutate()`
5. Summarizing Data with `group_by()` and `summarize()`
6. Frequency Tables with `count()`
strings with `stringr`
probably break out these examples by commands???
Data Frames and `tibble`s
Presenting and Visualizing Data
Intro to data visualization with `ggplot2`
Scatterplots
Linegraphs
Barplots
Histograms
Boxplots
More Aesthetics and Geoms
Alpha: adjusting transparency
Faceting: small multiples
other commonly-used-geoms-maybe
Making Summary Tables in R
-- maybe revisit this? Does it belong under data visualization?
Writing Up Your Results
-- revisit text, see notes to mlabbies. Also, questions above.
More Resources
-mlabbies, revisit this?
Additional online resources
R package Cheatsheets and Documentation
Textbooks
Hive mind: forums and asking questions
Finding answers in forums
Asking for help online