RhoBott commented 3 years ago

Team!

Right now the below is what we have for the structure of the new-and-improved R help pages.

Questions: Is this structure effective in grouping like content together? Is everything where it belongs? Do we have all of the content we need? ... and then we can worry about more structural issues like "long pages w/ anchors" vs "landing pages w/ menus/indexes"

Getting Started with R at Reed

Reed RStudio Server

Installing Libraries

Uploading Data to the Reed RStudio Server

Jointly Editing Documents

The server is broken / down / having problems

Error occurred during transmission (broken sessions)

Desktop R

Downloading + installing for MacOSX

Downloading + installing for Windows

Writing Your Thesis in `thesisdown`

Meet the Palmer Penguins

Loading Data

From a Package

From a .csv (with `readr`)

From Excel (with `readxl`)

From Google Sheets (with `googlesheets4`)

From the Internet (with `rvest`)

Wrangling Data

Tidy Data Principles, Reshaping Data, and `tidyr`

Example: penguin body mass and `group_by()`

Restructuring with pivot_wider()

Restructuring with pivot_longer()

Transforming Data With `dplyr`

1. Filtering Rows with `filter()`

2. Arranging Rows with `arrange()`

3. Selecting Columns with `select()`

4. Creating New Columns with `mutate()`

5. Summarizing Data with `group_by()` and `summarize()`

6. Frequency Tables with `count()`

strings with `stringr`

probably break out these examples by commands???

Data Frames and `tibble`s

Presenting and Visualizing Data

Intro to data visualization with `ggplot2`

Scatterplots

Linegraphs

Barplots

Histograms

Boxplots

More Aesthetics and Geoms

Alpha: adjusting transparency

Faceting: small multiples

other commonly-used-geoms-maybe

consider adding geom_smooth() here
consider geom_jitter() and why you would/not use it
consider linking to info on mosaic plots
how to combine geometries
error bars

Making Summary Tables in R

-- maybe revisit this? Does it belong under data visualization?

Writing Up Your Results

-- revisit text, see notes to mlabbies. Also, questions above.

More Resources

-mlabbies, revisit this?

Additional online resources

R package Cheatsheets and Documentation

Textbooks

Hive mind: forums and asking questions

Finding answers in forums

Asking for help online

simonpcouch commented 3 years ago

As for "probably break out these examples by commands???" re: stringr, I usually don't think of that package as having "core verbs". I think the subsections with specific verbs are helpful for dplyr because knowing those 6 gets you 90% of the way there in terms of using dplyr well, but our goal with stringr should maybe be more along the lines of teaching folks to navigate the 30 `str*` functions that could be helpful. maybe that has nothing to do with headers, though ¯\_(ツ)_/¯
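A rough sketch of what "navigating the `str*` functions" could look like in practice, assuming stringr is installed. The `species` vector is a made-up example, not text from the D@R pages.

```r
library(stringr)

# Hypothetical example vector (note the inconsistent case and spacing).
species <- c("Adelie Penguin", "Chinstrap penguin", "Gentoo  penguin")

# List every exported stringr function whose name starts with "str_".
str_functions <- ls("package:stringr", pattern = "^str_")
head(str_functions)

# A few common tasks, each handled by a different str_* function.
has_gentoo <- str_detect(species, "Gentoo")   # which elements mention Gentoo?
lowered <- str_to_lower(species)              # standardize case
cleaned <- str_squish(species)                # collapse repeated whitespace
short_names <- str_replace(cleaned, " [Pp]enguin$", "")  # drop the trailing word

short_names
```

Listing the functions first and then picking one per task is closer to how people tend to use stringr than memorizing a fixed set of verbs.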

Some discussion of dropping the data.frame vs tbl_df section in b7e75f3. I'm thumbs up on this if you agree, @shokatl!

I think I'm generally cozy with structure, things-belonging, completeness (save kbott's notes here), but i'm usually partial to longer webpages, so I understand if others feel breaking things up would make this more navigable!

Related-ish thought, after looking at this--probably worth a pass through section header levels to make sure we're using a number of ### that makes sense. :-)

RhoBott commented 3 years ago

re: stringr, point taken + sounds good. It may be useful (esp if these are longer pages) to have some sort of subheading that helps explain what stringr does -- since for the naive audience, they may/not know to go look at "string" when they want to know "how do i work with character/word data"

another question: order for the dplyr verbs? it may/not matter - i wasn't sure if that was an intentional order (most commonly used to least, most complex to least, etc) or not.

shokatl commented 3 years ago

I am good with taking out the data.frame versus tbl_df bit if you both are.

Yeah, the only slight change I might make to the dplyr verb order is putting arrange() after group_by() and summarize(), because that, to me, is when arrange usually comes in handy anyways. But other than that I think they are in the order that I learned them in... don't remember filter/select order.
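For illustration, here is a minimal sketch of `arrange()` coming after `group_by()` and `summarize()`, assuming dplyr is installed. The `penguins_toy` data frame and its values are made up and only stand in for the real penguins data.

```r
library(dplyr)

# Hypothetical toy data standing in for the penguins data set.
penguins_toy <- tibble(
  species = c("Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap"),
  body_mass_g = c(3700, 3800, 5000, 5200, 3600)
)

# Summarize first, then sort the summary so the heaviest species comes first.
mass_by_species <- penguins_toy %>%
  group_by(species) %>%
  summarize(mean_mass_g = mean(body_mass_g)) %>%
  arrange(desc(mean_mass_g))

mass_by_species
```

Sorting the summary rather than the raw rows is the case where `arrange()` tends to earn its place, which is one argument for introducing it after `summarize()`.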

I was unsure about where to put summary tables. It could go with frequency tables I guess, but to me kable and kableExtra are like "I need a pretty table to show people that's not just R's normal format" and that feels ~sort of~ like data visualization (at least presentation). But if there's another place that's better I'm not attached to it being here.

simonpcouch commented 3 years ago

This is more of a thing possibly worth discussion than a thing I have an issue with, but is web-scraping/rvest too advanced for these tutorials? I love the way it's presented right now, but I personally come across scraping pretty rarely.

I'm also unsure of how I feel about the example section at the beginning of Wrangling Data. I imagine it being somewhat overwhelming, especially with only some chunks having set echo = FALSE. If the goal of this section is to clarify what we mean by tidy data, I'd recommend something more abstract... possibly making use of Allison Horst's and Julie Lowndes' resources about tidy data principles. The raw .jpg files for that post are available on Allison's stats-illustrations repo. :-)

If we decide to keep that section, we should probably clarify "you can move from tidy to untidy and back again using tidyr."--this isn't always the case if any summarization has happened. (But maybe it's not "tidying" if it's not reversible? ¯\_(ツ)_/¯)
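A small sketch of the reversibility point, assuming dplyr and tidyr are installed. The `long` table and its numbers are invented for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical tidy (long) table: one row per species-year combination.
long <- tibble(
  species = c("Adelie", "Adelie", "Gentoo", "Gentoo"),
  year = c(2007, 2008, 2007, 2008),
  mean_mass_g = c(3700, 3750, 5070, 5020)
)

# Reshape to wide: one column per year.
wide <- long %>%
  pivot_wider(names_from = year, values_from = mean_mass_g)

# Reshape back to long. Column names come back as text, so convert year.
back_to_long <- wide %>%
  pivot_longer(cols = -species, names_to = "year", values_to = "mean_mass_g") %>%
  mutate(year = as.numeric(year))

# A pure reshape is reversible.
round_trip_ok <- isTRUE(all.equal(as.data.frame(long), as.data.frame(back_to_long)))

# After summarizing, the per-year values are gone and no pivot can restore them.
summarized <- long %>%
  group_by(species) %>%
  summarize(mean_mass_g = mean(mean_mass_g))
```

Here `round_trip_ok` holds because nothing was aggregated, while `summarized` no longer has a `year` column to pivot on.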

Also, I'm on board for all of @shokatl's recommendations!

zolli22 commented 3 years ago

a few small things:

the end of "getting started" and the beginning of "loading data" are very similar. not necessarily a bad thing, just something to note. maybe we can differentiate them more? or maybe the repetition is good.

I agree with @simonpcouch about the example in wrangling data- I think another example of "what is tidy data" "how do I know if my data is tidy or untidy" would be good, and I love the stats illustrations.

a thought I had re: visualizations-- the linegraphs are not very intuitive re: what they are showing. would it be worth it to add a sentence or two after each visualization saying something along the lines of "this is what this graph is showing/the relationship it's exploring/what it means" ? or maybe, that's not the point of this, we're just here to help people make the graphs, not tell them what it all means.

overall, I really like this!! I like the structure, I like how it flows, I think it's easy to navigate and find what you're looking for. I think it would be super helpful if I were starting to learn R.

shokatl commented 3 years ago

Yes, I agree^ about the tidy data example, it would be good to have a simpler example here.

one note to add to @zolli22 's comment about the graphs in the visualization section: maybe we could pick one graph (or add a small section) to show people how to customize their graphs (labs(), theme(), how to pick your own colors)?
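Something like the following could serve as a starting point for that customization example, assuming ggplot2 is installed. The data, labels, and color choices are placeholders, not what ended up on the page.

```r
library(ggplot2)

# Hypothetical toy data standing in for the penguins data set.
penguins_toy <- data.frame(
  species = c("Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap", "Chinstrap"),
  flipper_length_mm = c(181, 190, 217, 221, 195, 198),
  body_mass_g = c(3750, 3800, 5000, 5200, 3600, 3700)
)

p <- ggplot(penguins_toy,
            aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point(size = 3) +
  # Pick your own colors by mapping each level to a named color.
  scale_color_manual(values = c(Adelie = "darkorange",
                                Chinstrap = "purple",
                                Gentoo = "cyan4")) +
  # labs() sets the title, axis labels, and legend title.
  labs(title = "Body mass by flipper length",
       x = "Flipper length (mm)",
       y = "Body mass (g)",
       color = "Species") +
  # Start from a complete theme, then adjust individual elements with theme().
  theme_minimal() +
  theme(legend.position = "bottom")

p
```

Putting `theme()` after `theme_minimal()` matters, since a complete theme would otherwise overwrite the individual tweak.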

joshyam-k commented 3 years ago

@simonpcouch I have similar wonderings in regards to the rvest section. I love the idea of having some documentation for it on D@R, but I keep going back and forth between whether we should include a small tutorial for it or just include a link to one of the (many) blog posts that have been written about it.

I really like @shokatl 's idea of including an example showing off labs() and theme()!

After reading some of @RhoBott 's comments on the "writing up your results" I'm realizing that some of my writing feels a little bit cult-ish(?) as if this is how writing and sharing code has to be done. I think it could be useful to rewrite/reword the sections so that they explain what the advantages could be with using .rmd and .R, but I want to avoid making it sound so "my way or the highway"-ey. I'd love to see another mlabbies pass at this :).

One last thing that I think could be useful is a section on the Data-wrangling page that talks about joins. I feel like I get questions regarding joining data sets frequently, so it could be nice to include some documentation on that!
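As a sketch of what a joins section might cover, assuming dplyr is installed. Both tables, including the `site_code` column, are hypothetical.

```r
library(dplyr)

# Hypothetical measurements table; "Elsewhere" has no match in the lookup table.
measurements <- tibble(
  penguin_id = c(1, 2, 3, 4),
  island = c("Torgersen", "Biscoe", "Dream", "Elsewhere"),
  body_mass_g = c(3750, 5000, 3600, 4100)
)

# Hypothetical lookup table keyed by island.
islands <- tibble(
  island = c("Torgersen", "Biscoe", "Dream"),
  site_code = c("T", "B", "D")
)

# left_join() keeps every row of measurements; unmatched rows get NA.
left <- left_join(measurements, islands, by = "island")

# inner_join() keeps only rows with a match in both tables.
inner <- inner_join(measurements, islands, by = "island")

# anti_join() shows which rows failed to match, which helps with debugging.
unmatched <- anti_join(measurements, islands, by = "island")
```

Comparing row counts across the three results is a quick way to see how many rows a join dropped or left unmatched.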

RhoBott commented 3 years ago

crew! this is excellent; thank you. i've made some notes + will pick this up tomorrow morning (first on shift = first to bounce ideas: @simonpcouch ). Some quick responses below; again, watch for direction tomorrow a.m.

@joshyam-k , I think the RMD / R discussion is good -- what I was really looking for is more of the "why" (why is it useful, why is it ideal), along with my own belief-system-statement that R scripts can be chock-full-o-text (which is maybe my longstanding love of scripts in the face of the prettier RMDs)

@shokatl I like the idea of adding some demonstrations of how to customize common aspects of a graph (colors, labs(), theme(), etc); @zolli22 I'm on the fence on explaining graphs -- it can be useful, and also - we do usually try to leave that interpretation aspect of things to students/faculty (we are the methods folks / we are the mechanics)

I am also up for: a simpler discussion of wrangling/tidying, addition of useful visuals to same, and putting summary tables wherever y'all think is a good proposal of location.

simonpcouch commented 3 years ago

wheeee okay, let's do this! kbott sent me a summary of changes for us to make yesterday evening, so i'll drop that list as a checklist in this comment, and you all should have access to check things off as you work. feel free to pick a few to work through as you so please. :-) feel free to commit directly to main and make sure to tag this issue by adding (#42) to your commit message. the process could look something like:

in the Git panel on RStudio, switch to main under "REMOTE: ORIGIN"
pull (blue down arrow)
make your changes
commit + push :-)

changes to make:

[x] consistency in how many #### are used for sections
[x] drop data.frame vs tbl_df section
[x] doublecheck dplyr verb order
[x] where do summary tables go?
[x] possibly relocate rvest
[x] wrangling data / tidy section; clarify + rewrite / etc
[x] graph customization (re: leila's comment)
[x] rewrite of RMD / R - results presentation
[x] add section on joins
[ ] "other geoms?" section, add: geom_smooth, geom_jitter, mosaic plots, ?waffle plots? ... [others]

feel free to ping me on slack if there's anything i should add/remove from this list.

zolli22 commented 3 years ago

I added a small section on graph customization, but someone should look over it/edit it/add to it, so I'm not gonna check it off the list.

shokatl commented 3 years ago

Changed RMD/R Script section
Restructured some things so that all of the headings are one of three levels (#, ###, or ####), previously there were a few sections that were #####
Took out the numbers in front of the dplyr verbs (they don’t necessarily need to happen in this order, and it was the only section that had numbers in the section headings which felt out of place)
Standardized capitalization for headings (only page headings (Wrangling Data, More Resources) have each word capitalized, otherwise it’s sentence case)
Added “Spatial data with ggmap” section after mosaic plot section, felt appropriate and felt weird to have just the mosaic plot section and nothing else. This section with mosaic plots and ggmap feels like an “extension packages of ggplot2” section to me, which I think is cool and might hint to people that there’s even more out there

Still to do (I'll be working on this today):

[x] Continue to finalize "other geoms" or "more with ggplot2" section (violin plots, waffle plots, notched boxplots, finalizing descriptions in this section)
[x] finalize R/RMD section
[x] Add a sentence maybe using page anchors to refer people to plots in visualization section where we show how to do color/shape/fill/line type
[x] Break into tiny sections

shokatl commented 3 years ago

Things currently being worked on:

[x] Combining filter/select into one section "subsetting data", using the word "subset" in place of either filter or select
[x] Changing language in titles of the sections about stringr, more aesthetics, and more geoms to be more intuitive/understandable to newcomers (same with "more with ggplot2" section)
[x] Final edits on RMD/R Script section
[x] Put together an outline of what the links will look like on the home page
[x] Knit all docs and make sure things are printing as they should, not showing errors, etc

shokatl commented 3 years ago

Here is the outline of what links will look like!

RhoBott commented 3 years ago

closing, since D@R R3.0 is up!

RhoBott / data_at_reed

D@R R pages:: final content check + structure! #42
