RhoBott / data_at_reed

A repository to manage the D@R update!

D@R R pages:: final content check + structure! #42

Closed RhoBott closed 3 years ago

RhoBott commented 3 years ago

Team!

Right now the below is what we have for the structure of the new-and-improved R help pages.

Questions: Is this structure effective in grouping like content together? Is everything where it belongs? Do we have all of the content we need? ... and then we can worry about more structural issues like "long pages w/ anchors" vs "landing pages w/ menus/indexes"

Getting Started with R at Reed

Reed RStudio Server

Installing Libraries

Uploading Data to the Reed RStudio Server

Jointly Editing Documents

The server is broken / down / having problems

Error occurred during transmission (broken sessions)

Desktop R

Downloading + installing for MacOSX

Downloading + installing for Windows

Writing Your Thesis in thesisdown

Meet the Palmer Penguins

Loading Data

From a Package

From a .csv (with readr)

From Excel (with readxl)

From Google Sheets (with googlesheets4)

From the Internet (with rvest)

Wrangling Data

Tidy Data Principles, Reshaping Data, and tidyr

Example: penguin body mass and group_by()

Restructuring with pivot_wider()

Restructuring with pivot_longer()

Transforming Data With dplyr

1. Filtering Rows with filter()

2. Arranging Rows with arrange()

3. Selecting Columns with select()

4. Creating New Columns with mutate()

5. Summarizing Data with group_by() and summarize()

6. Frequency Tables with count()

strings with stringr

probably break out these examples by commands???

Data Frames and tibbles

Presenting and Visualizing Data

Intro to data visualization with ggplot2

Scatterplots

Linegraphs

Barplots

Histograms

Boxplots

More Aesthetics and Geoms

Alpha: adjusting transparency
Faceting: small multiples
other commonly-used-geoms-maybe

More Resources

-mlabbies, revisit this?

Additional online resources

R package Cheatsheets and Documentation

Textbooks

Hive mind: forums and asking questions

Finding answers in forums

Asking for help online

simonpcouch commented 3 years ago

As for "probably break out these examples by commands???" re: stringr, I usually don't think of that package as having "core verbs". I think the subsections with specific verbs are helpful for dplyr because knowing those 6 gets you 90% of the way there in terms of using dplyr well, but our goal with stringr should maybe be more along the lines of teaching folks to navigate the 30 `str*` functions that could be helpful. maybe that has nothing to do with headers, though ¯\_(ツ)_/¯
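A rough sketch of what "navigating" stringr could look like in practice, rather than teaching a fixed set of verbs. The example strings mirror the species labels in the raw penguins data; which functions to feature is only an assumption here, not a decision from this thread.

```r
library(stringr)

# Species labels in the style of the raw penguins data
species <- c(
  "Adelie Penguin (Pygoscelis adeliae)",
  "Gentoo penguin (Pygoscelis papua)"
)

# Does each string contain a pattern?
has_gentoo <- str_detect(species, "Gentoo")

# Pull out the first word (the common name)
common_name <- str_extract(species, "^\\w+")

# Drop the parenthesised scientific name
short_label <- str_replace(species, " \\(.*\\)", "")

# Standardise capitalisation
lower_label <- str_to_lower(short_label)

# One way to browse what else is available: list the exported str_ functions
str_functions <- ls("package:stringr", pattern = "^str_")
head(str_functions)
```

Listing the `str_` functions (or typing `str_` and using autocomplete in RStudio) is one way to point readers at the rest of the package without a subsection per function.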

Some discussion of dropping the data.frame vs tbl_df section in b7e75f3. I'm thumbs up on this if you agree, @shokatl!

I think I'm generally cozy with structure, things-belonging, completeness (save kbott's notes here), but i'm usually partial to longer webpages, so I understand if others feel breaking things up would make this more navigable!

Related-ish thought, after looking at this--probably worth a pass through section header levels and make sure we're using a number of ### that makes sense. :-)

RhoBott commented 3 years ago

re: stringr, point taken + sounds good. It may be useful (esp if these are longer pages) to have some sort of subheading that helps explain what stringr does -- since for the naive audience, they may/not know to go look at "string" when they want to know "how do i work with character/word data"

another question: order for the dplyr verbs? it may/not matter - i wasn't sure if that was an intentional order (most commonly used to least, most complex to least, etc) or not.

shokatl commented 3 years ago

I am good with taking out the data.frame versus tbl_df bit if you both are.

Yeah, the only slight change I might make to the dplyr verb order is putting arrange() after group_by() and summarize(), because that, to me, is when arrange usually comes in handy anyways. But other than that I think they are in the order that I learned them in... don't remember filter/select order.
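To illustrate why arrange() tends to come in handy after group_by() and summarize(), here is a minimal sketch. The `toy_penguins` table is a made-up stand-in with invented values, not the real penguins data.

```r
library(dplyr)

# Made-up stand-in for the penguins data; the values are invented for illustration
toy_penguins <- tibble(
  species = c("Adelie", "Adelie", "Gentoo", "Gentoo", "Chinstrap"),
  body_mass_g = c(3700, 3800, 5000, 5200, 3600)
)

# Summarize first, then sort the summary so the heaviest species comes first
mass_by_species <- toy_penguins %>%
  group_by(species) %>%
  summarize(mean_mass_g = mean(body_mass_g), n = n()) %>%
  arrange(desc(mean_mass_g))

mass_by_species
```

Sorting the raw rows is rarely what a reader wants to look at, while sorting a short summary table usually is, which is one argument for introducing arrange() later in the sequence.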

I was unsure about where to put summary tables. It could go with frequency tables I guess, but to me kable and kableExtra are like "I need a pretty table to show people that's not just R's normal format" and that feels ~sort of~ like data visualization (at least presentation). But if there's another place that's better I'm not attached to it being here.

simonpcouch commented 3 years ago

This is more of a thing possibly worth discussion than a thing I have an issue with, but is web-scraping/rvest too advanced for these tutorials? I love the way it's presented right now, but I personally come across scraping pretty rarely.

I'm also unsure of how I feel about the example section at the beginning of Wrangling Data. I imagine it being somewhat overwhelming, especially with only some chunks having set echo = FALSE. If the goal of this section is to clarify what we mean by tidy data, I'd recommend something more abstract... possibly making use of Allison Horst's and Julie Lowndes' resources about tidy data principles. The raw .jpg files for that post are available on Allison's stats-illustrations repo. :-)

If we decide to keep that section, we should probably clarify "you can move from tidy to untidy or and back again using tidyr."--this isn't always the case if any summarization has happened. (But maybe it's not "tidying" if it's not reversible? ¯\_(ツ)_/¯)
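A small sketch of the reversibility point, using an invented table of counts (the numbers are made up for illustration). Reshaping with pivot_wider() and pivot_longer() round-trips, but once rows have been summarized the original detail is gone.

```r
library(dplyr)
library(tidyr)

# Invented counts, one row per species and year
long <- tibble(
  species = c("Adelie", "Adelie", "Gentoo", "Gentoo"),
  year = c(2007, 2008, 2007, 2008),
  count = c(50, 50, 34, 46)
)

# Tidy -> wide: one column per year
wide <- long %>%
  pivot_wider(names_from = year, values_from = count)

# Wide -> tidy again; column names come back as text, so convert year to numeric
back <- wide %>%
  pivot_longer(cols = -species, names_to = "year", values_to = "count") %>%
  mutate(year = as.numeric(year))

# The round trip recovers the original table
all.equal(as.data.frame(long), as.data.frame(back))

# Summarizing is different: the per-year counts cannot be rebuilt from this
totals <- long %>%
  group_by(species) %>%
  summarize(total = sum(count))
```

The reshaping steps only move values between rows and columns, while summarize() collapses rows, so no tidyr call can reconstruct `long` from `totals`.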

Also, I'm on board for all of @shokatl's recommendations!

zolli22 commented 3 years ago

a few small things:

the end of "getting started" and the beginning of "loading data" are very similar. not necessarily a bad thing, just something to note. maybe we can differentiate them more? or maybe the repetition is good.

I agree with @simonpcouch about the example in wrangling data- I think another example of "what is tidy data" "how do I know if my data is tidy or untidy" would be good, and I love the stats illustrations.

a thought I had re: visualizations-- the linegraphs are not very intuitive re: what they are showing. would it be worth it to add a sentence or two after each visualization saying something along the lines of "this is what this graph is showing/the relationship it's exploring/what it means"? or maybe, that's not the point of this, we're just here to help people make the graphs, not tell them what it all means.

overall, I really like this!! I like the structure, I like how it flows, I think it's easy to navigate and find what you're looking for. I think it would be super helpful if I were starting to learn R.

shokatl commented 3 years ago

Yes, I agree^ about the tidy data example, it would be good to have a simpler example here.

one note to add to @zolli22 's comment about the graphs in the visualization section: maybe we could pick one graph (or add a small section) to show people how to customize their graphs (labs(), theme(), how to pick your own colors)?
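A possible shape for that customization example, as a sketch only. The data frame is a made-up stand-in with invented values, and the colors and theme choices are arbitrary assumptions rather than anything decided in this thread.

```r
library(ggplot2)

# Made-up stand-in for the penguins data; the values are invented for illustration
toy_penguins <- data.frame(
  species = c("Adelie", "Adelie", "Chinstrap", "Chinstrap", "Gentoo", "Gentoo"),
  flipper_length_mm = c(185, 192, 195, 200, 215, 222),
  body_mass_g = c(3600, 3800, 3650, 3900, 5000, 5300)
)

p <- ggplot(
  toy_penguins,
  aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
  geom_point() +
  # Pick your own colors: a named vector maps each species to a color
  scale_color_manual(
    values = c(Adelie = "darkorange", Chinstrap = "purple", Gentoo = "cyan4")
  ) +
  # labs() sets the title, axis labels, and legend title
  labs(
    title = "Flipper length and body mass",
    x = "Flipper length (mm)",
    y = "Body mass (g)",
    color = "Species"
  ) +
  # A complete theme first, then individual tweaks with theme()
  theme_minimal() +
  theme(legend.position = "bottom")

p
```

Building on a single scatterplot keeps the focus on the three customization pieces (colors, labs(), theme()) instead of on a new geom.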

joshyam-k commented 3 years ago

@simonpcouch I have similar wonderings in regards to the rvest section. I love the idea of having some documentation for it on D@R, but I keep going back and forth between whether we should include a small tutorial for it or just include a link to one of the (many) blog posts that have been written about it.

I really like @shokatl 's idea of including an example showing off labs() and theme()!

After reading some of @RhoBott 's comments on the "writing up your results" I'm realizing that some of my writing feels a little bit cult-ish(?) as if this is how writing and sharing code has to be done. I think it could be useful to rewrite/reword the sections so that they explain what the advantages could be with using .rmd and .R, but I want to avoid making it sound so "my way or the highway"-ey. I'd love to see another mlabbies pass at this :).

One last thing that I think could be useful is a section on the Data-wrangling page that talks about joins. I feel like I get questions regarding joining data sets frequently, so it could be nice to include some documentation on that!
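For a sense of scale, a joins section might start from something as small as the sketch below. Both tables are hypothetical and their values invented; `field_code` is just an illustrative lookup column.

```r
library(dplyr)

# Hypothetical summary table, one row per species (values invented)
measurements <- tibble(
  species = c("Adelie", "Chinstrap", "Gentoo"),
  mean_mass_g = c(3700, 3730, 5080)
)

# Hypothetical lookup table; note it lacks Chinstrap and has an extra species
codes <- tibble(
  species = c("Adelie", "Gentoo", "Emperor"),
  field_code = c("ADE", "GEN", "EMP")
)

# Keep every row of measurements; unmatched rows get NA for field_code
left <- left_join(measurements, codes, by = "species")

# Keep only rows that have a match in both tables
inner <- inner_join(measurements, codes, by = "species")

# Rows of measurements with no match in codes (handy for debugging a join)
unmatched <- anti_join(measurements, codes, by = "species")
```

Showing left_join() next to inner_join() and anti_join() on the same two tables covers the most common question: why rows appear, disappear, or come back with NA after a join.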

RhoBott commented 3 years ago

crew! this is excellent; thank you. i've made some notes + will pick this up tomorrow morning (first on shift = first to bounce ideas: @simonpcouch ). Some quick responses below; again, watch for direction tomorrow a.m.

@joshyam-k , I think the RMD / R discussion is good -- what I was really looking for is more of the "why" (why is it useful, why is it ideal), along with my own belief-system-statement that R scripts can be chock-full-o-text (which is maybe my longstanding love of scripts in the face of the prettier RMDs)

@shokatl I like the idea of adding some demonstrations of how to customize common aspects of a graph (colors, labs(), theme(), etc); @zolli22 I'm on the fence on explaining graphs -- it can be useful, and also - we do usually try to leave that interpretation aspect of things to students/faculty (we are the methods folks / we are the mechanics)

I am also up for: a simpler discussion of wrangling/tidying, addition of useful visuals to same, and putting summary tables wherever y'all think is a good proposal of location.

simonpcouch commented 3 years ago

wheeee okay, let's do this! kbott sent me a summary of changes for us to make yesterday evening, so i'll drop that list as a checklist in this comment, and you all should have access to check things off as you work. feel free to pick a few to work through as you so please. :-) feel free to commit directly to main and make sure to tag this issue by adding (#42) to your commit message. the process could look something like:

changes to make:

feel free to ping me on slack if there's anything i should add/remove from this list.

zolli22 commented 3 years ago

I added a small section on graph customization, but someone should look over it/edit it/add to it, so I'm not gonna check it off the list.

shokatl commented 3 years ago

Still to do (I'll be working on this today):

shokatl commented 3 years ago

Things currently being worked on:

shokatl commented 3 years ago

Here is the outline of what links will look like!

RhoBott commented 3 years ago

closing, since D@R R3.0 is up!