cssearcy / AYS-R-Coding-SPR-2020

Coding in R for Policy Analytics
https://cssearcy.github.io/AYS-R-Coding-SPR-2020/

Lab 03, Question 7 (Part 1) #13

Open adrianc09 opened 4 years ago

adrianc09 commented 4 years ago

Hi everyone! Our book talks about highlighting three specific outliers, but I'm having some trouble emphasizing the 1924 and 2000 points on question 7. Maybe it's because we have to emphasize multiple points that fit into the years 1924 and 2000. I may be forgetting something that we learned previously. If anyone could offer a hint of some sort that'd be great!

lecy commented 4 years ago

Hi Adrian -

It's helpful if you can provide a little more context. You can copy and paste images from the text into these discussions.

Your current code is also welcome so we can see where you are going wrong. Remember to use "fences" ``` around the code so it formats correctly. The "preview" tab above ☝ is helpful for checking formatting before posting.

# fences example
plot( x, y )

Jesse

adrianc09 commented 4 years ago

Thank you for getting back to me so soon, Jesse!

[image]

This is the plot I have now, and this is one of the many versions of the code I've tried:

points( Teams$year [1924 & 2000],
        col="firebrick", cex=2, lwd=2 ).

I guess my issue is actually identifying the points I want to highlight.

lecy commented 4 years ago

Note that the subset operator [] accepts a logical vector. It keeps the elements where the vector is TRUE and drops the ones where it is FALSE:

x <- c("A","B","C")
x[ c(FALSE,TRUE,FALSE) ]
[1] "B"

To select elements you need to use a logical statement inside a subset operator:

x[ x == "A" | x == "C" ]
# same as x[ c( TRUE, FALSE, TRUE ) ]
[1] "A" "C"

You can also reference by position:

x[ c( 1, 3 ) ]
[1] "A" "C"
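The logical and positional forms are two views of the same selection. Here is a minimal sketch, reusing the `x` vector from above, where `which()` converts a logical test into positions (the name `keep` is just for illustration):

```r
x <- c("A","B","C")

# logical test: one TRUE/FALSE per element of x
keep <- x == "A" | x == "C"
keep
# [1]  TRUE FALSE  TRUE

# which() reports the positions of the TRUE values
which( keep )
# [1] 1 3

# both forms select the same elements
identical( x[ keep ], x[ which( keep ) ] )
# [1] TRUE
```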

Try this by itself - what are you getting here?

Teams$year [1924 & 2000 ]

adrianc09 commented 4 years ago

When I do the logical statement inside a subset operator by itself, my graph stays the same and the console writes out the following:

   [1] 1871 1871 1871 1871 1871 1871 1871 1871
   [9] 1871 1872 1872 1872 1872 1872 1872 1872
  [17] 1872 1872 1872 1872 1873 1873 1873 1873... 
and goes on until I receive this message:
"[ reached getOption("max.print") -- omitted 1925 entries ]".

I'm trying to see if I can write the logical statement into another way that can help me emphasize my subset of data on my graph. Any ideas that can lead me in the right direction? Thanks again!

lecy commented 4 years ago

I believe the problem is you don't have a complete logical statement inside the subset. The logical statement should return TRUE or FALSE and should have the form:

variable  logical-operator  values-of-interest
x > 10 
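This is also why `Teams$year [1924 & 2000 ]` printed every year instead of a subset. A small sketch with a made-up `year` vector (not the real Teams data) shows what R does with an incomplete statement: `&` treats any non-zero number as TRUE, and a single TRUE is recycled across the whole vector.

```r
# Toy vector of years; made-up values, not the Lahman data
year <- c( 1871, 1924, 1924, 1950, 2000, 2000 )

# `&` coerces numbers to logical: any non-zero number counts as TRUE
1924 & 2000
# [1] TRUE

# a single TRUE is recycled over every element, so nothing is filtered out
year[ 1924 & 2000 ]
# [1] 1871 1924 1924 1950 2000 2000

# which is the same as asking for
year[ TRUE ]
# [1] 1871 1924 1924 1950 2000 2000
```

Since the variable never appears inside the brackets, R has nothing to compare the values 1924 and 2000 against.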

You name the variable explicitly, rather than referencing values directly, because you often select from one variable based on criteria from another.

x[ > 10 ]   # incorrect
x[ x > 10 ]  # correct 

For example:

id <- paste0( "id-100", 1:4 )
group <- c("treatment","control","treatment","control")
gender <- c("male","male","female","female")

id
[1] "id-1001" "id-1002" "id-1003" "id-1004"

# asking for the ID directly
id[ id == "id-1002" ]
[1] "id-1002"

# asking for the ID of study participants that meet criteria
 id[ group == "treatment" ]
[1] "id-1001" "id-1003"

id[ group == "treatment" & gender == "male" ]
[1] "id-1001"
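The same rule applies when you combine conditions: each side of `&` or `|` has to be a complete statement of its own. Below is a rough sketch using made-up `year` and `ave.so` values (stand-ins, not the real Teams data). No single season can be both 1924 and 2000, so the two tests are joined with OR rather than AND.

```r
# Made-up seasons for illustration; not the Lahman Teams data
year   <- c( 1923, 1924, 1924, 1999, 2000, 2000 )
ave.so <- c(  2.7,  2.6,  2.8,  6.4,  6.5,  6.3 )

# only the left side is a complete statement here, so this is read as
# ( year == 1924 ) & TRUE and keeps 1924 alone
year == 1924 & 2000
# [1] FALSE  TRUE  TRUE FALSE FALSE FALSE

# repeat the variable on both sides and join the tests with OR
these <- year == 1924 | year == 2000
these
# [1] FALSE  TRUE  TRUE FALSE  TRUE  TRUE

# %in% is a shorthand for the same test
identical( these, year %in% c( 1924, 2000 ) )
# [1] TRUE

# the same logical vector can subset a second variable
ave.so[ these ]
# [1] 2.6 2.8 6.5 6.3
```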

jamisoncrawford commented 4 years ago

Thanks for your help Dr. @lecy!

@adrianc09 were you able to resolve this?

adrianc09 commented 4 years ago

@jamisoncrawford Thanks to Dr. @lecy's help, I was able to highlight all points in 1924. However, I'm trying to figure out how my code can incorporate the points in 2000 simultaneously. This is the code I used to highlight:

these.year <- Teams$yearID == 1924 & 2000
points( year[ these.year ], ave.so[ these.year ], pch=17, cex=3, col="firebrick" )

And this is what my graph looks like so far:

[image]

I'm trying to figure out what I can change to highlight both groups. Thank you very much for your help, Dr. @lecy!

adrianc09 commented 4 years ago

I actually just went back to read Dr. @lecy's responses one more time and was able to figure out my error; both groups are highlighted now!

jamisoncrawford commented 4 years ago

@adrianc09 awesome!

Sorry I was quite swamped earlier in the week - I tend to be more responsive, especially to posts on GitHub. (It's tied to my personal email so I get mobile notifications, unlike my GSU email).

So one thing to add is that you need to change the arguments in your visualization functions so that they resemble the NYT graphic. E.g. pch = and cex = in your points() call, placing the y-axis on the left-hand side, etc. That's really the part that stretches you in terms of learning more about these functions and how to modify them!
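As a rough sketch of the kind of tweaking I mean, here is a base-graphics example with made-up data. The specific `pch`, `cex`, and color values are placeholders to experiment with, not the settings the lab expects:

```r
# Made-up data standing in for the lab's year and ave.so vectors
year   <- c( 1920, 1924, 1924, 1960, 2000, 2000, 2010 )
ave.so <- c(  2.9,  2.6,  2.8,  5.2,  6.5,  6.3,  7.1 )
these  <- year %in% c( 1924, 2000 )

# base layer: small gray points, default axes and labels suppressed
plot( year, ave.so, pch=19, cex=0.8, col="gray70",
      axes=FALSE, xlab="", ylab="" )

# draw the axes by hand: side=1 is the bottom, side=2 is the left
# (side=4 would put the axis on the right)
axis( side=1 )
axis( side=2, las=1 )   # las=1 turns the tick labels horizontal

# highlight layer: pch picks the symbol, cex scales its size
points( year[ these ], ave.so[ these ],
        pch=19, cex=1.5, col="firebrick" )
```

Running `?points` and `?par` lists the available symbols and graphical parameters.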

The video walkthrough does a decent job walking through these:

https://youtu.be/unFbaAhgF2E