edquant / edh7916

Course materials and website for EDH7916: Contemporary Research in Higher Education
https://edquant.github.io/edh7916/

How to mutate multiple values into 1 value? #32

Closed nszekeres closed 2 years ago

nszekeres commented 4 years ago

Hi @btskinner -

Thanks for the call Monday; learned a lot, despite the chaos :)

Am trying to fix q5 of the assignment, and everything works but 1 line of code... am trying to get all values from 6 to 11 to say 6 so I can compare to values greater than 12.

The greater than 12 part worked, but I'm spinning in circles trying to solve for those numbers 6 to 11... I've tried >5 & <12, I've tried simply turning each number (5 - 11) to 6, one by one... I've tried different "select" commands and using words for the new values instead of numbers and vice versa... I've tried assigning the current data to a new df... everything. Any help? Thanks, Naomi

# truncate data
df_q5 <- df_hs %>%
# mutate data for hs2ps na if neg, =0, >6, >12
  select (stu_id, x1ses, x4evratndclg, x4hs2psmos) %>%
  mutate (x4evratndclg_fixed = ifelse(x4evratndclg < 1, NA, x4evratndclg),
          x4hs2psmos_fixed = ifelse(x4hs2psmos < 0, NA, x4hs2psmos),
          x4hs2psmos_fixed = ifelse(x4hs2psmos_fixed > 5 & x4hs2psmos_fixed < 12, 6, x4hs2psmos), #this line is not working/rendering in the results -- why not?
          x4hs2psmos_fixed = ifelse(x4hs2psmos >= 12, 12, x4hs2psmos))
df_q5
count(df_q5, x4hs2psmos_fixed)
p <- ggplot(data = df_q5,
            mapping = aes(x = factor(x4hs2psmos_fixed),
                          y = x1ses,
                          fill = as_factor(x4hs2psmos_fixed))) +
  facet_wrap(~ x4evratndclg, ncol = 1) +
  geom_boxplot()
p
btskinner commented 4 years ago

@nszekeres: almost had the code blocks, but take a look at my changes. If you don't see the grey background like above when you Preview the post, then something isn't quite right.

It's a little difficult to see the issue without also seeing the output, but one thing: you are moving back and forth between your new variable, x4hs2psmos_fixed, and the original variable, x4hs2psmos, in the ifelse() statements. Is that what you want, or do you want to put x4hs2psmos_fixed in the third argument (ifelse(arg_1, arg_2, arg_3)) after you introduce it? In the line you say doesn't work, you've once again switched back to the original variable.
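To make the point about the third argument concrete, here is a minimal sketch in base R. The vector x and the names x_fixed, from_original, and from_fixed are hypothetical toy stand-ins, not the course data.

```r
# Hypothetical toy values standing in for a months variable
x <- c(-1, 3, 7, 15)

# ifelse(test, yes, no): where test is TRUE take yes, otherwise take no
x_fixed <- ifelse(x < 0, NA, x)

# Third argument is the ORIGINAL variable: the NA recode from above is lost
from_original <- ifelse(x >= 12, 12, x)
from_original   # -1  3  7 12

# Third argument is the NEW variable: the earlier recode is carried forward
from_fixed <- ifelse(x_fixed >= 12, 12, x_fixed)
from_fixed      # NA  3  7 12
```

Whatever sits in the third argument is what every non-matching row falls back to, so it decides whether earlier recodes survive.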

nszekeres commented 4 years ago

@btskinner - Hi, thanks. Noted on the code blocks... I was trying to get my question out as the monsters were waking -- time was definitely up!

Yes, I tried changing the x4hs2psmos_fixed variable because when I used the original variable x4hs2psmos it didn't work... I was trying to see if I could get R to recognize it somehow - or even throw an error message, which it did not. :(

If I change it:

df_q5 <- df_hs %>%
# mutate data for hs2ps na if neg, =0, >6, >12
  select (stu_id, x1ses, x4evratndclg, x4hs2psmos) %>%
  mutate (x4evratndclg_fixed = ifelse(x4evratndclg < 1, NA, x4evratndclg), 
          x4hs2psmos_fixed = ifelse(x4hs2psmos < 0, NA, x4hs2psmos),
          x4hs2psmos_fixed = ifelse(x4hs2psmos > 5 & x4hs2psmos < 12, 6, x4hs2psmos), #this line is not working/rendering in the results -- why not?
          x4hs2psmos_fixed = ifelse(x4hs2psmos >= 12, 12, x4hs2psmos))
df_q5
count(df_q5, x4hs2psmos_fixed)

The result I get is still:

> count(df_q5, x4hs2psmos_fixed)
# A tibble: 14 x 2
   x4hs2psmos_fixed     n
              <dbl> <int>
 1                0   342
 2                1   356
 3                2  3895
 4                3  5622
 5                4   629
 6                5    84
 7                6    42
 8                7   235
 9                8   198
10                9    55
11               10    37
12               11    39
13               12  1357
14               NA 10612

The issue is that the NAs are getting correctly labeled, the 12s and above are getting correctly labeled, but the values from 6 to 11 are not getting handled properly. I have tried all kinds of variations, including non-mutate commands, but somehow cannot get these values from 6 to 11 recoded (i.e. all assigned as a 6 for graphing purposes). Not sure what the limitation is in the mutate function?

I have enough of a view to answer the assignment question, but I need to know how to do this for a real dataset... any ideas? Thanks, Naomi

btskinner commented 4 years ago

Take another look at your code, specifically these lines in the mutate() function:

x4hs2psmos_fixed = ifelse(x4hs2psmos < 0, NA, x4hs2psmos),
x4hs2psmos_fixed = ifelse(x4hs2psmos > 5 & x4hs2psmos < 12, 6, x4hs2psmos),
x4hs2psmos_fixed = ifelse(x4hs2psmos >= 12, 12, x4hs2psmos)

See how you first create x4hs2psmos_fixed, but then continue to use x4hs2psmos in subsequent ifelse() statements? With each new line, you are effectively overwriting the work you just did. What you show via count(df_q5, x4hs2psmos_fixed) is the result of the final line of the mutate() statement and only that line. That is why the values of 12 and above are recoded while 6 through 11 still show their original values.

To build upon what you've done, use x4hs2psmos_fixed in place of x4hs2psmos after the first line. My guess is that that should fix your issue.
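As a rough sketch of that suggestion, the chain below reads from x4hs2psmos_fixed in every step after the first. It assumes dplyr is installed; df_toy and its values are a hypothetical stand-in for df_hs, not the course data.

```r
library(dplyr)

# Hypothetical toy data standing in for df_hs
df_toy <- data.frame(x4hs2psmos = c(-8, 0, 3, 6, 9, 11, 12, 20, NA))

df_toy <- df_toy %>%
  mutate(
    # step 1: negative codes become missing
    x4hs2psmos_fixed = ifelse(x4hs2psmos < 0, NA, x4hs2psmos),
    # step 2: collapse 6 through 11 to 6, otherwise keep the result of step 1
    x4hs2psmos_fixed = ifelse(x4hs2psmos_fixed > 5 & x4hs2psmos_fixed < 12,
                              6, x4hs2psmos_fixed),
    # step 3: collapse 12 and above to 12, otherwise keep the result of step 2
    x4hs2psmos_fixed = ifelse(x4hs2psmos_fixed >= 12, 12, x4hs2psmos_fixed)
  )

# Only 0, 3, 6, 12, and NA should remain in the toy data
count(df_toy, x4hs2psmos_fixed)
```

Because each step's fallback is the variable produced by the previous step, the recodes accumulate instead of being overwritten. dplyr's case_when() is another way to express the same set of conditions in a single call.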

btskinner commented 2 years ago

Closing since it's older