Open voznyuky opened 2 years ago
Does d contain title?
It does not. I need to check my code to see where I messed up.
If I run code_titles(d) then it does show title but not if I just run d
If I do this: d <- code_titles(d) then it works, but my total on step 7 is: [1] 2156
It's because of this step:
filter( title != "" )
Which is correct. You will drop cases that do not have titles because they are university administration or staff and do not belong in the faculty salary report.
Ah! Thank you for the help!
You do not want to drop data in steps 2-3 during the merge because you are throwing out good data at that point just because a name was not in the names database or because the first name parser did not code the name correctly.
It's fine to filter data later on to eliminate observations that are not part of the study.
The main distinction is you are making the choice to filter data in the latter case, whereas you are likely dropping observations without realizing it in the former case.
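To make the distinction concrete, here is a minimal base R sketch. The data frames, column names, and values are made up for illustration and are not the lab data. It assumes the names database is joined on a first-name column: a default merge() silently drops the unmatched row, while all.x = TRUE keeps it so that any later filtering is a deliberate choice.

```r
# Toy salary data (hypothetical values): BO is faculty, CY is staff with no title
salaries <- data.frame(
  first.name = c( "ANNA", "BO", "CY" ),
  title      = c( "Full Professor", "Assistant Professor", "" ),
  stringsAsFactors = FALSE )

# Toy names database that happens to be missing BO
names.db <- data.frame(
  first.name = c( "ANNA", "CY" ),
  gender     = c( "F", "M" ),
  stringsAsFactors = FALSE )

# Default merge keeps only matched rows: BO disappears without warning
inner <- merge( salaries, names.db, by = "first.name" )

# all.x = TRUE keeps every salary row; BO simply gets gender NA
left <- merge( salaries, names.db, by = "first.name", all.x = TRUE )

# Later, an explicit filter removes cases that are outside the study
study <- left[ left$title != "", ]

nrow( inner )
nrow( left )
nrow( study )
```

Comparing nrow() before and after each step is a cheap way to notice rows that were dropped unintentionally.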
Hello, I also got stuck at this step. I followed the same steps and got the same original error:
d2 <-
d %>%
filter( title != "" & ! is.na(title) ) %>%
filter( Department.Description %in% academic.units ) %>%
arrange( Department.Description, title )
Error: Problem with `filter()` input `..1`.
Input `..1` is `title != "" & !is.na(title)`.
x comparison (2) is possible only for atomic and list types
I understand title is not a column in d and tried using the code_titles function but am getting another error:
d2 <-
code_titles(d) %>%
filter( title != "" & ! is.na(title) ) %>%
filter( Department.Description %in% academic.units ) %>%
arrange( Department.Description, title )
Error in UseMethod("filter") :
no applicable method for 'filter' applied to an object of class "factor"
I suspect I have been staring at my screen too long and am missing an obvious mistake... I would appreciate a push in the right direction!
@bbmoren2 what does your code_titles() function look like?
You might find this thread helpful: https://github.com/Watts-College/cpp-527-fall-2021/issues/68#issuecomment-939148158
I suspect you are doing the same thing that Asia was - sending a data frame to the function and returning the title only. Which is fine, but you then need to structure your data flow as follows:
d$title <- code_titles(d)
d2 <-
d %>%
filter( title != "" & ! is.na(title) ) %>%
filter( Department.Description %in% academic.units ) %>%
arrange( Department.Description, title )
It's a little more elegant to return the full data frame:
code_titles <- function( d )
{
...
d$title <- factor( title )
return( d )
}
d <- code_titles( d )
Then this would work (pipes are sending data frames forward at each step, not individual factors):
d2 <-
d %>%
code_titles() %>%
filter( title != "" & ! is.na(title) ) %>%
filter( Department.Description %in% academic.units ) %>%
arrange( Department.Description, title )
Or this:
d <- code_titles( d )
d2 <-
d %>%
filter( title != "" & ! is.na(title) ) %>%
filter( Department.Description %in% academic.units ) %>%
arrange( Department.Description, title )
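For anyone who wants to try this pattern outside the lab, here is a self-contained sketch. The toy data, the Job.Description column, and the grepl() rules inside code_titles() are assumptions made for illustration, not the lab's actual coding rules. The point is only that the function returns the whole data frame, so every step in the pipe receives a data frame.

```r
library( dplyr )

# Toy data; values and the Job.Description column are hypothetical
d <- data.frame(
  Job.Description        = c( "Professor", "Admin Specialist", "Asst Professor" ),
  Department.Description = c( "Economics", "Economics", "History" ),
  stringsAsFactors = FALSE )

academic.units <- c( "Economics", "History" )

# Hypothetical stand-in for code_titles(): adds title and returns the data frame
code_titles <- function( d )
{
  title <- rep( "", nrow( d ) )
  title[ grepl( "^Professor", d$Job.Description ) ]      <- "Full Professor"
  title[ grepl( "^Asst Professor", d$Job.Description ) ] <- "Assistant Professor"
  d$title <- factor( title )
  return( d )
}

d2 <-
  d %>%
  code_titles() %>%
  filter( title != "" & ! is.na(title) ) %>%
  filter( Department.Description %in% academic.units ) %>%
  arrange( Department.Description, title )

nrow( d2 )
```

If code_titles() returned only factor( title ), the first filter() would receive a factor instead of a data frame, which is the situation behind the "no applicable method" error quoted earlier in the thread.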
Yup! That is exactly what I was doing.
Thank you for the push!!
@lecy I updated all my vectors to line up and flow correctly (d). All the steps work correctly, but when running step 7:
d2 <-
d %>%
filter( title != "" ) %>%
filter( Department.Description %in% academic.units ) %>%
arrange( Department.Description, title )
nrow( d2 )
I get this message:
Error: Problem with `filter()` input `..1`.
ℹ Input `..1` is `title != ""`.
x comparison (2) is possible only for atomic and list types
Run `rlang::last_error()` to see where the error occurred.
I did some research, and some people suggested it might be a problem with the dplyr package, but dplyr is definitely in my library.
Any idea why this is not running correctly?
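A likely explanation, consistent with the earlier replies in this thread, is that d still has no title column at step 7. When filter() cannot find a column called title, the name can resolve to the base graphics function title() instead, and comparing a function to a string raises this "atomic and list types" error. The sketch below uses a made-up data frame to reproduce that situation in base R and shows a quick check to run before step 7. Treat it as a diagnostic idea, not a confirmed diagnosis of this particular script.

```r
# Toy data frame WITHOUT a title column (values are hypothetical)
d <- data.frame(
  Department.Description = c( "Economics", "History" ),
  stringsAsFactors = FALSE )

# Check whether the column exists before filtering on it
has.title <- "title" %in% names( d )
has.title

# Comparing the graphics function title() to a string fails;
# tryCatch() captures the error message instead of stopping the script
msg <- tryCatch(
  graphics::title != "",
  error = function( e ) conditionMessage( e ) )
msg

# Simple guard to place ahead of step 7
if( ! has.title )
{
  message( "d has no title column yet; try d <- code_titles( d ) first" )
}
```

If the check returns FALSE, the fix is the same as above: assign the result of code_titles() back to d (or to d$title, depending on what the function returns) before filtering.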