Watts-College / cpp-527-fall-2021

A course shell for CPP 527 Foundations of Data Science II
https://watts-college.github.io/cpp-527-fall-2021/

Final Project Step 9 Error: $ operator is invalid for atomic vectors #59

Open mtwelker opened 2 years ago

mtwelker commented 2 years ago

I was able to complete steps 1-8 successfully, producing the same output that is shown in the instructions. For Step 9, I copied the code you provided and pasted it into a code chunk that starts with {r, fig.height=7, fig.width=10}. The only edit I made to the provided code was to change create_table to create_salary_table, since that's the name of my function:

d2 %>%  
  filter( Department.Description == "Psychology" ) %>% 
  create_salary_table( ) %>%  
  build_graph( unit="Psychology" )

When I knit, I get the following error:

Error: $ operator is invalid for atomic vectors
Execution halted

I looked up the error and learned that it happens if you try to access an element of a single vector with "$". For example:

> x <- c(1, 3, 7, 6, 2)
> names(x) <- c('a', 'b', 'c', 'd', 'e')
> x
a b c d e 
1 3 7 6 2 
> x$e
Error in x$e : $ operator is invalid for atomic vectors

In the example above, x is not a dataframe, so I can't access the '2' with x$e.

I have carefully looked through the code you provided for step 9, and I don't see the '$' symbol being used this way. Am I missing something? Any other ideas for why I'm getting this error?

lecy commented 2 years ago

It means you probably created an empty data frame at a subset or filter step. Check the data frame before the step, then walk through the pipe one step at a time to make sure you are not dropping all observations.
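Walking through the pipe one step at a time could look like the sketch below. The toy `d2` here is an assumption standing in for the course data, and the misspelled department name is only there to show how a filter can silently drop every row.

```r
library(dplyr)

# Toy stand-in for d2; the real data and its values are assumptions here.
d2 <- data.frame(
  Department.Description = c("Psychology", "Psychology", "Economics"),
  salary = c(90000, 110000, 100000)
)

# Step 1: check the data frame before the filter step
nrow(d2)

# Step 2: run the filter on its own and confirm that rows survive
d.psych <- d2 %>% filter(Department.Description == "Psychology")
nrow(d.psych)

# A misspelled value returns zero rows without raising an error
d.empty <- d2 %>% filter(Department.Description == "Psycology")
nrow(d.empty)
```

Checking `nrow()` or `head()` after each step shows where the observations disappear.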

mtwelker commented 2 years ago

Thank you! That helped me identify my problem. I had used pander() inside a function, so the result was no longer a dataframe.
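A minimal sketch of why a formatting call inside a function leads to this error. It uses base R's `capture.output()` as a stand-in for a formatting step, so it does not depend on what `pander()` itself returns; `table_as_text` and `table_as_df` are hypothetical names used only for illustration.

```r
# Toy salary table; the column names follow the ones used in this thread.
t.salary <- data.frame(
  titles = c("Full Professor", "Full Professor"),
  gender = c("male", "female"),
  q50    = c(120000, 115000),
  n      = c(10, 8)
)

# Hypothetical function that turns its result into text before returning it.
table_as_text <- function(d) {
  capture.output(print(d))   # a character vector, no longer a data frame
}

# Hypothetical function that returns the data frame itself.
table_as_df <- function(d) {
  d
}

bad  <- table_as_text(t.salary)
good <- table_as_df(t.salary)

is.data.frame(bad)
is.data.frame(good)

# Using `$` on the character vector reproduces the error from the first post
msg <- tryCatch(bad$q50, error = function(e) conditionMessage(e))
msg
```

Keeping the formatting call outside the function, applied only when the table is printed, leaves the returned object usable in later steps.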

ndavis4904 commented 2 years ago

I am having a problem with the same step. I made similar changes, but I get the error message "Error in build_graph(., iq_range, unit = "Psychology") : unused argument (iq_range)". iq_range is the name I used in the last step for the dataframe created with create_salary_table(). Is this because of an empty dataframe as well? Or something different entirely?

lecy commented 2 years ago

@ndavis4904 can you share the code that produced that error? It’s hard to tell without seeing what you are running.

ndavis4904 commented 2 years ago

add_position <- function( t, position, y, xmax, scale.f=8 ) {

  t.original <- t
  t <- filter( t.original, titles==position )
  dot.size <- 2 + scale.f*sum(t$p)
  offset.n <- 1 + sum(t$p)*2

  male.median <- NA
  n.male <- NA
  t <- filter( t.original, titles==position & gender == "male" )
  if( nrow(t) > 0 ) {
    male.median <- t$q50
    n.male <- t$n
  }

  female.median <- NA
  n.female <- NA
  t <- filter( t.original, titles==position & gender == "female" )
  if( nrow(t) > 0 ) {
    female.median <- t$q50
    n.female <- t$n
  }

  # dumbbell plots
  segments( x0=female.median, x1=male.median, y0=y, col=gray(0.3,0.5), lwd=7 )
  points( male.median, y, col=adjustcolor( "darkblue", alpha.f = 0.5), pch=19, cex=dot.size )
  points( female.median, y, col=adjustcolor( "firebrick", alpha.f = 0.5), pch=19, cex=dot.size )

  pos.f <- 2
  pos.m <- 4
  if( ! ( is.na(female.median) | is.na(male.median) ) ) {
    pos.f <- ifelse( female.median > male.median, 4, 2 )
    pos.m <- ifelse( female.median > male.median, 2, 4 )
  }

  # add salaries to right and left
  text( female.median, y, paste0("$",round(female.median/1000,0),"k"),
        col=adjustcolor( "firebrick", alpha.f = 0.7), cex=1.2, pos=pos.f, offset=offset.n )
  text( male.median, y, paste0("$",round(male.median/1000,0),"k"),
        col=adjustcolor( "darkblue", alpha.f = 0.7), cex=1.2, pos=pos.m, offset=offset.n )

  # add faculty counts
  n.female <- ifelse( is.na(n.female), 0, n.female )
  n.female <- ifelse( nchar(n.female)==1, paste0( " ", n.female), n.female )
  n.male <- ifelse( is.na(n.male), 0, n.male )
  n.male <- ifelse( nchar(n.male)==1, paste0( " ", n.male), n.male )
  text( xmax-0.1*xmax, y+0.14, paste0( "f = ", n.female), col="gray50", cex=1.1, pos=4 )
  text( xmax-0.1*xmax, y-0.14, paste0( "m = ", n.male), col="gray50", cex=1.1, pos=4 )

  axis( side=2, at=y, labels=position, las=2, tick=F, cex.axis=1.5, col.axis="gray50" )
}

build_graph <- function( t.salary, unit ) {

  unique.titles <- unique( t.salary$titles )
  ymax <- length(unique.titles)
  xmax <- round( max(t.salary$q50), -3 ) + 50000
  color.key.pos <- 40000 + ( xmax - 40000 ) / 2
  color.key.inc <- ( xmax - 40000 ) / 10

  t.mf <- filter( t.salary, gender %in% c("male","female") )
  N <- sum( t.mf$n )

  par( mar=c(6,15,4.1,0) )
  plot.new()
  plot.window( xlim=c(40000-10000,xmax), ylim=c(0,ymax+1) )

  abline( v=seq(40000,xmax-40000,20000), lwd=1.5, lty=2, col=gray(0.5,0.5) )
  axis( side=1, at=seq(40000,xmax-40000,20000),
        labels=paste0("$",seq(40,(xmax-40000)/1000,20),"k"),
        cex.axis=1.1, col.axis="gray40", tick=FALSE )

  y <- ymax

  if( "Full Professor" %in% unique.titles ) {
    add_position( t.salary, position="Full Professor", y, xmax )
    y <- y-1
  }
  if( "Associate Professor" %in% unique.titles ) {
    add_position( t.salary, position="Associate Professor", y, xmax )
    y <- y-1
  }
  if( "Assistant Professor" %in% unique.titles ) {
    add_position( t.salary, position="Assistant Professor", y, xmax )
    y <- y-1
  }
  if( "Teaching Faculty" %in% unique.titles ) {
    add_position( t.salary, position="Teaching Faculty", y, xmax )
    y <- y-1
  }
  if( "Researcher" %in% unique.titles ) {
    add_position( t.salary, position="Researcher", y, xmax )
    y <- y-1
  }

  text( color.key.pos + 3*color.key.inc, 0, "MALE",
        col=adjustcolor( "darkblue", alpha.f = 0.7), cex=1.2 )
  text( color.key.pos + 1.8*color.key.inc, 0, "FEMALE", col="firebrick", cex=1.2 )
  text( xmax - 0.1*xmax, 0, paste0("N = ",N), col="gray40", cex=1.2, pos=4 )

  title( main="Median Salary by Rank and Gender", cex.main=1.5, col.main="gray30" )
  title( xlab=unit, col.lab="gray50", cex.lab=1.5, line=5 )
  title( xlab="dot size represents proportion of faculty at that rank",
         col.lab="gray50", cex.lab=0.9 )

  return(NULL)
}

d2 %>%
  filter( Department.Description == "Psychology" ) %>%
  create_salary_table( ) %>%
  build_graph(iq_range, unit="Psychology" )

iq_range is what I used instead of your t.salary from the previous step. Then the error is: Error in build_graph(., iq_range, unit = "Psychology") : unused argument (iq_range) calls: ... withCallingHandlers -> withvisible -> eval -> eval -> %>%

lecy commented 2 years ago

The build graph function is expecting a specific type of table:

build_graph <- function( t.salary, unit )

If you use a different type of data frame or object the function won’t be able to find the info it needs to build the graph.

What is iq_range here?

ndavis4904 commented 2 years ago

That's the dataframe that we created with the create_salary_table() function.

lecy commented 2 years ago

If you just remove iq_range from this call, does it work?

d2 %>%
filter( Department.Description == "Psychology" ) %>%
create_salary_table( ) %>%
build_graph(iq_range, unit="Psychology" )

lecy commented 2 years ago

> That's the dataframe that we created with the create_salary_table() function.

Since you are piping results the data frame will be passed forward from the previous step and is the implied first argument:

d %>%
  create_salary_table( ) %>%
  build_graph( [INVISIBLE DF], unit="Psychology" )
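
The implied first argument can be seen in a small sketch. `describe_table` is a hypothetical stand-in that only shares the `( t.salary, unit )` signature of build_graph; it is not part of the course code.

```r
library(dplyr)   # provides the %>% pipe

# Hypothetical stand-in with the same two-argument signature as build_graph
describe_table <- function(t.salary, unit) {
  paste(unit, "table with", nrow(t.salary), "rows")
}

df <- data.frame(q50 = c(100000, 90000))

# The piped data frame fills the first argument, t.salary
ok <- df %>% describe_table(unit = "Psychology")
ok

# Naming the table again adds a third argument the function does not accept
extra <- df
msg <- tryCatch(
  df %>% describe_table(extra, unit = "Psychology"),
  error = function(e) conditionMessage(e)
)
msg
```

The second call fails with an "unused argument" error because the function receives three arguments: the piped data frame, `extra`, and `unit`.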

ndavis4904 commented 2 years ago

It creates a graph, but there's no data in the graph.

lecy commented 2 years ago

It’s probably an issue with the table step then:

d %>%
  create_salary_table( )

ndavis4904 commented 2 years ago

It gives me a message that says: summarise() has grouped output by 'titles'. You can override using the .groups argument. Would that be the cause? In which case I would need to look in my function.
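For context, that dplyr message is informational and not an error: it reports that the summarised result is still grouped by `titles`. A minimal sketch with toy data follows; the grouping columns and the `q50` and `n` summaries are assumptions modeled on the names in this thread, not the actual Step 8 code.

```r
library(dplyr)

# Toy data; values are made up for illustration only.
d <- data.frame(
  titles = c("Full Professor", "Full Professor",
             "Assistant Professor", "Assistant Professor"),
  gender = c("male", "female", "male", "female"),
  salary = c(120000, 115000, 80000, 82000)
)

# Grouping by two variables and summarising emits the message,
# but still returns a usable table
t.grouped <- d %>%
  group_by(titles, gender) %>%
  summarise(q50 = median(salary), n = n())

# Setting .groups explicitly silences the message
t.flat <- d %>%
  group_by(titles, gender) %>%
  summarise(q50 = median(salary), n = n(), .groups = "drop")

is.data.frame(t.grouped)
identical(t.grouped$q50, t.flat$q50)
```

Both versions carry the same summary values, so the message alone would not explain an empty graph.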

lecy commented 2 years ago

Did you use this code?

https://watts-college.github.io/cpp-527-fall-2021/labs/final-project-instructions.html#step-8-summarize-salaries

ndavis4904 commented 2 years ago

Yes I did. That's the code that I put into the create_salary_table() function.

lecy commented 2 years ago

Including pander()?

ndavis4904 commented 2 years ago

Originally it was outside of the function. I just changed it, and it fixed the sorting message in step 8, but I still have the same problem in step 9.

lecy commented 2 years ago

You don’t want to include it inside the function. It converts the table into text, so you would not be able to use it in subsequent steps.

See if you can get that table working before trying Step 9

ndavis4904 commented 2 years ago

Is the 'title' variable supposed to be the list of code_titles() from an earlier step? It is still saying that it is grouping by titles, but other than that it gives a table with the quartiles.

lecy commented 2 years ago

Correct. It would also help if you shared the table that the function returns.

If the table works the graphing function should work. I would need to see output to assess though.
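One way to inspect the table before graphing is a small check like the sketch below. `check_salary_table` is a hypothetical helper, and the column list is taken from the `$` references in the add_position and build_graph code posted earlier in this thread (`titles`, `gender`, `q50`, `n`, `p`).

```r
# Columns that the posted graphing code reads from the table
required <- c("titles", "gender", "q50", "n", "p")

# Hypothetical helper: stops with a message if the table cannot be graphed
check_salary_table <- function(t.salary) {
  if (!is.data.frame(t.salary)) {
    stop("not a data frame: ", class(t.salary)[1])
  }
  missing.cols <- setdiff(required, names(t.salary))
  if (length(missing.cols) > 0) {
    stop("missing columns: ", paste(missing.cols, collapse = ", "))
  }
  if (nrow(t.salary) == 0) {
    stop("table has no rows")
  }
  invisible(t.salary)   # returned unchanged so the check can sit in a pipe
}

# Toy table with made-up values that passes the check
t.ok <- data.frame(
  titles = "Full Professor", gender = "male",
  q50 = 120000, n = 10, p = 0.5
)
checked <- check_salary_table(t.ok)

# A table missing the p column fails with a descriptive message
msg <- tryCatch(
  check_salary_table(t.ok[, c("titles", "gender", "q50", "n")]),
  error = function(e) conditionMessage(e)
)
msg
```

Because the helper returns its input invisibly, it could be placed between create_salary_table( ) and build_graph( ) in the pipe.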

ndavis4904 commented 2 years ago

I set up a time to meet with you during office hours tomorrow.