Watts-College / cpp-527-fall-2021

A course shell for CPP 527 Foundations of Data Science II
https://watts-college.github.io/cpp-527-fall-2021/

Step 5: Convert Salary to Numeric #87

Open WSKQ23 opened 2 years ago

WSKQ23 commented 2 years ago

Hello @lecy, I am having trouble cleaning the salary data that I use to produce my salary summaries. I used the code below to clean it:

remove_dollar <- gsub( "\\$", " ", d$Salary )
dollar_removed <- gsub(",", "", remove_dollar)
salary <- head(dollar_removed %>% as.numeric())
salary

But while producing my report I realized that the values for q25, q50, and q75 are the same in every row. I think the problem is in how I cleaned the salary that I use in this function:

create_salary_table <- function (dat3)
{
  t.salary <- 
    dat3 %>% 
    filter( ! is.na( title ) & title != "") %>% 
    group_by( title, gender ) %>% 
    summarize( q25=quantile(salary,0.25),
               q50=quantile(salary,0.50),
               q75=quantile(salary,0.75),
               n=n() ) %>% 
    ungroup() %>% 
    mutate( p= round( n/sum(n), 2) )

  return(t.salary)
}

Any help would be appreciated, sir.

lecy commented 2 years ago

Are you ever assigning salary back to d?

remove_dollar <- gsub( "\\$", " ", d$Salary )
dollar_removed <- gsub(",", "", remove_dollar)
salary <- head(dollar_removed %>% as.numeric())
salary
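To see why the quantiles come out identical, here is a minimal sketch that repeats the same cleaning steps on a toy data frame. The titles and salary values are made up for illustration, and `as.numeric()` is called directly instead of through `%>%` so the snippet runs in base R.

```r
# Toy data (hypothetical values) standing in for the real data frame d
d <- data.frame(
  title  = rep(c("Professor", "Lecturer"), each = 4),
  Salary = c("$120,000", "$110,500", "$98,000", "$105,250",
             "$60,000", "$58,750", "$62,300", "$61,000"),
  stringsAsFactors = FALSE
)

# Same cleaning steps as in the question
remove_dollar  <- gsub("\\$", " ", d$Salary)
dollar_removed <- gsub(",", "", remove_dollar)
salary <- head(as.numeric(dollar_removed))

length(salary)            # head() keeps only the first six values by default
nrow(d)                   # the data frame still has all eight rows
"salary" %in% names(d)    # FALSE: d never received a numeric salary column

# A quantile of this free-standing vector is a single number,
# so it cannot vary by title or gender
q50_global <- quantile(salary, 0.50)
```

If the data frame passed to `create_salary_table()` has no `salary` column, `summarize()` most likely picks up this free-standing `salary` vector from the workspace and reports the same quantiles for every group.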

Try something like:

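unstring_salary <- function(x)
{
  x <- gsub( "\\$", "", x )   # drop the dollar sign (replace with nothing, not a space)
  x <- gsub( ",", "", x )     # drop thousands separators
  x <- as.numeric( x )        # convert the cleaned strings to numbers
  return( x )
}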

d$salary <- unstring_salary( d$Salary )
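A quick way to confirm the fix is to check the new column and compute a grouped quantile. This is a minimal sketch on made-up data: the toy values and the one-pass regular expression `[$,]` are assumptions for illustration, and `tapply()` stands in for the dplyr pipeline so the snippet has no package dependencies.

```r
# Compact variant of the cleaning helper (removes "$" and "," in one pass)
unstring_salary <- function(x) {
  x <- gsub("[$,]", "", x)
  as.numeric(x)
}

# Toy data (hypothetical values) standing in for the real data frame d
d <- data.frame(
  title  = rep(c("Professor", "Lecturer"), each = 4),
  Salary = c("$120,000", "$110,500", "$98,000", "$105,250",
             "$60,000", "$58,750", "$62,300", "$61,000"),
  stringsAsFactors = FALSE
)

# Assign the cleaned values back to the data frame, all rows included
d$salary <- unstring_salary(d$Salary)

class(d$salary)         # should be "numeric"
sum(is.na(d$salary))    # number of values that failed to convert

# Median salary by title; the groups now get different values
q50 <- tapply(d$salary, d$title, quantile, probs = 0.50)
q50
```

Once `salary` is a column of the data frame, `quantile(salary, ...)` inside `summarize()` is evaluated within each `title` and `gender` group, so q25, q50, and q75 should no longer be identical across rows.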