Open chmao99 opened 2 years ago
Hi @chmao99, it looks like the table uses the full-time salary.
The instructions page shows that the variable salary is created to reflect full-time pay:
salary <- d$salary / (d$FTE / 100)
Then the code for the table makes a call to salary (lowercase "s"):
t.salary <-
  d %>%
  filter( ! is.na( title ) & title != "" ) %>%
  group_by( title, gender ) %>%
  summarize( q25=quantile(salary,0.25),
             q50=quantile(salary,0.50),
             q75=quantile(salary,0.75),
             n=n() ) %>%
  ungroup() %>%
  mutate( p= round( n/sum(n), 2) )

pander( t.salary )
That is what I used in my code for the project.
Thanks for your quick reply! However, shouldn't "salary" in summarize() refer to "d$salary", since you are inside the pipeline on "d"? I added a salary.fulltime variable to "d". With the original salary I got exactly the same table, but with salary.fulltime I got a slightly different one.
Does your salary.fulltime variable match salary <- d$salary / (d$FTE / 100)?
Yes, this is my code.
if( max(d$FTE) == 100 ) { salary <- d$Salary / (d$FTE/100) }
if( max(d$FTE) == 1 ) { salary <- d$Salary / d$FTE }
d <- cbind(d, Salary.Fulltime = salary)
And in the function create_salary_table(), when I use the original salary data like this:
summarize( q25=quantile(Salary,0.25),
           q50=quantile(Salary,0.50),
           q75=quantile(Salary,0.75),
           n=n() ) %>%
I get exactly the same table as on your instructions page. I also tried "salary" as in your code, changing "Salary" to "salary" in the function. It returns the table below.
| title               | gender  | q25   | q50   | q75    | n   | p    |
|---------------------|---------|-------|-------|--------|-----|------|
| Full Professor      | male    | 57464 | 90561 | 117731 | 338 | 0.14 |
| Full Professor      | female  | 57464 | 90561 | 117731 | 141 | 0.06 |
| Full Professor      | uncoded | 57464 | 90561 | 117731 | 56  | 0.02 |
| Associate Professor | male    | 57464 | 90561 | 117731 | 229 | 0.09 |
| Associate Professor | female  | 57464 | 90561 | 117731 | 180 | 0.07 |
| Associate Professor | uncoded | 57464 | 90561 | 117731 | 52  | 0.02 |
| Assistant Professor | male    | 57464 | 90561 | 117731 | 147 | 0.06 |
| Assistant Professor | female  | 57464 | 90561 | 117731 | 141 | 0.06 |
| Assistant Professor | uncoded | 57464 | 90561 | 117731 | 66  | 0.03 |
| Teaching Faculty    | male    | 57464 | 90561 | 117731 | 319 | 0.13 |
| Teaching Faculty    | female  | 57464 | 90561 | 117731 | 378 | 0.15 |
| Teaching Faculty    | uncoded | 57464 | 90561 | 117731 | 57  | 0.02 |
| Researcher          | male    | 57464 | 90561 | 117731 | 169 | 0.07 |
| Researcher          | female  | 57464 | 90561 | 117731 | 114 | 0.05 |
| Researcher          | uncoded | 57464 | 90561 | 117731 | 72  | 0.03 |
To be honest, I do not fully understand your code. Is salary just an independent vector?
Hi @chmao99!
Yes, I just pulled the code from the instructions page. The line salary <- d$salary / (d$FTE / 100) just creates a free-standing vector; it does not add a column to d.
In your code, it should work as expected. You could also try this:
if( max(d$FTE) == 100 )
{ salary <- d$Salary / (d$FTE/100) }
if( max(d$FTE) == 1 )
{ salary <- d$Salary / d$FTE }
#d <- cbind(d, Salary.Fulltime = salary)
d$salary <- salary
Note how I did the last line differently.
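To illustrate why that last line matters, here is a minimal sketch on made-up toy data (the values and the objects before and after are hypothetical, not taken from the project dataset). It assumes dplyr is loaded, as in the table code. When d has no column named salary, summarize() falls back to the free-standing salary vector and uses all of its values in every group, which would explain identical quantiles in every row of a grouped table. Once salary is stored as a column of d, each group uses only its own rows.

```r
library(dplyr)

# Toy data, made up for illustration. The column is capitalised "Salary",
# as in the original dataset.
d <- data.frame(
  title  = c("A", "A", "B", "B"),
  Salary = c(100, 200, 300, 400),
  FTE    = c(100, 50, 100, 50)
)

# Free-standing vector in the environment, NOT a column of d.
salary <- d$Salary / (d$FTE / 100)   # 100, 400, 300, 800

# d has no column called "salary", so summarize() finds the vector above
# and uses all four values in every group.
before <- d %>%
  group_by(title) %>%
  summarize(q50 = median(salary))

# After storing the values as a column, each group sees only its own rows.
d$salary <- salary
after <- d %>%
  group_by(title) %>%
  summarize(q50 = median(salary))

print(before)  # same median in both groups
print(after)   # a different median per group
```

If the medians in before are identical across groups while those in after differ, the table code is picking up the free-standing vector rather than a column.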
Hope that helps!
Thanks a lot for your nice reply!
However, I still think the tables and graph in Part 1 of the instructions were generated from the original salary. In step 10, Prescott, Edward is ranked No. 5 with a salary of $340,159. I double-checked this against the original dataset, as shown below, and it is his original salary. Since his FTE is 78, he should be ranked No. 1 in the full-time salary dataset.
    which(d$Full.Name == "Prescott, Edward")
    [1] 8989
    d[8989,]
         Calendar.Year        Full.Name   Job.Description Department.Description
    8989          2020 Prescott, Edward Regents Professor          WPC Economics
              Salary FTE
    8989 $340,159.00  78
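As a quick check of the arithmetic, here is a small sketch that converts this one row to a full-time salary. It assumes the Salary column is stored as text in the form printed here and that FTE is in percent. The variable names salary_raw, salary_num, and salary_fulltime are only illustrative.

```r
# Values copied from the printed row.
salary_raw <- "$340,159.00"
fte        <- 78             # FTE in percent (assumption)

# Strip "$" and "," so the string can be converted to a number.
salary_num <- as.numeric(gsub("[$,]", "", salary_raw))

# Same conversion as on the instructions page: salary / (FTE / 100).
salary_fulltime <- salary_num / (fte / 100)

round(salary_fulltime)
```

Under these assumptions the full-time equivalent comes out at roughly $436,101, well above the original $340,159.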
In the final project Part I, step 6, we convert salary to full-time salary. However, I believe the table in step 8 was generated from the original salary data. Should we use the full-time salary for the remaining steps?