Watts-College / paf-514-template

https://watts-college.github.io/paf-514-template/

Part 3: Step 3 - Batch Error #61

Closed emhall3 closed 1 month ago

emhall3 commented 7 months ago

Hi Professor @lecy,

I am trying to render the reports for each year. When I render the 2020 report, everything works correctly and the resulting HTML file looks accurate. However, when I try to render the 2019 report, I get the following error:

Quitting from lines 94-112 [unnamed-chunk-2] (salary-report.rmd)
Error in `plot.window()`:
! need finite 'xlim' values
Backtrace:
 1. global build_graph(salary.summary, unit = i)
 2. graphics::plot.window(...)
      at utils.R:195:2

Line 94 is the start of my loop in the salary-report.rmd file. Here is the code for my for loop:

for( i in academic.units )
{

  d2 <- filter( d, Department.Description == i )
  salary.summary <- create_salary_table( d2 )  # create the salary summary by rank & gender table
  t.rank <- top_five_salaries( d2 )            # create the top 5 salaries table
  build_graph( salary.summary, unit=i )        # build the graph

  # print the tables:  
  cat( paste('<h3>',"TOP FIVE SALARIES",'</h3>' ) )
  cat( t.rank %>% knitr::kable(format="html") ) 

  cat( paste('<h3>', "PAY RANGE BY RANK AND GENDER",'</h3>' ) )
  cat( salary.summary %>% knitr::kable(format="html") ) 

  cat( '<br><hr><br>' )

}

If I manually change the URL to the 2019 data in the salary-report.rmd file and test the for loop from there, the first few graphs and tables appear, but then I receive the following error:

Error in plot.window(xlim = c(40000 - 10000, xmax), ylim = c(0, ymax +  : 
  need finite 'xlim' values

Please advise. Let me know if you need additional information about my code. Thank you!!

lecy commented 7 months ago

This error hints at the problem, but it might not be obvious:

Error in plot.window(xlim = c(40000 - 10000, xmax), ylim = c(0, ymax +  : 
  need finite 'xlim' values

It occurs when build_graph() receives an empty data frame (a data frame with no rows), so the plot limits computed from it, such as xmax, are not finite. The empty data frame is generated when the filter matches no rows:

d2 <- filter( d, Department.Description == i )

This type of error is common in batch reporting because data representation (for example, FTE=100 vs FTE=1.00) and data completeness vary over time.
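To see why an empty data frame leads to a non-finite axis limit, here is a minimal sketch. It assumes toy data with a hypothetical Salary column and uses base R subsetting in place of filter() so that it runs without loading any packages; the actual calculation of xmax inside build_graph() may differ.

```r
# Toy data standing in for the salary data (hypothetical departments and values).
d <- data.frame(
  Department.Description = c( "Econ", "Econ", "History" ),
  Salary = c( 90000, 120000, 80000 )
)

# Select a department that does not appear in this year's data.
d2 <- d[ d$Department.Description == "Music", ]
nrow( d2 )          # zero rows: the data frame is empty

# max() of an empty vector returns -Inf (with a warning),
# which is not a usable axis limit for a plot.
xmax <- suppressWarnings( max( d2$Salary ) )
is.finite( xmax )   # FALSE
```

Any plotting code that passes a value like this to xlim will stop with the "need finite 'xlim' values" error.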

In this case you can see that several departments are missing in 2019:

URL <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1948400967&format=csv"
d <- read.csv(URL)
source( "utils.R" ) # load academic.units 
departments <- factor( d$Department.Description, levels=academic.units )
table( departments ) |> sort() |> knitr::kable()
|departments                    | Freq|
|:------------------------------|----:|
|Ldrshp and Integrative Studies |    0|
|MDT Music                      |    0|
|SOS Faculty & Researchers      |    0|
|WPC Supply Chain Management    |   39|
|College of Health Solutions NT |   43|
|WPC Information Systems        |   43|
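If you only want the names of the missing departments rather than the full frequency table, setdiff() is one option. This is a sketch with toy data: the academic.units vector below is a hypothetical stand-in for the one defined in utils.R.

```r
# Hypothetical stand-in for the academic.units vector loaded from utils.R.
academic.units <- c( "Econ", "History", "Music" )

# Toy data for a single year in which one unit has no rows.
d <- data.frame( Department.Description = c( "Econ", "Econ", "History" ) )

# Units that are expected but do not appear in this year's data.
missing.units <- setdiff( academic.units, unique( d$Department.Description ) )
missing.units
```

Running the same check on each year's data before rendering shows which reports will hit the empty-data case.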

Recall the week-02 simulation question that asks how long it takes, on average, to go bankrupt when you start with $10 and win or lose a dollar on each turn. In some cases a winning streak produced a cash holding so large that, for all practical purposes, the player would never go broke, so the simulation could run for days without producing an answer. We added a condition that ends the game with a break statement if the player is still not broke after 10,000 plays:

cash <- 10
loop.count <- 1
while( cash > 0 )
{ 
  cash <- cash + sample( c(-1,1), 1 )
  loop.count <- loop.count + 1
  if( loop.count > 10000 ){ break }
}

Similarly, here you will want to make your batch processing of reports robust by adding sanity checks that prevent your program from crashing when it encounters an edge case. Instead of break, which would end the loop completely once the condition is met, use a next statement, which skips the rest of the code for the current department and returns to the top of the loop with the next department in academic.units. In other words, the loop simply skips departments that have no faculty in a given year.

for( i in academic.units )
{
  d2 <- filter( d, Department.Description == i )
  if( nrow(d2) == 0 ){ next }
  ... 
}
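Here is a self-contained version of the same pattern that you can run to watch next in action. It is only a sketch: the data and unit names are made up, base R subsetting stands in for filter(), and the skipped vector is a hypothetical addition that records which departments were passed over.

```r
# Hypothetical units and toy data; "Music" has no rows this year.
academic.units <- c( "Econ", "Music", "History" )
d <- data.frame(
  Department.Description = c( "Econ", "Econ", "History" ),
  Salary = c( 90000, 120000, 80000 )
)

processed <- character( 0 )   # departments that reached the reporting code
skipped   <- character( 0 )   # departments with no rows

for( i in academic.units )
{
  d2 <- d[ d$Department.Description == i, ]

  # Sanity check: nothing to report, so move on to the next department.
  if( nrow(d2) == 0 )
  {
    skipped <- c( skipped, i )
    next
  }

  # The tables and graph would be built here.
  processed <- c( processed, i )
}

processed
skipped
```

Unlike break, the loop continues past "Music" and still processes "History". Keeping a record of the skipped departments also makes it easy to note in each report which units had no data that year.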