Closed jasminacosta closed 2 months ago
This is a new issue caused by changes to GitHub. For some reason, downloading the RMD file from GitHub strips away the YAML header:
---
title: "Batch Report Demo"
output:
  html_document:
    theme: readable
    highlight: zenburn
    toc: true
params:
  url:
    value: x
---
After you add that back to your RMD doc it should work. It should look like this:
https://github.com/Watts-College/paf-514-template/blob/main/labs/batch-demo/salary-report.rmd
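A quick way to confirm that the header survived the download is to inspect the first lines of the file. The sketch below uses only base R. `has_yaml_header()` is a hypothetical helper written for this illustration (it is not part of rmarkdown), and it only looks for the opening and closing `---` delimiters; it does not validate the YAML itself.

```r
# Hypothetical helper: TRUE if the file starts with '---' and has a closing '---'
has_yaml_header <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  if (length(lines) < 2 || lines[1] != "---") return(FALSE)
  any(lines[-1] == "---")
}

# Demo files written to a temporary location (made up for this example)
with_header <- tempfile(fileext = ".rmd")
writeLines(c("---",
             'title: "Batch Report Demo"',
             "params:",
             "  url:",
             "    value: x",
             "---",
             "",
             "Body text."), with_header)

stripped <- tempfile(fileext = ".rmd")
writeLines("Body text only.", stripped)

has_yaml_header(with_header)
has_yaml_header(stripped)
```

On your own copy you would call `has_yaml_header("salary-report.rmd")` from the folder that contains the file.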
@lecy I added the yaml header but it is still not running the RMD doc.
I ran it today and it worked so I know the code is ok.
Try it again. Might be a bad connection.
You can test the data load step like this:
url.2020 <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
d <- read.csv( url.2020 )
I tried the code you provided and it ran properly.
Should I replace this part of the code with the new code provided?
# LOAD DATA
URL <- params$url #replace this part of the code
d <- read.csv( URL )
Replace with the following code:
url.2020 <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
d <- read.csv( url.2020 ) #use this part of code instead?
For testing purposes that's fine. If you replace the current code in the RMD doc, though, you are hard-coding a single year of data into your RMD template, and it can no longer be used as part of a batch process to create reports across multiple years.
You don't actually knit the RMD file in this case. It is a template that you execute through the batch file:
Before you go on to next steps, download these three files to your local working directory. Open batch.R in a regular R console and try running the reports to create the HTML files.
# utils.R and salary-report.rmd should be in the working directory
## 2020 REPORT
url.2020 <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
rmarkdown::render( input='salary-report.rmd',
                   output_file = "ASU-2020-Salary-Report.HTML",
                   params = list( url = url.2020 ) )
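To illustrate why the template keeps `params$url` instead of a hard-coded URL, here is a minimal sketch of a batch loop. Only the 2020 URL comes from this thread; `make_jobs()` and the `urls` vector are hypothetical names for this example, and you would add entries for other years yourself. The loop only renders when salary-report.rmd is present in the working directory.

```r
# Named vector of data sources: name = year, value = CSV export URL.
# Only the 2020 entry is taken from this thread.
urls <- c(
  "2020" = "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
)

# Hypothetical helper: one row per report, with its output file name
make_jobs <- function(urls) {
  data.frame(
    year        = names(urls),
    url         = unname(urls),
    output_file = paste0("ASU-", names(urls), "-Salary-Report.HTML"),
    stringsAsFactors = FALSE
  )
}

jobs <- make_jobs(urls)

# Render one report per row, passing the URL in through params
if (file.exists("salary-report.rmd") &&
    requireNamespace("rmarkdown", quietly = TRUE)) {
  for (i in seq_len(nrow(jobs))) {
    rmarkdown::render(input       = "salary-report.rmd",
                      output_file = jobs$output_file[i],
                      params      = list(url = jobs$url[i]))
  }
}
```

The same template then serves every year, because each call to `render()` supplies a different `url` value.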
It will make more sense after you finish Lab-06 and use the Resume template.
@lecy So is it fine for me to start working on it, despite these errors popping up?
You shouldn't be getting errors if you are running it correctly.
My question is whether you are executing it as described in the assignment:
"Before you go on to next steps, download these three files to your local working directory. Open batch.R in a regular R console and try running the reports to create the HTML files."
# utils.R and salary-report.rmd should be in the working directory
## 2020 REPORT
url.2020 <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
rmarkdown::render( input='salary-report.rmd',
                   output_file = "ASU-2020-Salary-Report.HTML",
                   params = list( url = url.2020 ) )
Are you running this rmarkdown::render() command from an R console with utils.R and salary-report.rmd saved in your current working directory?
@lecy I used getwd() to see what the directory is for each file and these are the results.
For utils.R the directory is:
> getwd()
[1] "/Users/jasminacosta/montyhall"
For salary-report.rmd it is:
> getwd()
[1] "/Users/jasminacosta/Downloads"
Then for batch.R it is:
> getwd()
[1] "/Users/jasminacosta/montyhall"
However, I saw that all my files were in my downloads folder.
I am really confused.
If all of your files are in the downloads folder, you should be able to knit the report as follows:
setwd( "/Users/jasminacosta/Downloads" )
## 2020 REPORT
url.2020 <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
rmarkdown::render( input='salary-report.rmd',
                   output_file = "ASU-2020-Salary-Report.HTML",
                   params = list( url = url.2020 ) )
Try that and let me know if it works.
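Note that getwd() reports the working directory of the current R session, not the location of any particular file, so it can differ between an RStudio window and a base R console. A small base R check like the sketch below can confirm that the three files really sit in the folder you are about to render from. `check_project_files()` is a hypothetical helper for this example, and the demo runs against a temporary folder rather than your Downloads folder.

```r
# Hypothetical helper: which of the expected files exist in 'dir'?
check_project_files <- function(dir = getwd(),
                                files = c("batch.R", "utils.R", "salary-report.rmd")) {
  found <- file.exists(file.path(dir, files))
  names(found) <- files
  found
}

# Demo: a temporary folder that contains only two of the three files
demo_dir <- file.path(tempdir(), "batch-demo-check")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("batch.R", "utils.R")))

status <- check_project_files(demo_dir)
print(status)
```

In your case you would call `check_project_files("/Users/jasminacosta/Downloads")` and expect all three values to be TRUE.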
@lecy It seemed to work for batch.R and utils.R, but I am still having issues with salary-report.rmd.
You can't always knit a template RMD directly in RStudio. You would knit using the command in batch.R:
setwd( "/Users/jasminacosta/Downloads" )
## 2020 REPORT
url.2020 <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
rmarkdown::render( input='salary-report.rmd',
                   output_file = "ASU-2020-Salary-Report.HTML",
                   params = list( url = url.2020 ) )
When you run this do you create an HTML file called "ASU-2020-Salary-Report.HTML"?
@lecy Yes, I believe so.
Check your file list to confirm:
dir()
Or else open your downloads folder and look for the file.
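You can also open files from R with shell() on Windows, or with browseURL() on any platform (shell() does not exist in R on macOS or Linux):
# assuming you are in the right wd
shell( "ASU-2020-Salary-Report.HTML" )       # Windows only
browseURL( "ASU-2020-Salary-Report.HTML" )   # also works on macOS and Linux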
@lecy This is the output I am getting, and I am not sure why I am getting it instead of the directory listing from dir().
"Selection:" means R is waiting for your input while it tries to install a package. You need to install genderdata before you can run the file.
library( gender )
gender("sara")
The genderdata package needs to be installed.
Install the genderdata package?
1: Yes
2: No
Selection:
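One way to avoid hitting an install prompt in the middle of a render is to check for the required packages first. The sketch below uses base R only. `missing_packages()` is a hypothetical helper, and the package list is an assumption based on the library() calls and messages that appear in this thread.

```r
# Hypothetical helper: return the packages from 'pkgs' that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages this thread suggests the template needs (assumed list)
needed <- c("rmarkdown", "knitr", "dplyr", "pander", "gender", "genderdata")

to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  message("Install before rendering: ", paste(to_install, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

If genderdata shows up in the list, install it from an interactive session (for example by answering 1 at the prompt shown above) and then run the render again.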
This is also a good example of why to share code instead of screenshots. In the base R console your code would look like this:
setwd("ds2")
url.2020 <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
rmarkdown::render( input='salary-report.rmd',
                   output_file = "ASU-2020-Salary-Report.HTML",
                   params = list( url = url.2020 ) )
processing file: salary-report.rmd
| | 0%
|........... | 14%
|...................... | 29% [setup]
|................................. | 43%
|............................................ | 57% [unnamed-chunk-1]Selection:
That would give a clue about where the script is getting stuck and why.
Your screenshot is not reproducible: it doesn't show what code was run or which behavior it produced.
[unnamed-chunk-1] # where the problem occurs
Selection: # which message you are receiving
@lecy Understood.
I installed genderdata and tried to run the code in the R console.
setwd( "/Users/jasminacosta/Downloads" )
## 2020 REPORT
url.2020 <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
rmarkdown::render( input='salary-report.rmd',
                   output_file = "ASU-2020-Salary-Report.HTML",
                   params = list( url = url.2020 ) )
And I am getting this error:
processing file: salary-report.rmd
|.................................................. | 86% [unnamed-chunk-2]
Quitting from lines 80-100 [unnamed-chunk-2] (salary-report.rmd)
Error in `group_by()`:
! Must group by variables found in `.data`.
Column `title` is not found.
Column `gender` is not found.
Backtrace:
1. ... %>% mutate(p = round(n / sum(n), 2))
6. dplyr:::group_by.data.frame(., title, gender)
Try closing RStudio and executing from a base R console, please.
@lecy I closed R studio and executed the code from the R console and I am still getting the same error.
Your data is not loading correctly but I'm not sure if it is a problem with your package setup, a problem sourcing the utils.R file because of directory changes, or maybe a connection issue with Google Sheets.
Can you please try the following and tell me what you get:
library( dplyr )
library( pander )
library( knitr )
library( gender )
source( "utils.R" )
URL <- "https://raw.githubusercontent.com/Watts-College/paf-514-template/main/labs/batch-demo/asu-salaries-2020.csv"
d <- read.csv( URL )
d$first.name <- get_first_name( d$Full.Name )
d <- add_gender( d )
d <- add_titles( d )
d <- fix_salary( d )
d <-
  d %>%
  filter( title != "" & ! is.na(title) ) %>%
  filter( Department.Description %in% academic.units )
head(d) %>% pander::pander()
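Since the group_by() error says that `title` and `gender` were not found, a quick column check after the data steps can show whether they were actually added. This is a minimal base R sketch. `missing_columns()` is a hypothetical helper, and the toy data frame is made up for the example.

```r
# Hypothetical helper: which required columns are absent from the data frame?
missing_columns <- function(d, required) {
  setdiff(required, names(d))
}

# Toy data frame that lacks the derived columns (made up for this example)
toy <- data.frame(Full.Name = "Baker, Aaron", salary = 107160)

absent <- missing_columns(toy, c("title", "gender"))
if (length(absent) > 0) {
  message("Missing columns: ", paste(absent, collapse = ", "))
}
```

Running `missing_columns( d, c("title", "gender") )` on your own data right before the group_by() step should return an empty result if add_gender() and add_titles() did their work.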
@lecy I got the following when I ran it through the R console.
----------------------------------------------------------------
first.name Calendar.Year Full.Name Job.Description
------------ --------------- ----------------- -----------------
Aaron 2020 Baker, Aaron Professor
Aaron 2020 Fellmeth, Aaron Professor
Aaron 2020 Redman, Aaron Instructor
Aaron 2020 Crippen, Aaron Instructor
Aaron 2020 Bae, Aaron Lecturer
Aaron 2020 Romans, Aaron Instructor
----------------------------------------------------------------
Table: Table continues below
------------------------------------------------------------------------------
Department.Description Salary FTE gender title
------------------------------ ------------- ----- -------- ------------------
English $107,160.00 100 male Full Professor
College Of Law $164,755.00 100 male Full Professor
SOS Faculty & Researchers $50,000.00 100 male Teaching Faculty
English $52,600.00 100 male Teaching Faculty
School of Social Transform $52,750.00 100 male Teaching Faculty
Social & Behavioral Sciences $47,979.00 80 male Teaching Faculty
------------------------------------------------------------------------------
Table: Table continues below
--------
salary
--------
107160
164755
50000
52600
52750
59974
--------
Ok now add:
t.salary <-
  d %>%
  group_by( title, gender ) %>%
  summarize( q25=quantile(salary,0.25),
             q50=quantile(salary,0.50),
             q75=quantile(salary,0.75),
             n=n() ) %>%
  ungroup() %>%
  mutate( p= round( n/sum(n), 2) )
t.salary %>% build_graph( unit="ALL ASU")
@lecy I ran that also into the R console and received these results:
> t.salary <-
+ d %>%
+ group_by( title, gender ) %>%
+ summarize( q25=quantile(salary,0.25),
+ q50=quantile(salary,0.50),
+ q75=quantile(salary,0.75),
+ n=n() ) %>%
+ ungroup() %>%
+ mutate( p= round( n/sum(n), 2) )
`summarise()` has grouped output by 'title'. You can override using the `.groups`
argument.
>
> t.salary %>% build_graph( unit="ALL ASU")
NULL
Along with a graph of salaries.
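The `summarise()` message in the output above is informational, not an error: dplyr is reporting that the result is still grouped by `title`, which the following ungroup() call removes. As a sketch, the same result can be obtained without the message by passing `.groups = "drop"`. The toy data below is made up for illustration and assumes dplyr is installed.

```r
library(dplyr)

# Made-up toy data with the same column names as the salary data
toy <- data.frame(
  title  = c("Full Professor", "Full Professor", "Teaching Faculty", "Teaching Faculty"),
  gender = c("male", "female", "male", "male"),
  salary = c(100000, 110000, 50000, 52000)
)

t.toy <-
  toy %>%
  group_by(title, gender) %>%
  summarize(q50 = quantile(salary, 0.50),
            n   = n(),
            .groups = "drop") %>%      # drop grouping here, so no message and no ungroup()
  mutate(p = round(n / sum(n), 2))

print(t.toy)
```

Either form works for the report; the `.groups` argument only makes the intent explicit.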
Last thing to test, then:
URL <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
d <- read.csv( URL )
If that works, then my guess is that the report will render if you shut down R, open it again, and try once more in a fresh console. It could have been a ghost in the machine from other files you had open.
Otherwise I am stumped, because it works on my computer and all of the steps above work fine. If you still get an error, please email me your RMD and utils.R files.
(I see that you are still running your files from RStudio, not a base R console. I don't think that would be the problem, though. Let's see if it works.)
@lecy I ran that as well in the R console and it worked fine.
I shut down R and ran the code to see if the salary-report.rmd would run properly, but I am still getting the same error.
Let me send over my files.
Check your email. You added extra code to the salary-report.rmd file that was overwriting the prior data steps. If you use the original version it should work fine.
@lecy Oh, I see!
I removed those lines and when I ran the code:
setwd( "/Users/jasminacosta/Downloads" )
## 2020 REPORT
url.2020 <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
rmarkdown::render( input='salary-report.rmd',
                   output_file = "ASU-2020-Salary-Report.HTML",
                   params = list( url = url.2020 ) )
I got the output
processing file: salary-report.rmd
output file: salary-report.knit.md
/usr/local/bin/pandoc +RTS -K512m -RTS salary-report.knit.md --to html4 --from markdown+autolink_bare_uris+tex_math_single_backslash --output ASU-2020-Salary-Report.HTML --lua-filter /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library/rmarkdown/rmarkdown/lua/pagebreak.lua --lua-filter /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library/rmarkdown/rmarkdown/lua/latex-div.lua --embed-resources --standalone --variable bs3=TRUE --section-divs --table-of-contents --toc-depth 3 --template /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library/rmarkdown/rmd/h/default.html --highlight-style zenburn --variable theme=readable --mathjax --variable 'mathjax-url=https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --include-in-header /var/folders/tw/fh99fdy16cg1d8ppmstjmp5w0000gn/T//RtmpI0n6uY/rmarkdown-str121c72b99f561.html
Output created: ASU-2020-Salary-Report.HTML
Then I went to knit salary-report.rmd directly and got the error below.
I am not sure if that is what is supposed to happen.
processing file: salary-report.rmd
Quitting from lines 30-66 [unnamed-chunk-1] (salary-report.rmd)
Error in `file()`:
! cannot open the connection
Backtrace:
1. utils::read.csv(URL)
2. utils::read.table(...)
3. base::file(file, "rt")
Execution halted
Output created: ASU-2020-Salary-Report.HTML
Success! You can preview the HTML file with:
browseURL( "ASU-2020-Salary-Report.HTML" )   # or shell() on Windows
This is how you are knitting your report template:
rmarkdown::render( input='salary-report.rmd', ... )
When you are creating batches of dozens or hundreds of reports, you would not want a separate RMD file for each report that you have to open in RStudio and knit manually.
The steps above are to test your environment - ensure the packages are installed and working correctly, that you can find your files in your project directory, that the rmarkdown package can call pandoc ok, etc. This step was designed to identify project configuration errors before you start implementing the steps on your own.
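As one example of such an environment check, the sketch below asks rmarkdown whether it can find pandoc. It assumes the rmarkdown package is installed and falls back to NA if it is not.

```r
# NA means "could not check" (rmarkdown itself is not installed)
pandoc_ok <- NA
if (requireNamespace("rmarkdown", quietly = TRUE)) {
  pandoc_ok <- rmarkdown::pandoc_available()
}

if (isTRUE(pandoc_ok)) {
  message("pandoc found, version ", as.character(rmarkdown::pandoc_version()))
} else {
  message("pandoc not found (or rmarkdown is not installed); ",
          "render() will not be able to build the HTML file")
}
```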
@lecy Yay!
Got it, that makes sense.
In that case, can I begin the project now, or does the following code also need to run without errors first?
rmarkdown::render( input='salary-report.rmd' )
Because when I ran that code to see what would happen I got the error:
processing file: salary-report.rmd
|................................. | 57% [unnamed-chunk-1]
Quitting from lines 30-66 [unnamed-chunk-1] (salary-report.rmd)
Error in `file()`:
! cannot open the connection
Backtrace:
1. utils::read.csv(URL)
2. utils::read.table(...)
3. base::file(file, "rt")
I just want to make sure I am setting up my files correctly before I start adding code to them!
The dots were for the omitted parts:
## 2020 REPORT
url.2020 <- "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
rmarkdown::render( input='salary-report.rmd',
                   output_file = "ASU-2020-Salary-Report.HTML",
                   params = list( url = url.2020 ) )
You should be all set.
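The most likely reason the bare render() call fails is that, without a `params` argument, the template falls back to the YAML default, where `url` has the placeholder value `x`. read.csv() then tries to open a file called "x" and cannot. The sketch below reproduces that situation in base R. `is_placeholder_url()` is a hypothetical helper for this illustration and is not part of the template.

```r
# Hypothetical helper: TRUE when the url is missing or still the YAML placeholder "x"
is_placeholder_url <- function(url) {
  is.null(url) || identical(url, "x")
}

# What the template sees when knit directly (YAML default)
params_default <- list(url = "x")

# What it sees when rendered through batch.R with params = list( url = url.2020 )
params_batch <- list(
  url = "https://docs.google.com/spreadsheets/d/1RoiO9bfpbXowprWdZrgtYXG9_WuK3NFemwlvDGdym7E/export?gid=1335284952&format=csv"
)

is_placeholder_url(params_default$url)
is_placeholder_url(params_batch$url)

# Reading the placeholder fails unless a file named "x" happens to exist
# in the working directory; the error message is captured instead of stopping.
attempt <- tryCatch(
  suppressWarnings(read.csv(params_default$url)),
  error = function(e) conditionMessage(e)
)
print(attempt)
```

So the error from the bare call does not indicate a problem with your setup, as long as the full command with `params` produces the HTML file.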
@lecy Understood.
Thank you for all your help!
Hi @lecy!
Before starting the final project steps I am having issues loading the files to make sure they work correctly.
Note: I did download Pandoc and update the rmarkdown package.