tbrown122387 / gradeR

helps grade R script assignment submissions!
Other
4 stars 4 forks source link

tidyverse breaks things #10

Closed caseywdunn closed 3 years ago

caseywdunn commented 4 years ago

Hi- Everything is working great for my assignments, except that if I have library(tidyverse) or library(ggplot) in my code all tests of variables set in the assignment show as failing on gradescope. Dummy tests, eg expect_equal( 1 , 1 ) show as passing on gradescope. I poked around I bit, and the best hunch I have is that there is some namespace pollution where something from ggplot is clobbering testthat.

I tried stepping through the code and at https://github.com/tbrown122387/gradeR/blob/48166a241c5384cde956317dc157c091488f4e26/R/grade.r#L210 got the following:

tryCatch(source(submission_file, testEnv),  error = function(c) c, warning = function(c) c ,message = function(c) c)

<packageStartupMessage in packageStartupMessage(text_col(...)): ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──

It is hard to provide an example failure, but it would be nice to know if you can break a working example just by adding library(tidyverse). Out of curiosity I opened a new R session and loaded testthat and tidyverse and there are a few key testthat funtsions that dplyr functions clobber.

Desktop % R

R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin17.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(testthat)
> library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
✔ ggplot2 3.3.2     ✔ purrr   0.3.4
✔ tibble  3.0.3     ✔ dplyr   1.0.0
✔ tidyr   1.1.0     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter()  masks stats::filter()
✖ purrr::is_null() masks testthat::is_null()
✖ dplyr::lag()     masks stats::lag()
✖ dplyr::matches() masks tidyr::matches(), testthat::matches()
tbrown122387 commented 4 years ago

Hey @caseywdunn I can take a closer look, but my first guess is that you didn't install those packages via the setup bash script. If you're using this example, as a template, then you have to add a line to this file to tell the Docker thingy to install the third party R libraries. Something like

Rscript -e "install.packages('ggplot')" 

Regarding namespaces, that's probably each student's responsibility--if they load in libraries, then they should know that doing that will mask some of the names potentially. I would guess that their code should run the same on both their machine and Gradescope's machines/servers.

But for both of these reasons, that's why I usually ask them not to use 3rd party libraries unless they're "approved."

brevans commented 4 years ago

Thank you for your work on this package. It really is super useful!

Just to add to what @caseywdunn is saying, the testthat::is_null and testthat::matches functions tidyverse is colliding with are marked as deprecated. Any straightforward way to be more restrictive about what gradeR is pulling in from testthat?

Given that the Tidyverse is approved for use in the class (and installed in the container image we're using for Gradescope), our temporary fix is to simply force the testthat functions to have precedence -

library(gradeR)
library(tidyverse)

matches <- testthat::matches
is_null <- testthat::is_null

Then we just need to notify students that these two functions are used by the grading system, they should use the purr:: or dplyr:: namespaces explicitly if needed.

Might move to using conflicted in future.

tbrown122387 commented 4 years ago

@caseywdunn @brevans

"Any straightforward way to be more restrictive about what gradeR is pulling in from testthat?"

Yes, but I believe I'm already doing it in the restrictive way.

In line 11 of the DESCRIPTION file, I ask that testthat be imported, but that doesn't mean it's automatically attached whenever someone calls library(gradeR).

In the source for both functions, I always use the scope resolution operator (e.g. testthat::test_file(yadayadayada) every time I call something in testthat. As a result, it shouldn't attach all the functions (e.g. library(testthat)).

On the other hand, it could be that my calls are calling other deprecated testthat functions...I'm not sure, but I can check. It would be nice to have a minimal reproducible example, though. @caseywdunn if the error is in tryCatch(source(submission_file, testEnv), error = ... that means that gradeR can't source the student submission. Are we sure the student submission is free of errors? They could be masking their own functions.

tbrown122387 commented 4 years ago

@caseywdunn @brevans maybe try removing library(test_that) from the file with all of the tests in it

tbrown122387 commented 4 years ago

I took the basic example, added

Rscript -e "install.packages('tidyr')" 

to setup.sh, and added the following line to the student submission file:

library(tidyr)

and it indeed broke. And the problem doesn't happen with other packages. Well, for me there is no error. It's just that all the tests that used to pass now fail. I'll let you know what I find.

caseywdunn commented 4 years ago

Glad to hear you were able to reproduce it. That is the fail mode that we saw. The fix posted above at https://github.com/tbrown122387/gradeR/issues/10#issuecomment-685027860 does resolve the problem, but there may be a more general way to do it.

Thanks again for such a great tool - I have my first class today using it but some students have already submitted their assignments and it is working well. I cam at this thinking it would help grading load given that we have larger enrollment than anticipated, but the real value is turning out to be the incremental feedback to students and starting them off with tools that get them in a test-driven-design mindset. It is basically a way to get them start with continuous integration without telling them that's what it is. I also learned a lot about environments etc going through your code, it is very nicely done.

tbrown122387 commented 4 years ago

Thanks, yeah, this is weird, so I'm going to keep looking into it.

In other totally unrelated news, I just pushed to CRAN some extra features for calcGradesForGradescope: adding an option to suppress warnings emitted from student submission scripts, and more options on how test outcomes show up on the student end on Gradescope (before, hidden tests stayed hidden even after the deadline passed).

Just wanted to mention this, because this is my first semester using this for a class as well. Both of these things were pretty heavily requested.

tbrown122387 commented 4 years ago

closed by 3772ffbe52126bbf43c16c0c42a6d77eed931922 student submissions are now sourced in a separate process, keeping the search path clean

tbrown122387 commented 4 years ago

I think I have to open this back up.

  1. Certain functions that students call (e.g. reshape, is, ...) don't work when the script runs on Gradescope, but works on their machines.
  2. I just threw a library(tidyverse) call into a student submission, and that failed all the tests

Even though separate processes give each student their own separate, clean search path, I'm still calling testthat::test_file before I spawn these processes. The body of this function calls library(testthat) in the first line. The authors of that package are unwilling to look into that.

Two possible fixes maybe:

Also, whether or not things break seems to have to do with whether or not it on Gradescope servers or not. I'm not sure what is causing this at the moment.

tbrown122387 commented 3 years ago

Fixed . callr::r() needed the extra argument package = TRUE. dplyr was being installed on the Gradescope docker image, but callr::r() was not passing the package to the subprocess running the student submission.

Either wait for changes to take effect on CRAN or replace Rscript -e "install.packages('gradeR')" with Rscript -e "install.packages('devtools'); devtools::install_github('tbrown122387/gradeR')" in setup.sh