@bauer-alex The errors on CRAN are due to #118, which you fixed. But I wonder if there is something else we need to consider?
> 'Packages which use Internet resources should fail gracefully with an informative message if the resource is not available (and not give a check warning nor error).'
Ok... It can't really be due to my commit though; the change wasn't that severe and didn't touch any web scraping functionality. Did we already have an informative error message (or whatever is needed) for when the resource is not available (or when there is no internet connection)?
I could take a look at it on Monday at the earliest.
No, I meant that the error was caused by the changes at wahlrecht.de and will be fixed as soon as we push your commits. But I think the comment from CRAN goes beyond that specific error.
Based on feedback from the CompStat people, I think we need to:

a) Within the scraper functions, check whether a connection to wahlrecht.de (or other specific URLs) can be established and return an informative error message if it cannot.
b) Check whether the downloaded tables have the proper dimensions/properties before preprocessing and throw an informative error otherwise.
c) Construct a "switch" or something like that for tests/examples, so that only tests that do not depend on internet resources are run on CRAN. The scraper tests can still be run locally/on GitHub/Travis, but not on CRAN.
The `robustify_scrapers` branch contains some changes to resolve the issues:
> a) Within the scraper functions, check whether a connection to wahlrecht.de (or other specific URLs) can be established and return an informative error message if it cannot.
The function `try_readHTML` (I initially called it `try_htmlSource`, sorry for the confusion!) gives an informative error message when the website cannot be resolved. We could also check whether an internet connection is active in the first place, but a quick search didn't turn up a neat solution.
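For illustration, the core idea is roughly the following (just a sketch; the actual implementation in the branch may differ in the details):

```r
library(xml2)

# Sketch only: wrap the download in tryCatch() and turn any failure
# into an informative error instead of a raw scraping error.
try_readHTML <- function(url) {
  tryCatch(
    read_html(url),
    error = function(e) {
      stop("Could not read '", url, "': ", conditionMessage(e),
           "\nThe page may be unavailable or there may be no internet connection.",
           call. = FALSE)
    }
  )
}
```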
> b) Check whether the downloaded tables have the proper dimensions/properties before preprocessing and throw an informative error otherwise.
I think that's good practice in principle, but I don't think we need it to resolve the current CRAN issues. Accordingly, I did nothing in this regard so far.
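If we add such checks later, they could look roughly like this (hypothetical helper; the expected column names would need to match the real wahlrecht.de tables):

```r
# Hypothetical validation step, run right after scraping a table.
check_scraped_table <- function(tab, expected_cols) {
  if (!is.data.frame(tab) || nrow(tab) == 0) {
    stop("Downloaded table is empty; the page layout may have changed.",
         call. = FALSE)
  }
  missing_cols <- setdiff(expected_cols, colnames(tab))
  if (length(missing_cols) > 0) {
    stop("Downloaded table is missing expected columns: ",
         paste(missing_cols, collapse = ", "),
         call. = FALSE)
  }
  invisible(tab)
}
```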
> c) Construct a "switch" or something like that for tests/examples, so that only tests that do not depend on internet resources are run on CRAN. The scraper tests can still be run locally/on GitHub/Travis, but not on CRAN.
The tests that depend on a web connection now contain `testthat::skip_on_cran()` calls. This causes all subsequent tests in the current `test_that()` call to be skipped on CRAN.
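The pattern looks like this (the scraper call is only an example; the actual test content differs):

```r
library(testthat)

test_that("scraping wahlrecht.de works", {
  # On CRAN everything below this line is skipped; locally and on
  # GitHub/Travis the test runs as usual.
  skip_on_cran()
  surveys <- scrape_wahlrecht()  # example call to one of our scrapers
  expect_true(nrow(surveys) > 0)
})
```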
@adibender It would be great if you could take over the work from here. I'm quite busy over the next few days.
Perfect, thx! Will do.
https://cran.r-project.org/web/checks/check_results_coalitions.html