r-lib / urlchecker

Run CRAN URL checks from older versions of R
https://urlchecker.r-lib.org/
GNU General Public License v3.0

`Error in file(con, "r"): cannot open the connection` for 1 of 42 URLs #37

Open rempsyc opened 10 months ago

rempsyc commented 10 months ago

I experience the following error:

Warning in file(con, "r"): cannot open file 'C:\github\rempsyc/C:/github/rempsyc/vignettes/assumptions.Rmd': 
Invalid argument
Error in file(con, "r"): cannot open the connection

Reprex:

R.version
#>                _                                
#> platform       x86_64-w64-mingw32               
#> arch           x86_64                           
#> os             mingw32                          
#> crt            ucrt                             
#> system         x86_64, mingw32                  
#> status                                          
#> major          4                                
#> minor          3.0                              
#> year           2023                             
#> month          04                               
#> day            21                               
#> svn rev        84292                            
#> language       R                                
#> version.string R version 4.3.0 (2023-04-21 ucrt)
#> nickname       Already Tomorrow
packageVersion("urlchecker")
#> [1] '1.0.1'
urlchecker::url_check("C:/github/rempsyc")
#> fetching [  0 / 42 ] ... fetching [ 41 / 42 ]
#> Warning in file(con, "r"): cannot open file
#> 'C:\github\rempsyc/C:/github/rempsyc/vignettes/assumptions.Rmd': Invalid
#> argument
#> Error in file(con, "r"): cannot open the connection

Created on 2023-10-09 with reprex v2.0.2

It seems that for the 42nd URL/file, the package directory is being prepended to a file path that is already absolute. Any idea for a solution?
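
As an illustration of the symptom only (a sketch, not urlchecker's actual code; the variable names are made up), joining a directory with a path that is already absolute produces the same kind of doubled path as in the warning. The backslashes in the first half of the reported path suggest the root had also been normalized on Windows, but that part is a guess.

```r
# Illustration only: variable names are hypothetical, not urlchecker internals.
pkg_root <- "C:/github/rempsyc"
from     <- "C:/github/rempsyc/vignettes/assumptions.Rmd"  # already absolute

# file.path() concatenates its arguments with "/" and does not check
# whether a later component is itself an absolute path.
doubled <- file.path(pkg_root, from)
print(doubled)
```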

gaborcsardi commented 10 months ago

What do you get if you check the URLs with plain R CMD check?

rempsyc commented 10 months ago

With R CMD check, I get 0 errors, 0 messages, and 0 notes, same as on CRAN. I also see no warning about URLs.

rempsyc commented 10 months ago

On a different computer, also on Windows 10 but with an earlier R version and a different directory, where I have pulled the latest changes, I get the same result:

R.version
#>                _                                
#> platform       x86_64-w64-mingw32               
#> arch           x86_64                           
#> os             mingw32                          
#> crt            ucrt                             
#> system         x86_64, mingw32                  
#> status                                          
#> major          4                                
#> minor          2.0                              
#> year           2022                             
#> month          04                               
#> day            22                               
#> svn rev        82229                            
#> language       R                                
#> version.string R version 4.2.0 (2022-04-22 ucrt)
#> nickname       Vigorous Calisthenics
packageVersion("urlchecker")
#> [1] '1.0.1'
urlchecker::url_check("D:/github/rempsyc")
#> fetching [  0 / 42 ] ... fetching [ 41 / 42 ]
#> Warning in file(con, "r"): cannot open file
#> 'D:\github\rempsyc/D:/github/rempsyc/vignettes/assumptions.Rmd': Invalid
#> argument
#> Error in file(con, "r"): cannot open the connection

Created on 2023-10-10 with reprex v2.0.2

Perhaps there is something wrong with the vignette? Perhaps you can try to replicate the issue with the following?

usethis::create_from_github("rempsyc/rempsyc")
urlchecker::url_check()

rempsyc commented 10 months ago

Does urlchecker::url_check() check the README file too, or only vignettes? I had a failing URL that was caught by CRAN but not by R CMD check, so I am trying to find a way to catch those before I send them to CRAN.

gaborcsardi commented 10 months ago

urlchecker checks the same URLs as R CMD check, in fact they are using the same code.

gaborcsardi commented 10 months ago

R CMD check does not check the URLs by default, you need to turn these checks on.

rempsyc commented 10 months ago

I see, thanks. I usually just use Ctrl+Shift+E, so I don't specify any arguments. But looking at the documentation for devtools::check(), I don't see which argument is responsible for checking URLs. The ellipsis ... is passed to pkgbuild::build(), but nothing seems relevant there. Perhaps it is something to specify in args, but again the description does not mention this. I could not find how to specify it correctly from a Google search or by asking ChatGPT 4... Could you let me know how to "turn these checks on"? 😅 Perhaps I'm not using the right keywords...

Using R CMD check --help in the terminal, I can get options for what to specify for args, but nothing seems to specify URL checks?

gaborcsardi commented 10 months ago

This is how to check it with R CMD check from the command line:

_R_CHECK_CRAN_INCOMING_=true _R_CHECK_CRAN_INCOMING_REMOTE_=true R CMD check rempsyc

gaborcsardi commented 10 months ago

Btw. is this reproducible?

Do you also see it if you run urlchecker::url_check() from the package directory itself, without arguments?

rempsyc commented 10 months ago

Yes, it is reproducible. In my reprex, I had to specify the path for it to work, but when testing internally I don't use path (so I ran it from the package directory itself, without arguments) and I get the same result.

For running R CMD check from the command line, this is what I get:

C:\github\rempsyc>_R_CHECK_CRAN_INCOMING_=true _R_CHECK_CRAN_INCOMING_REMOTE_=true R CMD check rempsyc
'_R_CHECK_CRAN_INCOMING_' is not recognized as an internal or external command, operable program or batch file.
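
The `VAR=value command` prefix is POSIX shell syntax, which is why the Windows command prompt rejects it here. One possible workaround, sketched with base R only, is to set the two variables from an R session and launch R CMD check from there. `check_vars` and `run_check` are hypothetical names used for the illustration.

```r
# The two variables named earlier in the thread.
check_vars <- c(
  "_R_CHECK_CRAN_INCOMING_"        = "true",
  "_R_CHECK_CRAN_INCOMING_REMOTE_" = "true"
)

# Hypothetical helper: set the variables for this session, then run
# `R CMD check` on a package directory or tarball via system2().
run_check <- function(pkg, vars = check_vars) {
  do.call(Sys.setenv, as.list(vars))
  r_bin <- file.path(R.home("bin"), "R")
  system2(r_bin, c("CMD", "check", shQuote(pkg)))
}

# Only run when the package directory exists relative to the working directory.
if (dir.exists("rempsyc")) {
  run_check("rempsyc")
}
```

The variables stay set for the rest of the R session, so `Sys.unsetenv(names(check_vars))` can be used afterwards to clear them.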

gaborcsardi commented 10 months ago

Do you also see it if you run urlchecker::url_check() from the package directory itself, without arguments?

rempsyc commented 10 months ago

> Do you also see it if you run urlchecker::url_check() from the package directory itself, without arguments?

As I said in my previous answer,

> In my reprex, I had to specify the path for it to work, but when testing internally I don't use path (so I ran it from the package directory itself, without arguments) and I get the same result.

And as I said in a previous answer, I encourage you to try to replicate it yourself, here are the steps again:

usethis::create_from_github("rempsyc/rempsyc")
urlchecker::url_check()

Btw, even when running devtools::check(args = "--as-cran"), I get 0 errors, 0 warnings, 0 notes.

gaborcsardi commented 10 months ago

I tried, cannot reproduce it myself.

devtools::check(args = "--as-cran") does not check the URLs, you need to set those env vars. E.g. use the env_vars argument of devtools::check if you can't use the command line. Maybe also set remote = TRUE.

EDIT: actually, maybe remote = TRUE is enough.
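
A minimal sketch of the two options mentioned above, assuming devtools is installed and the working directory is the package root. The `remote` and `env_vars` arguments are the ones named in this comment; the vector name `incoming_vars` is made up, and how devtools merges `env_vars` with its own defaults is not something this sketch relies on.

```r
# The two variables from the command-line example, in the named-vector
# form that the `env_vars` argument expects.
incoming_vars <- c(
  "_R_CHECK_CRAN_INCOMING_"        = "true",
  "_R_CHECK_CRAN_INCOMING_REMOTE_" = "true"
)

# Guarded so the snippet is a no-op outside a package directory or
# when devtools is not installed.
if (requireNamespace("devtools", quietly = TRUE) && file.exists("DESCRIPTION")) {
  # Simplest form suggested in the thread:
  devtools::check(remote = TRUE)

  # Alternative, passing the variables explicitly:
  # devtools::check(env_vars = incoming_vars)
}
```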

rempsyc commented 10 months ago

Ok, thanks for the clarification! It worked with devtools::check(env_vars = c(NOT_CRAN = "false")) (then I get • NOT_CRAN : false). With that I still get 0 errors, 0 warnings, 0 notes, but then again I also get no problems on CRAN.

I thought this would be replicable since I get the same thing on two different Windows 10 computers... Are you on Windows 10 as well? Because I just literally deleted the whole repo, used usethis::create_from_github("rempsyc/rempsyc") to create it from scratch, and then ran urlchecker::url_check() in the package root directory without arguments, and it's still stuck on this error.

rempsyc commented 10 months ago

Not a big deal though, not sure if it's worth spending more time on this bug, I just wanted to report it, but feel free to close this issue if it's too much trouble...

gaborcsardi commented 10 months ago

NOT_CRAN has nothing to do with this, you need to set the env vars I mentioned above. Or even simpler, set remote = TRUE.

You can put a dummy bad URL into README.md to make sure that it is checking the URLs.
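
One way to do this without hand-editing, as a rough sketch: append a link that cannot resolve, run the check, then restore the file. `add_dummy_url` is a hypothetical helper, and the example works on a temporary file rather than a real README.md.

```r
# Hypothetical helper: append a link that cannot resolve (.invalid is a
# reserved TLD) and return the original lines so the file can be restored.
add_dummy_url <- function(readme,
                          url = "https://example.invalid/urlchecker-dummy") {
  original <- readLines(readme, warn = FALSE)
  writeLines(c(original, "", sprintf("[dummy link](%s)", url)), readme)
  invisible(original)
}

# Demonstrated on a temporary file rather than a real README.md.
readme <- tempfile(fileext = ".md")
writeLines(c("# mypkg", "", "Some text."), readme)

original <- add_dummy_url(readme)
print(tail(readLines(readme), 1))

# Restore the file once the check has been seen to flag the link.
writeLines(original, readme)
```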

rempsyc commented 10 months ago

Are you saying that devtools::check(env_vars = c(NOT_CRAN = "false")) is not the proper way to specify the env vars you mentioned above?

In any case, devtools::check(remote = TRUE) has successfully detected the dummy bad URL in README.md (thanks!), but yet again, no problem with the vignette.

hezibu commented 2 months ago

I came across this issue as well. It also happens for me with URLs in a vignette file. I direct everyone to the warning in the OP:

Warning in file(con, "r"): cannot open file 'C:\github\rempsyc/C:/github/rempsyc/vignettes/assumptions.Rmd': 

For some reason, the path in the res variable within the url_check function lists URLs from vignettes differently. In my case, among other URLs (ignore the 403 message):

URL: https://esajournals.onlinelibrary.wiley.com/doi/10.1890/07-1904.1
From: C:/path/to/package/vignettes/basic_usage.Rmd
Status: 403
Message: Forbidden

URL: https://esajournals.onlinelibrary.wiley.com/doi/abs/10.1890/03-3102
From: man/sfestuary.Rd
Status: 403
Message: Forbidden

So I believe this causes an error downstream, maybe due to normalizePath(res)?
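
If that is the cause, one possible guard is to prepend the package root only to relative `From` paths. This is a sketch under that assumption, not a patch to urlchecker; `is_absolute_path` and `resolve_from` are hypothetical helpers, and the paths are the placeholders from the output above.

```r
# Hypothetical helpers, not part of urlchecker.
is_absolute_path <- function(path) {
  # Unix root, Windows drive letter (C:/ or C:\), or UNC prefix (\\server).
  grepl("^(/|[A-Za-z]:[/\\\\]|\\\\\\\\)", path)
}

# Leave absolute paths alone; join relative ones onto the package root.
resolve_from <- function(from, pkg_root) {
  ifelse(is_absolute_path(from), from, file.path(pkg_root, from))
}

from <- c("C:/path/to/package/vignettes/basic_usage.Rmd", "man/sfestuary.Rd")
resolved <- resolve_from(from, "C:/path/to/package")
print(resolved)
```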