r-lib / rcmdcheck

Run R CMD check from R and collect the results
https://rcmdcheck.r-lib.org

Error in Github workflow for old versions of R: `Calls: <Anonymous> -> new_rcmdcheck -> win2unix -> gsub` #140

Closed vincentarelbundock closed 3 years ago

vincentarelbundock commented 3 years ago

Hi,

Thanks for the package and all your work for the community. It is much appreciated!

I'm suddenly getting this error in GitHub workflow tests for old versions of R, but I couldn't find anything about it on SO or Google. The error below is from my countrycode repository. It seems to occur after the testthat suite has run its course: https://github.com/vincentarelbundock/countrycode

We made some minor changes to documentation formatting recently, but I couldn't find the culprit, even when I selectively rolled back some of those changes.

Feel free to close this issue if this is not relevant to you.

Running ‘test-all.R’ [15s/15s]
 OK
* DONE

Status: OK

Error in gsub("\r\n", "\n", str, fixed = TRUE) : 
  input string 1 is invalid in this locale
Calls: <Anonymous> -> new_rcmdcheck -> win2unix -> gsub
Execution halted
Error: Process completed with exit code 1.

vincentarelbundock commented 3 years ago

Closing because the repo magically sorted itself out. Sorry for the useless notifications!

gaborcsardi commented 3 years ago

I think you had some problematic non-ASCII character in the manual. This is still a bug in rcmdcheck, though.
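One way to look for such characters is to scan the text for strings that are not valid UTF-8. The sketch below is only an illustration: `find_invalid_utf8()` is a hypothetical helper, not part of rcmdcheck, and it assumes the text has already been read into a character vector (for example with `readLines()`).

```r
# Hypothetical helper: return the indices of elements that are not
# valid UTF-8. validUTF8() is part of base R (3.3.0 and later).
find_invalid_utf8 <- function(lines) {
  which(!validUTF8(lines))
}

# Build sample strings from raw bytes so the example does not depend
# on the encoding of this file.
valid_line   <- rawToChar(as.raw(c(0x63, 0xc3, 0xa9)))  # "c" followed by a two-byte UTF-8 sequence
invalid_line <- rawToChar(as.raw(c(0x63, 0xe9)))        # "c" followed by a stray 0xe9 byte

find_invalid_utf8(c("plain ascii", invalid_line, valid_line))
```

A non-empty result points at the elements worth inspecting; an empty result suggests the invalid bytes are introduced somewhere else.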

vincentarelbundock commented 3 years ago

This is quite plausible: countrycode is a package to convert country names in many languages, so there's plenty of unicode everywhere. Thanks for looking into it.

wmay commented 3 years ago

In the meantime, I was able to get it working by adding

Sys.setlocale('LC_ALL','C')

before running rcmdcheck. ASRCsoft/atmoschem.process@60a8ac6f280221e1ed9517d213ac68c8349b1d2e
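If changing the locale for the whole session is undesirable, the same workaround can be scoped to a single call. This is a minimal sketch, not something taken from the linked commit: `with_c_locale()` is a hypothetical helper, and it saves and restores the locale categories one by one because the string returned by `Sys.getlocale("LC_ALL")` is not guaranteed to be accepted by `Sys.setlocale()`.

```r
# Hypothetical helper: evaluate `code` with the locale set to "C",
# then restore the previous settings, even if `code` fails.
with_c_locale <- function(code) {
  # Categories that Sys.setlocale("LC_ALL", ...) changes in R.
  categories <- c("LC_COLLATE", "LC_CTYPE", "LC_MONETARY", "LC_TIME")
  old <- vapply(categories, Sys.getlocale, character(1))
  on.exit(
    for (category in categories) Sys.setlocale(category, old[[category]]),
    add = TRUE
  )
  Sys.setlocale("LC_ALL", "C")
  # `code` is a lazily evaluated argument, so it only runs here,
  # after the locale has been switched.
  force(code)
}
```

In a CI script this could be used as `with_c_locale(rcmdcheck::rcmdcheck(args = "--as-cran"))`, assuming rcmdcheck is installed; the arguments shown are only an example.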

jonkeane commented 3 years ago

I'm still trying to track down exactly what is causing this, but we ran into this gsub/locale issue in our CI as well. Hopefully this info will be helpful to you.

Between these two runs nothing in the manual changed. The only difference I'm seeing is that the failing run has the following in the R CMD check output:

* checking pragmas in C/C++ headers and code ... NOTE
File which contains pragma(s) suppressing diagnostics:
  ‘src/dataset.cpp’

When we fixed the underlying issue so that this check no longer produces a NOTE, the gsub error reported here went away (the check above turned to OK):

* checking pragmas in C/C++ headers and code ... OK

I have the full output below, but that seems to be the only real difference in the output. I'm surprised because there are other examples of that are parsed just fine.

Full logs/links to runs

fails (link: https://github.com/ursacomputing/crossbow/runs/2353713390?check_suite_focus=true#step:7:285):

── R CMD build ─────────────────────────────────────────────────────────────────
* checking for file ‘.../DESCRIPTION’ ... OK
* preparing ‘arrow’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* running ‘cleanup’
* installing the package to build vignettes
* creating vignettes ... OK
* cleaning src
* running ‘cleanup’
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘arrow_3.0.0.9000.tar.gz’

── R CMD check ─────────────────────────────────────────────────────────────────
* using log directory ‘/arrow/r/check/arrow.Rcheck’
* using R version 4.0.5 Patched (2021-03-31 r80164)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8
* using options ‘--run-donttest --as-cran’
* checking for file ‘arrow/DESCRIPTION’ ... OK
* this is package ‘arrow’ version ‘3.0.0.9000’
* package encoding: UTF-8
* checking CRAN incoming feasibility ... NOTE
Maintainer: ‘Neal Richardson <neal@ursalabs.org>’

Version contains large components (3.0.0.9000)
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking serialization versions ... OK
* checking whether package ‘arrow’ can be installed ... OK
* checking installed package size ... NOTE
  installed size is 38.4Mb
  sub-directories of 1Mb or more:
    libs  34.2Mb
    R      3.6Mb
* checking package directory ... OK
* checking for future file timestamps ... OK
* checking ‘build’ directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... NOTE
  Unescaped left brace in regex is passed through in regex; marked by <-- HERE in m/
        # the exec should either be "eval"ed or a new statement
        (^\s*|\beval\s*[\'\"]|(;|&&|\b(then|else))\s*)

        # eat anything between the exec and $0
        exec\s*.+\s*

        # optionally quoted executable name (via $0)
        .?\$0.?\s*

        # optional "end of options" indicator
        (--\s*)?

        # Match expressions of the form '${1+$@}', '${1:+"$@"',
        # '"${1+$@', "$@", etc where the quotes (before the dollar
        # sign(s)) are optional and the second (or only if the $1
        # clause is omitted) parameter may be $@ or $*. 
        # 
        # Finally the whole subexpression may be omitted for scripts
        # which do not pass on their parameters (i.e. after re-execing
        # they take their parameters (and potentially data) from stdin
        .?(\${ <-- HERE 1:?\+.?)?(\$(\@|\*))?/ at /usr/local/bin/checkbashisms line 422, <IN> line 29.
  Unescaped left brace in regex is passed through in regex; marked by <-- HERE in m/
        # Match scripts which use "foo $0 $@ &\nexec true\n"
        # Program name
        \S+\s+

        # As above
        .?\$0.?\s*
        (--\s*)?
        .?(\${ <-- HERE 1:?\+.?)?(\$(\@|\*))?.?\s*\&/ at /usr/local/bin/checkbashisms line 448, <IN> line 29.
  Unescaped left brace in regex is passed through in regex; marked by <-- HERE in m/
        # the exec should either be "eval"ed or a new statement
        (^\s*|\beval\s*[\'\"]|(;|&&|\b(then|else))\s*)

        # eat anything between the exec and $0
        exec\s*.+\s*

        # optionally quoted executable name (via $0)
        .?\$0.?\s*

        # optional "end of options" indicator
        (--\s*)?

        # Match expressions of the form '${1+$@}', '${1:+"$@"',
        # '"${1+$@', "$@", etc where the quotes (before the dollar
        # sign(s)) are optional and the second (or only if the $1
        # clause is omitted) parameter may be $@ or $*. 
        # 
        # Finally the whole subexpression may be omitted for scripts
        # which do not pass on their parameters (i.e. after re-execing
        # they take their parameters (and potentially data) from stdin
        .?(\${ <-- HERE 1:?\+.?)?(\$(\@|\*))?/ at /usr/local/bin/checkbashisms line 422, <IN> line 20.
  Unescaped left brace in regex is passed through in regex; marked by <-- HERE in m/
        # Match scripts which use "foo $0 $@ &\nexec true\n"
        # Program name
        \S+\s+

        # As above
        .?\$0.?\s*
        (--\s*)?
        .?(\${ <-- HERE 1:?\+.?)?(\$(\@|\*))?.?\s*\&/ at /usr/local/bin/checkbashisms line 448, <IN> line 20.
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking use of S3 registration ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking line endings in shell scripts ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking line endings in Makefiles ... OK
* checking compilation flags in Makevars ... OK
* checking for GNU extensions in Makefiles ... OK
* checking for portable use of $(BLAS_LIBS) and $(LAPACK_LIBS) ... OK
* checking use of PKG_*FLAGS in Makefiles ... OK
* checking use of SHLIB_OPENMP_*FLAGS in Makefiles ... OK
* checking pragmas in C/C++ headers and code ... NOTE
File which contains pragma(s) suppressing diagnostics:
  ‘src/dataset.cpp’
* checking compilation flags used ... OK
* checking compiled code ... OK
* checking installed files from ‘inst/doc’ ... OK
* checking files in ‘vignettes’ ... OK
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘testthat.R’ [39s/39s]
 OK
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in ‘inst/doc’ ... OK
* checking re-building of vignette outputs ... OK
* checking PDF version of manual ... OK
* checking for non-standard things in the check directory ... OK
* checking for detritus in the temp directory ... OK
* DONE

Status: 4 NOTEs
See
  ‘/arrow/r/check/arrow.Rcheck/00check.log’
for details.

Error in gsub("\r\n", "\n", str, fixed = TRUE) : 

fine (link: https://github.com/ursacomputing/crossbow/runs/2356493788?check_suite_focus=true):

── R CMD build ─────────────────────────────────────────────────────────────────
* checking for file ‘.../DESCRIPTION’ ... OK
* preparing ‘arrow’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* running ‘cleanup’
* installing the package to build vignettes
* creating vignettes ... OK
* cleaning src
* running ‘cleanup’
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘arrow_3.0.0.9000.tar.gz’

── R CMD check ─────────────────────────────────────────────────────────────────
* using log directory ‘/arrow/r/check/arrow.Rcheck’
* using R version 4.0.4 (2021-02-15)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8
* using options ‘--run-donttest --as-cran’
* checking for file ‘arrow/DESCRIPTION’ ... OK
* this is package ‘arrow’ version ‘3.0.0.9000’
* package encoding: UTF-8
* checking CRAN incoming feasibility ... NOTE
Maintainer: ‘Neal Richardson <neal@ursalabs.org>’

Version contains large components (3.0.0.9000)
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking serialization versions ... OK
* checking whether package ‘arrow’ can be installed ... OK
* checking installed package size ... NOTE
  installed size is 62.8Mb
  sub-directories of 1Mb or more:
    libs  58.6Mb
    R      3.6Mb
* checking package directory ... OK
* checking for future file timestamps ... OK
* checking ‘build’ directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... NOTE
  Unescaped left brace in regex is passed through in regex; marked by <-- HERE in m/
        # the exec should either be "eval"ed or a new statement
        (^\s*|\beval\s*[\'\"]|(;|&&|\b(then|else))\s*)

        # eat anything between the exec and $0
        exec\s*.+\s*

        # optionally quoted executable name (via $0)
        .?\$0.?\s*

        # optional "end of options" indicator
        (--\s*)?

        # Match expressions of the form '${1+$@}', '${1:+"$@"',
        # '"${1+$@', "$@", etc where the quotes (before the dollar
        # sign(s)) are optional and the second (or only if the $1
        # clause is omitted) parameter may be $@ or $*. 
        # 
        # Finally the whole subexpression may be omitted for scripts
        # which do not pass on their parameters (i.e. after re-execing
        # they take their parameters (and potentially data) from stdin
        .?(\${ <-- HERE 1:?\+.?)?(\$(\@|\*))?/ at /usr/local/bin/checkbashisms line 422, <IN> line 29.
  Unescaped left brace in regex is passed through in regex; marked by <-- HERE in m/
        # Match scripts which use "foo $0 $@ &\nexec true\n"
        # Program name
        \S+\s+

        # As above
        .?\$0.?\s*
        (--\s*)?
        .?(\${ <-- HERE 1:?\+.?)?(\$(\@|\*))?.?\s*\&/ at /usr/local/bin/checkbashisms line 448, <IN> line 29.
  Unescaped left brace in regex is passed through in regex; marked by <-- HERE in m/
        # the exec should either be "eval"ed or a new statement
        (^\s*|\beval\s*[\'\"]|(;|&&|\b(then|else))\s*)

        # eat anything between the exec and $0
        exec\s*.+\s*

        # optionally quoted executable name (via $0)
        .?\$0.?\s*

        # optional "end of options" indicator
        (--\s*)?

        # Match expressions of the form '${1+$@}', '${1:+"$@"',
        # '"${1+$@', "$@", etc where the quotes (before the dollar
        # sign(s)) are optional and the second (or only if the $1
        # clause is omitted) parameter may be $@ or $*. 
        # 
        # Finally the whole subexpression may be omitted for scripts
        # which do not pass on their parameters (i.e. after re-execing
        # they take their parameters (and potentially data) from stdin
        .?(\${ <-- HERE 1:?\+.?)?(\$(\@|\*))?/ at /usr/local/bin/checkbashisms line 422, <IN> line 20.
  Unescaped left brace in regex is passed through in regex; marked by <-- HERE in m/
        # Match scripts which use "foo $0 $@ &\nexec true\n"
        # Program name
        \S+\s+

        # As above
        .?\$0.?\s*
        (--\s*)?
        .?(\${ <-- HERE 1:?\+.?)?(\$(\@|\*))?.?\s*\&/ at /usr/local/bin/checkbashisms line 448, <IN> line 20.
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking use of S3 registration ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking line endings in shell scripts ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking line endings in Makefiles ... OK
* checking compilation flags in Makevars ... OK
* checking for GNU extensions in Makefiles ... OK
* checking for portable use of $(BLAS_LIBS) and $(LAPACK_LIBS) ... OK
* checking use of PKG_*FLAGS in Makefiles ... OK
* checking use of SHLIB_OPENMP_*FLAGS in Makefiles ... OK
* checking pragmas in C/C++ headers and code ... OK
* checking compilation flags used ... OK
* checking compiled code ... OK
* checking installed files from ‘inst/doc’ ... OK
* checking files in ‘vignettes’ ... OK
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘testthat.R’ [23s/23s]
 OK
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in ‘inst/doc’ ... OK
* checking re-building of vignette outputs ... OK
* checking PDF version of manual ... OK
* checking for non-standard things in the check directory ... OK
* checking for detritus in the temp directory ... OK
* DONE

Status: 3 NOTEs
See
  ‘/arrow/r/check/arrow.Rcheck/00check.log’
for details.

── R CMD check results ─────────────────────────────────── arrow 3.0.0.9000 ────

jennybc commented 3 years ago

I am seeing this intermittently locally as well.

Error in gsub("\r\n", "\n", str, fixed = TRUE) : 
  input string 1 is invalid in this locale
Calls: <Anonymous> ... force -> <Anonymous> -> new_rcmdcheck -> win2unix -> gsub
Execution halted

devtools::session_info("rcmdcheck")
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       macOS Catalina 10.15.7      
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_CA.UTF-8                 
#>  ctype    en_CA.UTF-8                 
#>  tz       America/Vancouver           
#>  date     2021-04-23                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source           
#>  callr         3.6.0      2021-03-28 [1] standard (@3.6.0)
#>  cli           2.4.0.9000 2021-04-22 [1] local            
#>  crayon        1.4.1      2021-02-08 [1] standard (@1.4.1)
#>  desc          1.3.0      2021-03-05 [1] standard (@1.3.0)
#>  digest        0.6.27     2020-10-24 [1] CRAN (R 4.0.2)   
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.0.2)   
#>  pkgbuild      1.2.0      2020-12-15 [1] CRAN (R 4.0.2)   
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.0.0)   
#>  processx      3.5.0      2021-03-23 [1] standard (@3.5.0)
#>  ps            1.6.0      2021-02-28 [1] standard (@1.6.0)
#>  R6            2.5.0      2020-10-28 [1] CRAN (R 4.0.2)   
#>  rcmdcheck     1.3.3.9000 2020-12-08 [1] local            
#>  rprojroot     2.0.2      2020-11-15 [1] CRAN (R 4.0.2)   
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.0.0)   
#>  withr         2.4.1      2021-01-26 [1] standard (@2.4.1)
#>  xopen         1.0.0      2018-09-17 [1] CRAN (R 4.0.0)   
#> 
#> [1] /Users/jenny/Library/R/4.0/library
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library

Created on 2021-04-23 by the reprex package (v2.0.0.9000)

turgut090 commented 3 years ago

This started happening on Linux. On macOS and Windows, GitHub Actions does not fail with this error.

Fablepongiste commented 3 years ago

This is also happening with recent versions of R, 4.0.2 for example, on Linux.

Error in gsub("\r\n", "\n", str, fixed = TRUE) : 
  input string 1 is invalid in this locale
Calls: <Anonymous> ... force -> <Anonymous> -> new_rcmdcheck -> win2unix -> gsub

What is particularly annoying is that it seems to happen at random.

Is it being looked at?

gaborcsardi commented 3 years ago

Is it being looked at?

I suggest you use the workaround above on GHA, until a new version of rcmdcheck gets to CRAN.

infotroph commented 3 years ago

I believe this is an issue in processx that has been fixed in the development version: https://github.com/r-lib/processx/issues/298

gaborcsardi commented 3 years ago

@infotroph Oh, right, that would make a lot of sense indeed.

We can still leave this open: we can make that gsub() call more robust, and if we do not emit \r\n line endings, then we don't even need it.
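As an illustration of what "more robust" could mean here, one option is to do the replacement byte-wise. This is a sketch under assumptions, not the actual rcmdcheck code: `win2unix_bytes()` is a hypothetical name, and the only change from the call in the error message is the added `useBytes = TRUE` argument of base R's `gsub()`.

```r
# Hypothetical variant of win2unix(); not the actual rcmdcheck code.
win2unix_bytes <- function(str) {
  # With useBytes = TRUE, gsub() matches byte by byte instead of
  # character by character, so input that is not valid in the
  # current locale should no longer stop the replacement.
  gsub("\r\n", "\n", str, fixed = TRUE, useBytes = TRUE)
}

# Input containing a stray 0xff byte, which is never valid UTF-8.
bad <- rawToChar(as.raw(c(0x61, 0xff, 0x0d, 0x0a, 0x62)))
charToRaw(win2unix_bytes(bad))
```

Because `"\r\n"` is plain ASCII, byte-wise matching finds the same line endings as character-wise matching. It is still worth checking how the result's encoding is marked before passing it on to code that expects UTF-8.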