ropensci / bibtex

bibtex parser for R
https://docs.ropensci.org/bibtex/
35 stars 12 forks source link

Fix rchk issues #40

Closed matthewwiese closed 1 year ago

matthewwiese commented 2 years ago

Fixes for the the record* functions, rchk's previous output:

Function recordComment                                                                                                                  
  [PB] has negative depth /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:994                                                       
  [PB] has possible protection stack imbalance /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:995                                  

Function recordInclude                                                                                                                  
  [PB] has negative depth /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:986                                                       
  [PB] has possible protection stack imbalance /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:987                                  

Function recordPreamble                                                                                                                 
  [PB] has negative depth /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:1008                                                      
  [PB] has possible protection stack imbalance /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:1009                                 

Function recordString                                                                                                                   
  [PB] has negative depth /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:1001                                                      
  [PB] has possible protection stack imbalance /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:1002   

In addition, some fixes for xx_atobject_entry:

Function xx_atobject_entry
  [UP] unprotected variable object while calling allocating function Rf_allocVector /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:543
  [UP] unprotected variable object while calling allocating function Rf_allocVector /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:547
  [UP] unprotected variable object while calling allocating function Rf_install /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:550
  [UP] unprotected variable object while calling allocating function Rf_setAttrib(V,S:entry,V) /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:550
  [UP] unprotected variable object while calling allocating function Rf_install /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:551
  [UP] unprotected variable object while calling allocating function Rf_setAttrib(V,S:names,V) /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:551
  [UP] unprotected variable object while calling allocating function Rf_install /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:552
  [UP] unprotected variable object while calling allocating function Rf_setAttrib(V,S:key,V) /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:552
  [PB] has negative depth /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:573
  [PB] has possible protection stack imbalance /rchk/packages/build/QSR33JHP/bibtex/src/bibparse.y:575

is now only:

Function xx_atobject_entry
  [PB] has possible protection stack imbalance /rchk/packages/build/T061pscJ/bibtex/src/bibparse.y:572

The remaining error is shared by all xx_atobject_* functions. It's related to this pattern of PROTECTing ans before returning:

_PROTECT( ans = R_NilValue ); 
return ans ;

I'm still studying the code, as well as the Writing R Extensions manual, and so don't immediately have an idea for a fix.

coatless commented 2 years ago

@matthewwiese this is great! Thanks for taking a deeper look into this. I'll try to give more detailed feedback on Wednesday or Friday of this week.

Again, thank you!

matthewwiese commented 2 years ago

No rush! My own bandwidth is limited at the moment due to work + evening classes.

I mentioned in #41 my own puzzlement in terms of tooling. I don't want to make any assumptions about how the development for this package is done, but perhaps updating tooling (among other things) could be a parallel effort to fixing the rchk issues? It might make it easier to maintain good status in CRAN in the future, too.

matthewwiese commented 2 years ago

I've gone through the bison parser and refactored it quite a bit. This fixes all rchk issues that were reported in #33. In addition, normal tests - via devtools::check() - all pass. The rchk log output:

``` source("/rchk/scripts/utils.r"); install_package_libs("/rchk/packages/bibtex_0.4.3.tar.gz") * installing *source* package 'bibtex' ... not using staged install with --libs-only ** using non-staged installation ** libs wllvm -I"/rchk/build/lib/R/include" -DNDEBUG -I../inst/include/ -I/usr/local/include -fPIC -Wall -g -O0 -fPIC -c biblex.c -o biblex.o :812:13: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] if ( ! (yy_state_buf) ) ^ :810:9: note: previous statement is here if ( ! (yy_state_buf) ) ^ biblex.l:567:2: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] const char * token = (const char*)&yytext[0] ; ^ biblex.l:562:5: note: previous statement is here if (!((t == TOKEN_INLINE) || ^ 2 warnings generated. :812:13: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] if ( ! (yy_state_buf) ) ^ :810:9: note: previous statement is here if ( ! (yy_state_buf) ) ^ biblex.l:567:2: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation] const char * token = (const char*)&yytext[0] ; ^ biblex.l:562:5: note: previous statement is here if (!((t == TOKEN_INLINE) || ^ 2 warnings generated. wllvm -I"/rchk/build/lib/R/include" -DNDEBUG -I../inst/include/ -I/usr/local/include -fPIC -Wall -g -O0 -fPIC -c bibparse.c -o bibparse.o wllvm -I"/rchk/build/lib/R/include" -DNDEBUG -I../inst/include/ -I/usr/local/include -fPIC -Wall -g -O0 -fPIC -c init.c -o init.o wllvm -I"/rchk/build/lib/R/include" -DNDEBUG -I../inst/include/ -I/usr/local/include -fPIC -Wall -g -O0 -fPIC -c stretchyList.c -o stretchyList.o wllvm -shared -L/usr/local/lib -o bibtex.so biblex.o bibparse.o init.o stretchyList.o installing to /rchk/packages/libsonly/bibtex/libs * DONE (bibtex) Installed libraries of package bibtex [1] "/rchk/packages/libsonly/bibtex/libs/bibtex.so" Library name (usually package name): bibtex Initialization function: R_init_bibtex Functions: 1 Checked call to R_registerRoutines: 1 ERROR: too many states (abstraction error?) in function strptime_internal Analyzed 104 functions, traversed 1634 states. Rchk version: 3d653b7c8f92dac912532856b55f44d2986c6553 R version: 79889/R Under development (unstable) (2021-01-27 r79889) LLVM version: 10.0.0 ```

The ERROR: too many states (abstraction error?) in function strptime_internal confused me, as I couldn't find a use of strptime_internal in this codebase. Turned out it's related to R's internals; this StackExchange question provides an explanation.

A detail that requires more investigation, however, is this error I encounter when running with garbage collection "torture" (R CMD check --use-gct ./packages/bibtex_0.4.3.tar.gz):

* checking examples ... ERROR
Running examples in ‘bibtex-Ex.R’ failed
The error most likely occurred in:

> ### Name: read.bib
> ### Title: bibtex parser
> ### Aliases: read.bib
> 
> ### ** Examples
> 
> ## this package has a REFERENCES.bib file
> bib <- read.bib( package = "bibtex" )
Error: STRING_ELT() can only be applied to a 'character vector', not a 'char'

It doesn't occur for me in the base repo - mostly because the * checking examples ... line still has yet to complete...

EDIT: gctorture on the base repo passes * checking examples ... with OK leading me to believe there are assumptions being made regarding the use of STRING_ELT() that I have not considered. More investigation required.

coatless commented 2 years ago

@matthewwiese just updated the GitHub action's trigger. I think this should cause the runner to start shortly in the PR.

coatless commented 2 years ago

Please pull in the new updates to master into your branch with:

git fetch --all 
git merge origin/master
coatless commented 2 years ago

@matthewwiese would it be okay if we moved 9e3e606 into a separate PR that deals with upgrading bison?

From #41, the package is designed to be inclusive using a local source approach instead of depending on system libraries. (Hence, we have a bit of an outdated library inside.)

coatless commented 2 years ago

@matthewwiese regarding STRING_ELT(), this was briefly answered by the old version of me here:

https://stackoverflow.com/questions/48666925/meaning-of-string-elt-in-rcpp

I think for a deeper dive, please check hadley's r-internals for a better inner view, specifically:

https://github.com/hadley/r-internals/blob/master/strings.md

matthewwiese commented 2 years ago

Please pull in the new updates to master into your branch

Done!

would it be okay if we moved 9e3e606 into a separate PR that deals with upgrading bison?

Sure thing - I've reverted that commit. Thanks for the clarification.

this was briefly answered by the old version of me here ... I think for a deeper dive, please check hadley's r-internals

That SE answer was quite helpful, and thanks for linking the source of it too. I overlooked that part in the R Extensions manual. I'll also take a look at Hadley's resource you linked, as well.

coatless commented 2 years ago

@matthewwiese I switched the PR branch from merging into rchk-fixes to master. I think this should hopefully trigger the action. If not, I'm a bit stumped.

matthewwiese commented 2 years ago

Strange. Are you able to trigger a workflow manually, or is that not registering too? I haven't a clue since I've only used Actions for simple things myself.

Related to the STRING_ELT confusion, I've tracked down the cause of the error (or at least the first situation which triggers it). This code within xx_atobject_entry:

for( i=0; i<n; i++){
    SET_STRING_ELT( ans  , i, STRING_ELT(CAR(o),0) ) ;
    SET_STRING_ELT( names, i, STRING_ELT(getAttrib(CAR(o), install("names")),0) ) ;
    o = CDR(o ) ;
}

Apparently CAR(o) is returning a CHARSXP when using gctorture. A naive guess was to add the PROTECT_WITH_INDEX/REPROTECT code back in but that does not appear to make a difference. I'll chew on this.

coatless commented 1 year ago

@matthewwiese I'm going to close this PR as we've opted to go the way of an internal R parser. Thank you again for taking a close look at it awhile back.