Closed viking closed 6 years ago
This output appears consistent with the rchk documentation (emphasis added):
Note that sometimes the tool can give surprising annotations. A particular case is that it conservatively assumes that any call to external code may allocate from the R heap, which is often not the case.
Thanks, Joshua, that's a perfect answer - this is probably a false alarm for this reason and hence safe to ignore (if you're sure the function is external and would never call back into any function from R, which I assume is the case here, please ignore this report).
Thanks for your reply. I'm happy to ignore the warning, however, the CRAN maintainers seem to be relying on rchk for package submissions. I want to avoid having to make a case each time I need to update the yaml package.
I could simply add extraneous PROTECT
and UNPROTECT
calls before and after using 3rd party library functions. I thought it best to open an issue here to see if the rchk tool could be improved in order to avoid this kind of thing. I don't want to add extra code unnecessarily.
Note that R_yoink
is an allocating function which may also be called when names
is in scope, to fix that warning you would have to protect anyway, and this will make the false alarms go away (it is common to protect when you allocate and unprotect when it is clear the variable may no longer be used, so you'll cover the calls to external functions incorrectly deemed as allocating, anyway).
The tool cannot do much about external code it does not know. It could make the opposite guess -- assume that external code does not allocate, but it would then miss real errors in other cases.
If you were curious, you can find more about the tool at:
https://github.com/kalibera/cran-checks/blob/master/rchk/PROTECT.md https://github.com/kalibera/rchk/blob/master/README.md (and linked documents)
In this case, the third party code is actually packaged up inside the R package. The rchk tool has access to it for analysis.
Isn't that in the YAML library? (external)
The YAML library is built alongside the R extension:
https://github.com/viking/r-yaml/tree/master/pkg/src
The source files with names that are not prefixed with r_
belong to libyaml. This is currently necessary because there's a patch.
I see, so I've traced the detailed decisions of the tool in this case and it treats yaml_emitter_flush
as an allocating function, because this function calls another function via a function pointer emitter->write_handler
. The tool conservatively assumes that calls via a function pointer allocate. The tool does not know which function will be called, so it can't do better and this is by design. Extending the analysis so that the tool would know which functions are called via pointers will be too complicated (and won't work in most cases anyway). The tool treats yaml_emitter_emit
as an allocating function, because it (transitively) calls into yaml_emitter_flush
.
Okay, that makes sense. I was able to silence the warnings by adding more GC protection. I was under the impression that the names
attribute was protected under the umbrella of a protected object, that's why I didn't protect it before.
Thanks! Re names
, it depends on the type of the object, sometimes the names
attribute is implicitly protected (e.g. currently for vectors, vector lists, etc), but sometimes not (an obvious example are objects represented using pairlists). I think one should protect names
always to make the code robust to future changes in R (e.g. need for some kind of a new conversion to retrieve attributes) or changes in the application. Particularly with the R API which does not have explicit representation of R types (just SEXP
), it is often hard to see on which object types getAttrib
can be called at particular code location. rchk
in some cases can derive these types and won't issue a warning, but usually it can't and it would be too difficult to support that. Doing so manually is often very hard as well, as it involves whole program analysis.
In the yaml package, rchk lists the following problems:
https://raw.githubusercontent.com/kalibera/cran-checks/master/rchk/results/yaml.out
It notes that
yaml_emitter_emit
is an allocation function that could cause R to run garbage collection and therefore destroy an unprotected object. However, this function does not use any R functions whatsoever. Does rchk look for anymalloc
call or just R-specific allocation functions?Thanks in advance.