Closed kingaa closed 5 years ago
In principle, you can ignore these warnings from the tool, it just tells the tool cannot understand this code properly. In this case the report is from [UP]
, which is checking for individual unprotected variables, and that part of the tool cannot handle loops where the protection stack depth increases with every loop iteration. Other parts of the tool, e.g. checking for protection balance ([PB]
), can do it. I am afraid that the type of modelling that [UP]
is doing cannot be fixed to support this pattern.
But as you showed this code, please note that the unbounded protection is not necessary. Only the head of a pairlist, in this example of args
, needs to be protected at any time. You can use REPROTECT
to achieve that or you can have a dummy cons-cell (an extra stable head of the pairlist) that will stay protected for the whole duration of the loop, and args
would be linked to that dummy cell. Note also that var
does not have to be protected as soon as it gets into a protected cons cell (protected via args
), so it does not have to stay explicitly protected after the body loop exits. Btw also note the function installChar
. Typically, nprotect
is used in code where it is cognitively too diffucult to track what is the stack depth, but the stack depth is still constant. But of course if you know that the stack depth increase will be very small, that the length of names
will be very small, the code should work fine.
@kalibera: thanks for the timely response and the useful tips.
It might be too much to ask (just say so if it is), but I wonder if you could show me, in the context of the above snippet of C code, how to use the REPROTECT facility or the dummy cons cell approach. Again, no worries if you're too busy, etc.
No problem, you can use REPROTECT
like this:
PROTECT_INDEX px;
PROTECT_WITH_INDEX(args = LCONS(AS_LOGICAL(log),args), &px);
SET_TAG(args,install("log"));
for (v = LENGTH(names)-1; v >= 0; v--) {
PROTECT(var = NEW_NUMERIC(1));
REPROTECT(args = LCONS(var,args), px);
UNPROTECT(1); /* var */
SET_TAG(args,installChar(STRING_ELT(names,v)));
}
UNPROTECT(1); /* args */
See Writing R Extensions for more details.
In principle you don't have to protect var
, because LCONS
is callee-protect, so that the code can be simplified further. But there is no harm protecting it, and one then does not have to worry which functions are callee-protect:
PROTECT_INDEX px;
PROTECT_WITH_INDEX(args = LCONS(AS_LOGICAL(log),args), &px);
SET_TAG(args,install("log"));
for (v = LENGTH(names)-1; v >= 0; v--) {
var = NEW_NUMERIC(1);
REPROTECT(args = LCONS(var,args), px);
SET_TAG(args,installChar(STRING_ELT(names,v)));
}
UNPROTECT(1); /* args */
Anyway, with that assumption if you wanted to avoid REPROTECT
, you could also do this:
PROTECT(args = LCONS(AS_LOGICAL(log),args));
SET_TAG(args,install("log"));
for (v = LENGTH(names)-1; v >= 0; v--) {
var = NEW_NUMERIC(1);
args = LCONS(var,args);
UNPROTECT(1); /* old args */
PROTECT(args);
SET_TAG(args,installChar(STRING_ELT(names,v)));
}
UNPROTECT(1); /* args */
Excellent! Thanks so much, @kalibera! This will jump-start me in simplifying the code.
Say, do you know what the performance overhead associated with PROTECT, UNPROTECT, and REPROTECT is?
I'd say that unbounded protection due to a loop increasing the protection stack depth is always good to avoid. For the remaining, the performance difference between REPROTECT
ing the item on the very top of the stack and a call to UNPROTECT(x); PROTECT(x)
is so small compared to other performance costs in any program, that one should choose the variant that makes the code more readable. In this case it may be a matter of taste, I myself prefer the UNPROTECT(x); PROTECT(x)
. Other than the unbounded protection, I don't think that in a user program one would ever get to a situation when the direct execution time overhead of PROTECT
, UNPROTECT
and REPROTECT
would be relevant, so the only constraints should be correctness, robustness and readability of the code.
Running rchk on my package pomp, I get several messages like the following.
These arise in constructions such as the following (this one is the relevant code chunk from dprior.c mentioned above).
As you can see, this function is designed to take a arbitrary-length vector of variable names and construct a call from it. This is extremely useful and, I believe, totally legitimate.
My question is as to how I should interpret the message thrown by rchk. I suspect they reflect the inability of current rchk codes to follow the loop of arbitrary length. Is this correct? If so, do you have any plans or see any way to extend rchk's capacity in this way?