Closed jcheng5 closed 7 months ago
Hi Joe, this is the direct result of mirai doing this, as documented in mirai()
:
The expression '.expr' will be evaluated in a separate R process in a clean environment, which is not the global environment, consisting only of the named objects passed as '...' and/or the list supplied to '.args'.
The point of interest being - not the global environment. Hence it requires that func_a
be passed as an argument of func_b
, or else func_a
explicitly assigned to .GlobalEnv
in which case it will be picked up.
The reasoning behind this: (i) lowers the chance of accidental global variables left behind in .GlobalEnv
if cleanup
is not performed, and (ii) slightly opinionated take that this is better practice.
I do see the benefit to allowing this, and at one point I was erring towards having the ...
arguments assigned to .Globalenv
, but not the variables passed via .args
- so as to be able to differentiate.
Do you think it would be useful to meet this expectation? @wlandau had a preference for the current behaviour for crew
users, but they would be more of a niche bunch I guess.
I do see the benefit to allowing this, and at one point I was erring towards having the ... arguments assigned to .Globalenv, but not the variables passed via .args - so as to be able to differentiate.
I found the original discussion for this: #55 - I raised this myself nearly one year ago.
I've now proceeded to implement this in 647e65d: '...' arguments go to .GlobalEnv
in the daemon process, '.args' arguments stay local. Best of both worlds. This is also very unlikely to be breaking (apart from as yet unidentified corner cases).
This means that the more casual usage such as your example above now 'just works'.
I've also updated the examples, as it is now possible to source()
script files with or without local = TRUE
.
@wlandau please note this change. All crew
tests continue to pass, including the manual ones, but please check on your side as well and it may allow a more efficient mapping to your data
and globals
arguments.
Would it be possible to add an alternative argument to specify local variables? I feel this is important for efficient memory management in crew
.
controller$push()
calls mirai::mirai()
as below:
task <- mirai::mirai(
.expr = expr_crew_eval,
name = name,
command = command,
string = string,
data = data,
globals = globals,
seed = seed,
algorithm = algorithm,
packages = packages,
library = library,
.timeout = .timeout,
.compute = private$.client$name
)
crew
's documentation currently promises users that the data
argument is for local variables and only globals
are assigned to the global environment.
From https://github.com/shikokuchuo/mirai/discussions/55#discussioncomment-9059713, it should be enough to switch crew
to using .args
.
Switched to .args
in https://github.com/wlandau/crew/commit/bb75dd76b5697b9142778d1afe9f826028a9af99. All tests pass before and after using development nanonext
/mirai
.
Note that when using ...
for this, you will not be able to use "globals" with names that are the same as the arguments of mirai::mirai()
, i.e.
> names(formals(mirai::mirai))
[1] ".expr" "..." ".args" ".timeout" ".compute"
Maybe something like:
Would it be possible to add an alternative argument to specify local variables?
with .locals
and .globals
would be better and would avoid the above name-clash problem.
Thanks Henrik, that’s noted. The ‘…’ interface has been there from day 1 and not about to change. The mitigants are the minimal nature of the interface and the '.' prefixes to the arguments.
The only reason the fix for this issue was so straightforward is because it is completely back-compatible.
I was expecting the following to work, but it gives the error
'miraiError' chr Error in func_a(): could not find function "func_a"
Regardless of whether this particular incantation should work or not, is this a scenario you think of as within scope for mirai? I saw in
?mirai
that you have some examples where you domirai(source(...), local=TRUE)
, is that what you were envisioning people would do if they have code broken into functions but not in a package?