Closed HenrikBengtsson closed 7 years ago
You are right, I am too restrictive regarding variable names. I always forget about the infix operators.
The reasoning for not storing the variable inside the RDS is that you have to read all files just to get the names, i.e. each time you call batchExport
.
make.names()
is not bijective, I would need something like a base32 encoder/decoder.
I guess I'll just store the real names in the registry and use a random string for the file names.
I've changed my mind. I think I will go with base32. See here: https://github.com/mllg/batchtools/pull/89
This is good news.
The reasoning for not storing the variable inside the RDS is that you have to read all files just to get the names, i.e. each time you call
batchExport
.
I didn't consider that batchtools might need to see what's already exported.
FYI, I've used utils::URLencode()
/ utils::URLdecode()
in future.BatchJobs as a workaround for a long time now and it's been working very well. Haven't had any issues. My point is, I think your approach will work without surprises.
base32 encoding is merged. Should now work with all object names.
Awesome. I've verified this with makeClustersFunctionInteractive(external = TRUE)
and makeClustersFunctionTORQUE()
.
Background
It's not possible to export objects with other than "basic" names (= "according to R's variable naming rules"). For example,
gives
User's can somewhat workaround this by exporting as
batchExport(export = list(my_exports = objs))
and then add something likemapply(names(exports), exports, FUN = assign, envir = .GlobalEnv)
but that's tedious, error prone, and not very efficient.Simple suggestion: Reference / define "R's variable naming rules"
It's probably not fully clear to the user what "R's variable naming rules" is. I quickly tried to find a formal reference, but "all" I found was from
?make.names
:If that's your definition, I think it would be useful to reference that in from
?checkmate::assertList
but also from?batchtools::batchExport
.I think it would also be helpful to give some examples of valid and non-valid names in
?batchtools::batchExport
. This would lower the risk for run-time surprises.Actual suggestion: Why not support all variable names?
One might want to export objects with "non-standard variable names", e.g. locally defined infix or operator functions, e.g.
`%my_op%`
orc_{i}
. Although these are unusual, they are certainly not unlikely.As far as I understand it, the main reason for the current naming restriction is that the name of the exported object is restricted to what can be represented in a filename (roughly POSIX minus some symbols). This comes from
loadRegistryDependencies()
:However, if the name would be encoded in the actual filename, but instead be stored in the saved object, then this wouldn't be a limitation, e.g.
I understand the history / legacy / convenience of having file names reflect the variable name, but is it really needed. Could you use a unique hash code for filename instead (as it looks like you're doing in the
logs
(?) directory. Alternatively, they could be incremental indices, or do something likebase::make.names(names, unique = TRUE)
allowing for some information in the filenames.See also
This looks identical to what's going on in BatchJobs, cf. https://github.com/tudo-r/BatchJobs/issues/93
PS. Much of my feedback is now coming in as I've started to implement the future.batchtools wrapper and running initial tests on batchtools via that framework. Please, treat my issues / feedback / suggestions as what they are and feel free to discuss and say you disagree. You're actions / reply should not be the same regardless of me being a JOSS reviewer or not. I'm pretty sure you're doing just that, but I wanted to make it explicit it does not affect the JOSS review.