Add an exported, documented function to RHIPE (call it, say, rhipeDebugInfo()) that runs a MapReduce job and returns, for each map task, a list with the following (see the sketch after this list):
hostname: host name of the machine - I suppose through something like system("hostname", intern = TRUE)
user: name of the user running the R process - use something like system("whoami", intern = TRUE)
env: a list of environment variables, through Sys.getenv()
libPaths: the result of calling .libPaths()
sessionInfo: the result of calling sessionInfo()
Anything else?
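
As a rough sketch, the map side of such a job could gather exactly those pieces. This assumes RHIPE's rhcollect() for emitting key-value pairs from a map expression; keying by hostname makes it easy to collapse duplicate tasks from the same node later:

```r
# Sketch of the map expression: each task gathers diagnostics about its node.
map <- expression({
  info <- list(
    hostname    = system("hostname", intern = TRUE),
    user        = system("whoami", intern = TRUE),
    env         = as.list(Sys.getenv()),
    libPaths    = .libPaths(),
    sessionInfo = sessionInfo()
  )
  # Key by hostname so duplicate tasks on the same node collapse cleanly.
  rhcollect(info$hostname, info)
})
```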
The trick is to make it run a dummy job that we can ensure will run on all nodes of the cluster, but with only a small number (preferably one) of tasks per node.
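
One possible shape for the driver, sketched under assumptions: rhwatch() with a numeric lapply-style input to oversubscribe dummy tasks across the cluster, a map-only configuration, and deduplication by hostname on read-back. The function name, the input spec, and whether oversubscription actually reaches every node are all assumptions, since per-node task placement is exactly the open question here:

```r
# Hypothetical driver sketch; argument names assume RHIPE's rhwatch() API,
# and the dummy-input trick does not guarantee one task per node.
rhipeDebugInfo <- function(output = "/tmp/rhipeDebugInfo", ntasks = 100) {
  res <- rhwatch(
    map      = map,                # the map expression sketched above
    input    = ntasks,             # dummy lapply-style input to spread tasks around
    output   = output,
    mapred   = list(mapred.reduce.tasks = 0),  # map-only job
    readback = TRUE
  )
  # Keep one record per node, keyed by hostname.
  keys <- sapply(res, function(kv) kv[[1]])
  res[!duplicated(keys)]
}
```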
This will be a useful function to expose to RHIPE users: something anyone can run to figure out why something might not be working, or to provide details about their system.