wvengen / Rlgi

R interface to the Leiden Grid Infrastructure
http://wiki.biggrid.nl/wiki/index.php/LGI/Rlgi

README

The Rlgi package is an extension to the R programming language that allows R users to programmatically integrate their R models with the Leiden Grid Infrastructure (LGI).

This implementation of Rlgi is an adaptation of the Rsge package (which is in turn based on the Rlsf package, and so on). Thanks to all the authors of these packages for making this possible.

The program functions as follows:

  1. Call lgi.par(L|S|C|R)apply(X, fun, ..., join.method=cbind, njobs) or lgi.apply.
  2. The data object is split as specified by njobs or batch.size
  3. Each data segment, along with the function and argument list (and optionally global variables and library names), is saved to a shared network location using the save call.
  4. Each worker process loads the saved file, creates a new environment, executes the saved function call (FUN), and saves the results to the shared network location.
  5. The master script uses join.method to merge the results together and returns the result.

(lgi.submit is similar, but does not split data and is asynchronous)
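
The five steps above can be sketched in base R. This is only an illustration under stated assumptions, not Rlgi's implementation: `sketch_par_sapply` is a hypothetical helper, a temporary directory stands in for the shared network location, the "workers" run one after another in the same R process instead of on the grid, and the default `join.method` is `c` (rather than the `cbind` shown in the signature above) so that a plain vector comes back.

```r
# Illustrative only: mimics the split / save / load / apply / join cycle.
# sketch_par_sapply is a hypothetical helper and is NOT part of Rlgi.
sketch_par_sapply <- function(X, FUN, ..., njobs = 2, join.method = c,
                              shared.dir = tempfile("shared")) {
  # A temporary directory stands in for the shared network location.
  dir.create(shared.dir)
  on.exit(unlink(shared.dir, recursive = TRUE))

  # Step 2: split the data into at most njobs contiguous segments.
  njobs <- min(njobs, length(X))
  groups <- ceiling(seq_along(X) * njobs / length(X))
  segments <- split(X, groups)

  # Step 3: save each segment with the function and its argument list.
  args <- list(...)
  job.files <- vapply(seq_along(segments), function(i) {
    path <- file.path(shared.dir, sprintf("job-%03d.RData", i))
    job <- list(segment = segments[[i]], fun = FUN, args = args)
    save(job, file = path)
    path
  }, character(1))

  # Step 4: a "worker" loads its file into a new environment, runs the
  # saved function over the segment, and saves the result next to the job.
  run_worker <- function(path) {
    env <- new.env()
    load(path, envir = env)
    job <- env$job
    result <- do.call(sapply, c(list(job$segment, job$fun), job$args))
    result.path <- sub("\\.RData$", "-result.RData", path)
    save(result, file = result.path)
    result.path
  }
  result.files <- vapply(job.files, run_worker, character(1))

  # Step 5: the master loads every result and merges them with join.method.
  results <- lapply(result.files, function(path) {
    env <- new.env()
    load(path, envir = env)
    env$result
  })
  do.call(join.method, unname(results))
}

cubes <- sketch_par_sapply(10:20, function(x) x^3, njobs = 3)
```

Here the eleven inputs are split into three segments, and `cubes` should match what a plain `sapply(10:20, function(x) x^3)` returns. Passing `join.method = cbind` instead would bind the per-segment results as columns, which suits functions that return equally sized vectors per segment.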

IMPORTANT NOTE: This is no more than a prototype version that works with simple cases. Someone with some R knowledge should probably make it more robust (for large datasets, for example).

LICENSE

This code is licensed under the GPL. Use it, modify it, change it, but most of all, enjoy it.

INSTALLATION

To install this package, first ensure that snow and RCurl are installed, then run the following.

R CMD INSTALL Rlgi

This package requires that the LGI user configuration is set up in ~/LGI.cfg or ~/.LGI. Please refer to QUICKSTART and the documentation for lgi.readConfig() for more information.

Nodes where this package is installed are referred to as submit nodes. Nodes where this package will distribute data for computation are referred to as compute nodes.

SubmitNode:

Usage

To get more information about how to use this package, either see:

Configuration:

For more on global configuration, see help(lgi.options)
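
As a small sketch of how a session-wide default can be set and later undone, the snippet below uses only base R's `options()` and `getOption()`. The option name `lgi.application` and the value `'R-2.12'` are taken from the example session; that Rlgi reads the option through this standard mechanism is an assumption here, and help(lgi.options) remains the authoritative list of settings.

```r
# Set a session default; options() returns the previous values invisibly,
# so they can be restored later.
old <- options(lgi.application = "R-2.12")

# Read the value back. The second argument is the fallback that is
# returned when the option has not been set.
current <- getOption("lgi.application", default = NA)

# Restore whatever was configured before (unset, in a fresh session).
options(old)
```

Saving the return value of `options()` and passing it back later is a common base R pattern for scripts that should not leave changed settings behind.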

Example session

    > library(Rlgi)
    Loading required package: snow
    Loading required package: XML
    Welcome to Rlgi Version: 0.0.7

    > # set defaults, when needed
    > options(lgi.application='R-2.12')

    > # non-cluster-aware call
    > sapply(c(10:20), function(x){x^3})
    [1] 1000 1331 1728 2197 2744 3375 4096 4913 5832 6859 8000

    > # use Rlgi's function to do sapply, but run locally for testing
    > lgi.parSapply(c(10:20), function(x){x^3}, cluster=FALSE)
    [Rlgi] Running locally
    [1] 1000 1331 1728 2197 2744 3375 4096 4913 5832 6859 8000

    > # now run remotely, blocking until all computations are finished
    > lgi.parSapply(c(10:20), function(x){x^3})
    [Rlgi] Submitting 1 jobs...
    [Rlgi] All jobs submitted, waiting for completion.
    [Rlgi] Waiting for jobs: 0 queued; 0 running; 1 other
    [1] 1000 1331 1728 2197 2744 3375 4096 4913 5832 6859 8000