Azure / doAzureParallel

An R package that allows users to submit parallel workloads in Azure
MIT License

Any way to use Forecast 8.0 package #147

Closed QuantMonk closed 7 years ago

QuantMonk commented 7 years ago

Really love this package, as it speeds up my testing like crazy. But I am really stuck with the Microsoft R Open/MRAN default. For instance, I need a few methods in the latest Forecast 8.0 package (available on CRAN) for a piece of code to work properly. However, I see that Microsoft R Open defaults to an older version of Forecast, which causes my code to break. This is a real bummer. Even though I listed 'forecast' under CRAN in the cluster config JSON file, it seems like it's pulling an older version (7.x).

Is there an easier way to deal with this issue? What would be the easiest way to work with CRAN packages? Is there any way we can work with the standard R version (not the Microsoft R Open version)? I have nothing against Microsoft (I do work there :)), but our entire code base is dependent on the CRAN version of R. Could you please provide some guidelines in the Troubleshooting/FAQ page (a little more elaborate than a single line, please)?

paselem commented 7 years ago

Hello @QuantMonk. First off, thanks for the feedback! We really appreciate it.

We completely understand your issue and are currently working on a feature to help address it, using Docker containers to allow users to bring whichever version of R they care to use. By default we'll be using RStudio's rocker/tidyverse container, which uses CRAN R and adds several common RStudio packages such as devtools, tidyverse and more. We are finishing up the development of this feature, but it's not 100% tested yet. Mostly it seems to work as expected and performance is looking good, but we still have a few more tests to work through before we sign off.

Please feel free to try it out by simply pulling doAzureParallel from the feature/container branch and updating your cluster (see below).

I hope this helps unblock you, and we are working to get this as the default version of doAzureParallel quite soon.

Update your cluster.json file to include a pointer to the containerImage you want to use. If not specified, it will use rocker/tidyverse:latest (which I believe is at 3.4.2). In the example below, I pull the container which has CRAN R 3.4.1 in it.

{
  "name": "myPool",
  "vmSize": "Standard_F2",
  "maxTasksPerNode": 1,
  "poolSize": {
    "dedicatedNodes": {
      "min": 0,
      "max": 0
    },
    "lowPriorityNodes": {
      "min": 1,
      "max": 1
    },
    "autoscaleFormula": "QUEUE"
  },
  "containerImage": "rocker/tidyverse:3.4.1",
  "rPackages": {
    "cran": [],
    "github": [],
    "bioconductor": [],
    "githubAuthenticationToken": ""
  },
  "commandLine": []
}
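For the Forecast case in this thread, the two relevant keys would presumably be combined as in the fragment below. This is only a sketch: it reuses the `containerImage` and `rPackages` keys from the example above and adds `forecast` to the `cran` list, as in the original report. Which Forecast version actually gets installed depends on the repository the chosen image is configured to use, so it is worth checking the version on a worker before relying on it.

```json
{
  "containerImage": "rocker/tidyverse:3.4.1",
  "rPackages": {
    "cran": ["forecast"],
    "github": [],
    "bioconductor": [],
    "githubAuthenticationToken": ""
  }
}
```

The remaining keys (`name`, `vmSize`, `poolSize`, and so on) stay as in the full example.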

Pull doAzureParallel from the container development branch instead of master

install.packages("devtools", dependencies = TRUE)
library(devtools)

# Pull doAzureParallel from the 'feature/container' branch
install_github("azure/doAzureParallel", ref="feature/container")
library(doAzureParallel)

setCredentials("credentials.json")

# Create the pool. This provisions a new pool if yours hasn't already been provisioned.
cluster <- makeCluster("cluster.json")

# Register the pool as your parallel backend
registerDoAzureParallel(cluster)

# Check that your parallel backend has been registered
getDoParWorkers()

# Run a foreach loop and make sure all of the required packages are loaded
results <- foreach(i = 1:3) %dopar% {
  # my algo
}

results
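To confirm which package version the workers actually see (the original problem here was Forecast 7.x being picked up instead of 8.0), something like the following could be run once the backend is registered. It is a minimal sketch: `worker_pkg_version` is a hypothetical helper written for this example, not part of doAzureParallel. It relies only on `foreach`'s `.packages` and `.combine` arguments and on `utils::packageVersion`.

```r
library(foreach)

# Hypothetical helper (not part of doAzureParallel): each task reports the
# version of `pkg` that is installed on the node it runs on.
worker_pkg_version <- function(pkg, n = 3) {
  foreach(i = seq_len(n), .packages = pkg, .combine = c) %dopar% {
    as.character(utils::packageVersion(pkg))
  }
}

# With the Azure backend registered as above:
# worker_pkg_version("forecast")
```

If the returned versions differ from what `packageVersion("forecast")` reports locally, the workers are installing from a different repository or snapshot than your machine.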

Please note that this is a development branch and is not stable, so it can change at any time. I don't expect many breaking changes, but they can happen.

Thanks, -Pablo

QuantMonk commented 7 years ago

This is great, Pablo. For now I unblocked myself by refactoring my code a bit, but I will give this a try soon.

Let me say this one more time: amazing work, guys! Really loving it. The compute nodes were all there before, but it was such a pain to get them clustered and running. These libraries save so much time and speed up the whole testing process.

paselem commented 7 years ago

@QuantMonk The container feature is now live as version v0.6.0. Let us know what you think!