Azure / doAzureParallel

An R package that allows users to submit parallel workloads in Azure
MIT License

Nodes fail to start when loading package from CRAN #360

Open benscarlson opened 4 years ago

benscarlson commented 4 years ago

Hello,

I am able to run the sample "getting started" code, but when I attempt to load a package from CRAN, my nodes fail to start. Can anyone show me what I might be doing wrong?

Here is my cluster file. It is the same file as in "getting started", except that I've added the "hypervolume" package.

{
  "name": "hv",
  "vmSize": "Standard_D2_v2",
  "maxTasksPerNode": 2,
  "poolSize": {
    "dedicatedNodes": {
      "min": 0,
      "max": 0
    },
    "lowPriorityNodes": {
      "min": 5,
      "max": 10
    },
    "autoscaleFormula": "QUEUE"
  },
  "containerImage": "rocker/tidyverse:latest",
  "rPackages": {
    "cran": ["hypervolume"],
    "github": [],
    "bioconductor": []
  },
  "commandLine": [],
  "subnetId": ""
}

I run the following:

library(doAzureParallel)
setCredentials('credentials.json')
cluster <- makeCluster("cluster.json")

But all my nodes fail to boot. Here is an example:

1: In .showNodesFailure(nodesWithFailures) :
  The following 1 nodes failed while running the start task:
tvm-2487789449_4-20190722t172941z-p

This package has some other dependencies, including rgeos, which can be finicky to install. I've tried specifying just rgeos by itself, but that also fails. Is there a better way to install packages?

Thank you!

zerweck commented 4 years ago

From my personal experience: when package installation looks like the problem, try building a custom Docker image with your packages pre-installed, as described here: https://github.com/Azure/doAzureParallel/blob/master/docs/33-building-containers.md
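As a rough illustration of that approach, a Dockerfile along these lines could extend the same base image used in the cluster file above. This is only a sketch: the list of system libraries is my assumption about what rgeos and its relatives need in order to compile, not something confirmed in this thread, so adjust it to whatever the build log complains about.

```dockerfile
# Sketch only. Start from the image already named in cluster.json.
FROM rocker/tidyverse:latest

# ASSUMPTION: these development libraries cover the native dependencies
# (GEOS, GDAL, PROJ) that spatial packages such as rgeos typically need.
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
       libgeos-dev libgdal-dev libproj-dev \
    && rm -rf /var/lib/apt/lists/*

# Install the package at image build time. install.packages() only warns on
# failure, so load the package afterwards to make a broken install fail the build.
RUN R -e "install.packages('hypervolume', repos = 'https://cloud.r-project.org')" \
    && R -e "library(hypervolume)"
```

Once the image builds locally and is pushed to a registry, "containerImage" in the cluster file would point at it (for example a hypothetical `youruser/hv-r:latest`), and the "cran" list could be left empty because the package is already in the image. Building locally also has the advantage that any compile error shows up in the build output instead of as a failed start task on the node.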