Azure / doAzureParallel

An R package that allows users to submit parallel workloads in Azure
MIT License

"The specified pool does not exist" when machine type not available in region(?) #240

Closed amanka closed 6 years ago

amanka commented 6 years ago

The F*v2 series is available in US East, but not US West.

https://azure.microsoft.com/en-us/pricing/details/batch/ (toggle between US East and West to see).

When trying to spin up a pool with a machine type that isn't available in the batch account's region, the failure message is vague. I get the canonical machine type names from here.

I assume this is because of the machine type being unavailable, but I don't have an East US batch account to try it with. If this is the case, a better error message would be helpful. Pages like this are some of the only ways to get a list of acceptable Azure names for doAzureParallel, and they don't indicate the regions for machines.
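As a stopgap, a local pre-flight check before calling makeCluster could catch this earlier. The following is only a minimal sketch: `knownVmSizes` and `isVmSizeListed` are hypothetical names, not part of doAzureParallel, and the table is hand-maintained with illustrative entries (only the F2s_v2 East/West difference comes from the pricing page mentioned above).

```r
# Hand-maintained lookup (hypothetical; entries are illustrative, not authoritative).
# Only the Standard_F2s_v2 difference between regions reflects the pricing page.
knownVmSizes <- list(
  eastus = c("Standard_D2_v2", "Standard_F2s_v2"),
  westus = c("Standard_D2_v2")
)

# Returns TRUE if `vmSize` is listed for `region`; the comparison ignores case.
# Unknown regions return FALSE rather than raising an error.
isVmSizeListed <- function(vmSize, region, lookup = knownVmSizes) {
  sizes <- lookup[[region]]
  !is.null(sizes) && tolower(vmSize) %in% tolower(sizes)
}

inWest <- isVmSizeListed("Standard_F2s_v2", "westus")
inEast <- isVmSizeListed("Standard_F2s_v2", "eastus")
```

A check like this only helps if the table is kept current by hand, which is why a clearer error from the service call is still the better fix.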

Instructions to reproduce the problem, if applicable

Make f2.json:

{
  "name": "hpc4",
  "vmSize": "Standard_F2s_v2",
  "maxTasksPerNode": 1,
  "poolSize": {
    "dedicatedNodes": {
      "min": 1,
      "max": 1
    },
    "lowPriorityNodes": {
      "min": 1,
      "max": 1
    },
    "autoscaleFormula": "QUEUE"
  },
  "containerImage": "rocker/tidyverse:latest",
  "rPackages": {
    "cran": [],
    "github": [],
    "bioconductor": []
  },
  "commandLine": []
}

Try to load it:

> library(doAzureParallel)
Loading required package: foreach
Loading required package: iterators
> setVerbose(TRUE)
> setCredentials("credentials.json")
[1] "Your azure credentials have been set."
> my.cl <- makeCluster("f2.json")
=======================================================================================================
Name: hpc4
Configuration:
    Docker Image: rocker/tidyverse:latest
    MaxTasksPerNode: 1
    Node Size: Standard_F2s_v2
Scale:
    Autoscale Formula: QUEUE
    Dedicated:
        Min: 1
        Max: 1
    Low Priority:
        Min: 1
        Max: 1
=======================================================================================================
Booting compute nodes. . . 
Error in waitForNodesToComplete(poolConfig$name, 60000) : 
  Code: PoolNotFound - Message: en-USCode: PoolNotFound - Message: The specified pool does not exist.

sessionInfo():


R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] doAzureParallel_0.6.2 iterators_1.0.9       foreach_1.4.4        

loaded via a namespace (and not attached):
Error in x[["Version"]] : subscript out of bounds
In addition: Warning message:
In FUN(X[[i]], ...) :
  DESCRIPTION file of package 'yaml' is missing or broken

paselem commented 6 years ago

Thanks for bringing this up, @amanka. Unfortunately, there is no good API for us to call at the moment to find out which VMs are available in which regions. That said, I think we can do a better job of surfacing the right error. When I tried out your config and looked at the traffic, I did see an error in the initial poolCreate call:

HTTP/1.1 400 The value provided for one of the properties in the request body is invalid.

The body of that message should give us a better understanding of the real issue.
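To illustrate the idea, here is a rough sketch of turning a failed pool-creation response into a readable message. It is not the actual doAzureParallel code: `formatPoolCreateError` is a hypothetical helper, and the body shape (`code`, `message$value`, and a `values` list of key/value pairs) as well as the sample values are assumptions about what the parsed response might look like.

```r
# Hypothetical helper: build a readable error from a failed pool-creation call.
# `status` is the HTTP status code; `body` is the already-parsed response body.
# The field names used here are assumptions made for illustration.
formatPoolCreateError <- function(status, body) {
  code <- if (is.null(body$code)) "Unknown" else body$code
  msg <- if (is.null(body$message$value)) "" else body$message$value

  # Flatten optional key/value details into "key: value" strings.
  details <- vapply(
    body$values,
    function(kv) paste0(kv$key, ": ", kv$value),
    character(1)
  )
  suffix <- if (length(details) > 0) {
    paste0(" [", paste(details, collapse = "; "), "]")
  } else {
    ""
  }

  paste0("Pool creation failed (HTTP ", status, ", ", code, "): ", msg, suffix)
}

# Sample body; the code and detail values are made up for the example.
sampleBody <- list(
  code = "InvalidPropertyValue",
  message = list(
    lang = "en-US",
    value = "The value provided for one of the properties in the request body is invalid."
  ),
  values = list(list(key = "PropertyName", value = "vmSize"))
)
errorText <- formatPoolCreateError(400L, sampleBody)
```

Stopping with a message like this right after the create call would avoid falling through to waitForNodesToComplete, where the user only sees PoolNotFound.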

zfengms commented 6 years ago

I will look into fixing the error message.

zfengms commented 6 years ago

Fixed with #241.