Hi @MilesAheadAlso - I am pretty sure that the issue is the machine size you're using. Currently our backend does not support any of the 'S' type Virtual Machines. Can you try Standard_D2_v3 instead?
Hi Pablo
I tried that VM, same result. Below are the RStudio console output, the cluster.json, and a screenshot from the Azure console.
RStudio Console
library(doAzureParallel)
Loading required package: foreach
foreach: simple, scalable parallel programming from Revolution Analytics
Use Revolution R for scalability, fault tolerance and more. http://www.revolutionanalytics.com
Loading required package: iterators

# 3. Set your credentials - you need to give the R session your
#    credentials to interact with Azure
setCredentials("credentials.json")
[1] "Your azure credentials have been set."

# 4. Register the pool. This will create a new pool if your pool hasn't
#    already been provisioned.
startTime <- Sys.time()
cluster <- makeCluster("smallVM_cluster.json")
Booting compute nodes. . .
|===========================================================================| 100%
Your cluster has been registered.
Dedicated Node Count: 1
Low Priority Node Count: 0
Warning message:
In waitForNodesToComplete(poolConfig$name, 60000) :
  The following 1 nodes failed while running the start task: tvm-1763885094_1-20171104t101003z

endTime <- Sys.time()
difftime(endTime, startTime)
Time difference of 10.10355 mins

# 5. Register the pool as your parallel backend
registerDoAzureParallel(cluster)

# 6. Check that your parallel backend has been registered
getDoParWorkers()
[1] 11

nItems <- 20
results <- foreach(i = 1:nItems) %dopar% {
  itemHistory <- subset(History, ForecastItem == ForecastItem[i])
  my_single_function(itemHistory, fcstList,
                     fcstOffset, fcstPeriods, fcstSeason,
                     dateMin, dateMax, weekDays,
                     bWrite, bClean, cAccuracy)
}
Job Summary:
  Id: job20171104101832
Waiting for tasks to complete. . .
|                                                                           |   0%
smallVM_cluster.json
{
  "name": "smallBaxterPool",
  "vmSize": "Standard_D2s_v3",
  "maxTasksPerNode": 1,
  "poolSize": {
    "dedicatedNodes": { "min": 1, "max": 1 },
    "lowPriorityNodes": { "min": 0, "max": 10 },
    "autoscaleFormula": "QUEUE"
  },
  "rPackages": {
    "cran": ["TTR", "forecast", "seasonal", "dplyr", "forecastHybrid", "nnet", "foreach", "doParallel"],
    "github": ["EdwinTh/padr"],
    "githubAuthenticationToken": ""
  },
  "commandLine": []
}
Sorry Pablo, I realized that I had not saved smallVM_cluster.json, so I used the D2s_v3 VM again. I've saved the JSON file and started the process again. My bad.
I get exactly the same result with D2_V3
Now trying with a D11
OK, so I've tried lots of stuff. It comes down to loading some packages. I don't know for certain all the packages that create a problem, but padr definitely does. The cluster.json that worked is below.
{
  "name": "smallBaxterPool",
  "vmSize": "Standard_D4_V3",
  "maxTasksPerNode": 1,
  "poolSize": {
    "dedicatedNodes": { "min": 1, "max": 1 },
    "lowPriorityNodes": { "min": 0, "max": 10 },
    "autoscaleFormula": "QUEUE"
  },
  "rPackages": {
    "cran": ["TTR", "forecast", "seasonal", "forecastHybrid", "nnet"],
    "github": [],
    "githubAuthenticationToken": ""
  },
  "commandLine": []
}
Every time I included padr, whether under cran or github, the cluster creation failed.
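To isolate the package, a minimal test pool can help (a sketch: the pool name here is made up, and the field values are reused from the configs above). If a one-node pool whose only package is padr also fails to start, while the same pool with an empty "github" list comes up cleanly, the padr install step on the node is almost certainly the problem:

```json
{
  "name": "padrOnlyTest",
  "vmSize": "Standard_D2_v3",
  "maxTasksPerNode": 1,
  "poolSize": {
    "dedicatedNodes": { "min": 1, "max": 1 },
    "lowPriorityNodes": { "min": 0, "max": 0 },
    "autoscaleFormula": "QUEUE"
  },
  "rPackages": {
    "cran": [],
    "github": ["EdwinTh/padr"],
    "githubAuthenticationToken": ""
  },
  "commandLine": []
}
```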
Hi @MilesAheadAlso, two things w.r.t. your tests. First, I tested this on our latest release and I can't seem to reproduce the error (my cluster comes up). That said, our latest release has a major difference from the previous one: we now run the latest CRAN R by default rather than Microsoft R Open 3.3.2. Below are the config file I used and the R script to test it.
Note - I'm using Standard_F2 because they are pretty cheap, though they have low RAM; any VM size should behave the same, with the exception of the 'S' variants.
Note - In order to use this, you will need to reinstall the latest version of doAzureParallel (0.6.0) and its dependencies.
{
"name": "padr",
"vmSize": "Standard_F2",
"maxTasksPerNode": 1,
"poolSize": {
"dedicatedNodes": {
"min": 0,
"max": 0
},
"lowPriorityNodes": {
"min": 2,
"max": 2
},
"autoscaleFormula": "QUEUE"
},
"rPackages": {
"cran": ["TTR",
"forecast",
"seasonal",
"dplyr",
"forecastHybrid",
"nnet",
"foreach",
"doParallel"],
"github": ["EdwinTh/padr"],
"githubAuthenticationToken": ""
},
"commandLine": []
}
Here is the sample script I ran, including the sample from the padr readme:
res <-
foreach::foreach(i = 1:2) %dopar% {
library(padr)
library(tidyverse)
coffee <- data.frame(
time_stamp = as.POSIXct(c(
'2016-07-07 09:11:21', '2016-07-07 09:46:48',
'2016-07-09 13:25:17',
'2016-07-10 10:45:11'
)),
amount = c(3.14, 2.98, 4.11, 3.14)
)
coffee %>%
thicken('day') %>%
dplyr::group_by(time_stamp_day) %>%
dplyr::summarise(day_amount = sum(amount)) %>%
pad() %>%
fill_by_value(day_amount, value = 0)
}
res
And these are the results
[[1]]
time_stamp_day day_amount
1 2016-07-07 6.12
2 2016-07-08 0.00
3 2016-07-09 4.11
4 2016-07-10 3.14
[[2]]
time_stamp_day day_amount
1 2016-07-07 6.12
2 2016-07-08 0.00
3 2016-07-09 4.11
4 2016-07-10 3.14
That said, I understand this may be a pretty unsatisfactory answer to your original question. If you would like to debug further, we would need some of the logs off of one of your cluster nodes. Please follow the troubleshooting steps (https://github.com/Azure/doAzureParallel/blob/master/docs/40-troubleshooting.md) to get the log files. If there is no immediately obvious cause, please send them our way and we can help take a look.
Thanks, -Pablo
Thx Pablo. So I should just reinstall the latest version of doAzureParallel?
Yes, that is correct. Something like this should do the trick:
library(devtools)
devtools::install_github('azure/doAzureParallel', force = TRUE, ref = 'v0.6.0')
library(doAzureParallel)
Closing this issue assuming that the sample above has addressed the issue. Feel free to re-open if that is not the case.
Hi, I face this issue too, but intermittently. I am not sure if it is region specific: when I try the same thing on my "South India" Azure account, it works just fine.
I am trying to do best-fit statistical forecasting, which requires me to load several packages, some of which are not yet available in the version of R loaded on the VM. I've tried loading these from GitHub, which does not seem to work. The only feedback I get is that the VMs failed to start, so I'm guessing.
I'm not sure whether I need to load devtools in cluster.json and then use install_github, but that would be very costly because it would need to run in each foreach loop.
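If install_github does turn out to be necessary, the cost need not be paid on every iteration: each task can guard the install so it only runs when the package is missing on that worker, and subsequent tasks landing on the same node skip it. A hedged sketch (plain doParallel is used here so it runs locally; the install line is commented out and the doubling is a stand-in for the real forecasting work):

```r
library(foreach)
library(doParallel)

# Local two-worker backend standing in for the Azure pool.
cl <- parallel::makeCluster(2)
doParallel::registerDoParallel(cl)

results <- foreach(i = 1:4, .combine = c) %dopar% {
  # Install padr only if it is missing on this worker; once a node has
  # it, requireNamespace() is nearly free on later tasks.
  if (!requireNamespace("padr", quietly = TRUE)) {
    # devtools::install_github("EdwinTh/padr")  # uncomment on the cluster
  }
  i * 2  # stand-in for the real per-item work
}
parallel::stopCluster(cl)
results
```

foreach preserves iteration order with `.combine = c`, so `results` comes back as 2 4 6 8 regardless of which worker ran which task.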
This is the cluster.json I am using:
{
  "name": "smallPool",
  "vmSize": "Standard_D2s_v3",
  "maxTasksPerNode": 1,
  "poolSize": {
    "dedicatedNodes": { "min": 10, "max": 10 },
    "lowPriorityNodes": { "min": 0, "max": 10 },
    "autoscaleFormula": "QUEUE"
  },
  "rPackages": {
    "cran": ["TTR", "forecast", "seasonal", "dplyr", "forecastHybrid", "nnet", "foreach", "doParallel"],
    "github": ["EdwinTh/padr"],
    "githubAuthenticationToken": ""
  },
  "commandLine": []
}