Closed mooibroekd closed 2 weeks ago
Hi @mooibroekd sorry I only came upon this after replying to your query at https://github.com/shikokuchuo/mirai/issues/146.
I've looked over the code here and it seems fine. I strongly suspect you might be using an older version of the package which doesn't yet use mirai; from the commit history, it seems the switch was made only last month.
You can try grabbing the latest from GitHub using something like pak::pak("davidcarslaw/deweather").
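A minimal sketch of how one might check which version is installed and whether it already declares mirai as a dependency. The helper `declares_dependency()` is hypothetical (not part of deweather or mirai), and it assumes the dependency would be listed under Depends or Imports in the package DESCRIPTION.

```r
# Hypothetical helper: does the installed package `pkg` list `dep`
# in its Depends or Imports field? Returns NA if `pkg` is not installed.
declares_dependency <- function(pkg, dep) {
  if (!nzchar(system.file(package = pkg))) return(NA)
  fields <- utils::packageDescription(pkg, fields = c("Depends", "Imports"))
  txt <- paste(unlist(unclass(fields)), collapse = ",")
  # Drop version constraints such as "(>= 1.0.0)" before splitting
  txt <- gsub("\\([^)]*\\)", "", txt)
  deps <- trimws(strsplit(txt, ",")[[1]])
  dep %in% deps
}

if (nzchar(system.file(package = "deweather"))) {
  print(utils::packageVersion("deweather"))
  print(declares_dependency("deweather", "mirai"))
}
```

If the second value is FALSE, the installed copy probably predates the switch, and reinstalling from GitHub as suggested above would be the next step.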
@shikokuchuo Thanks for responding. I managed to pin down the conditions under which this error happens.
It happens when I change the cv.folds parameter from its default setting of 0 to anything above 1 (i.e. 0 and 1 work).
library(deweather)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(openair)
dat_part <- selectByDate(road_data, year = 2001:2004)
# to speed up the example, sample rows in `dat_part`
# not needed in reality
dat_part <- dplyr::slice_sample(dat_part, prop = 1/10)
mod_no2 <- buildMod(
dat_part,
vars = c("trend", "ws", "wd", "hour", "weekday", "air_temp", "week"),
pollutant = "no2",
n.trees = 1000,
n.core = 4,
cv.folds = 5
)
#> Error: Error in serverSocket(port = port): creation of server socket failed: port 11257 cannot be opened
Created on 2024-09-07 with reprex v2.1.1
Seems there is some leftover code after the change to mirai.
Reprex to show that a value of 0 is working.
library(deweather)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(openair)
dat_part <- selectByDate(road_data, year = 2001:2004)
# to speed up the example, sample rows in `dat_part`
# not needed in reality
dat_part <- dplyr::slice_sample(dat_part, prop = 1/10)
mod_no2 <- buildMod(
dat_part,
vars = c("trend", "ws", "wd", "hour", "weekday", "air_temp", "week"),
pollutant = "no2",
n.trees = 1000,
n.core = 4,
cv.folds = 0
)
glimpse(mod_no2)
#> List of 4
#> $ model :List of 27
#> ..$ initF : num 95
#> ..$ fit : num [1:3354] 113.1 121.8 136.1 53.9 119.4 ...
#> ..$ train.error : num [1:1000] 1852 1688 1566 1467 1381 ...
#> ..$ valid.error : num [1:1000] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ...
#> ..$ oobag.improve : num [1:1000] 183.5 147.8 123.9 97.2 87.5 ...
#> ..$ trees :List of 1000
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. ..$ :List of 8
#> .. .. [list output truncated]
#> ..$ c.splits :List of 626
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 1 -1 1 1 1
#> .. ..$ : int [1:7] 1 -1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 -1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 1 -1 -1 1 -1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 -1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] -1 -1 -1 -1 -1 1 -1
#> .. ..$ : int [1:7] 1 -1 -1 -1 1 -1 1
#> .. ..$ : int [1:7] 1 -1 -1 -1 1 1 -1
#> .. ..$ : int [1:7] -1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 1 -1 1 1 1
#> .. ..$ : int [1:7] -1 -1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 -1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 -1 1 -1 -1 1 -1
#> .. ..$ : int [1:7] 1 1 1 -1 1 1 1
#> .. ..$ : int [1:7] 1 -1 -1 -1 1 1 -1
#> .. ..$ : int [1:7] -1 -1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 -1 1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 1 -1 1 1 1
#> .. ..$ : int [1:7] -1 1 1 -1 1 -1 1
#> .. ..$ : int [1:7] 1 -1 -1 1 1 -1 1
#> .. ..$ : int [1:7] 1 1 1 -1 1 1 -1
#> .. ..$ : int [1:7] 1 1 -1 -1 -1 1 -1
#> .. ..$ : int [1:7] 1 -1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 -1
#> .. ..$ : int [1:7] -1 1 -1 1 -1 1 -1
#> .. ..$ : int [1:7] -1 1 -1 -1 1 -1 -1
#> .. ..$ : int [1:7] -1 -1 -1 -1 1 -1 -1
#> .. ..$ : int [1:7] -1 -1 -1 1 1 -1 1
#> .. ..$ : int [1:7] -1 -1 -1 -1 1 -1 1
#> .. ..$ : int [1:7] -1 -1 -1 -1 -1 1 -1
#> .. ..$ : int [1:7] 1 1 1 -1 1 1 1
#> .. ..$ : int [1:7] -1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 -1 1 1 -1 -1 -1
#> .. ..$ : int [1:7] 1 -1 1 1 1 -1 1
#> .. ..$ : int [1:7] -1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] -1 -1 -1 -1 1 -1 -1
#> .. ..$ : int [1:7] -1 -1 -1 -1 -1 1 1
#> .. ..$ : int [1:7] -1 1 -1 1 -1 1 -1
#> .. ..$ : int [1:7] 1 -1 1 1 -1 1 1
#> .. ..$ : int [1:7] 1 -1 1 1 1 -1 1
#> .. ..$ : int [1:7] -1 -1 1 1 -1 1 -1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 -1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 -1 -1
#> .. ..$ : int [1:7] -1 -1 -1 -1 1 1 -1
#> .. ..$ : int [1:7] 1 -1 -1 1 1 1 1
#> .. ..$ : int [1:7] -1 -1 1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 -1 1 -1
#> .. ..$ : int [1:7] -1 1 1 -1 -1 -1 -1
#> .. ..$ : int [1:7] 1 -1 -1 1 1 1 -1
#> .. ..$ : int [1:7] -1 -1 -1 -1 1 -1 -1
#> .. ..$ : int [1:7] 1 -1 1 -1 -1 1 -1
#> .. ..$ : int [1:7] 1 1 1 1 -1 -1 -1
#> .. ..$ : int [1:7] -1 1 -1 -1 -1 1 -1
#> .. ..$ : int [1:7] 1 -1 1 1 1 -1 -1
#> .. ..$ : int [1:7] -1 -1 -1 -1 1 1 -1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 1 1
#> .. ..$ : int [1:7] 1 1 1 1 -1 1 1
#> .. ..$ : int [1:7] 1 1 -1 1 -1 -1 1
#> .. ..$ : int [1:7] -1 -1 1 -1 1 1 1
#> .. ..$ : int [1:7] 1 -1 -1 1 -1 -1 -1
#> .. ..$ : int [1:7] 1 1 1 1 -1 -1 -1
#> .. ..$ : int [1:7] 1 -1 1 -1 1 -1 1
#> .. ..$ : int [1:7] -1 -1 -1 -1 -1 1 1
#> .. ..$ : int [1:7] -1 -1 -1 1 -1 1 1
#> .. ..$ : int [1:7] 1 1 -1 -1 1 -1 -1
#> .. ..$ : int [1:7] -1 1 -1 1 1 -1 -1
#> .. ..$ : int [1:7] -1 -1 -1 -1 1 -1 1
#> .. ..$ : int [1:7] 1 1 1 -1 1 -1 -1
#> .. ..$ : int [1:7] -1 -1 1 1 1 -1 1
#> .. ..$ : int [1:7] 1 1 -1 1 -1 1 1
#> .. ..$ : int [1:7] -1 1 1 1 -1 1 1
#> .. ..$ : int [1:7] 1 1 1 1 1 1 -1
#> .. ..$ : int [1:7] 1 -1 -1 -1 -1 -1 -1
#> .. ..$ : int [1:7] -1 -1 1 -1 1 -1 -1
#> .. ..$ : int [1:7] -1 1 -1 -1 -1 -1 -1
#> .. ..$ : int [1:7] 1 -1 -1 -1 -1 -1 -1
#> .. ..$ : int [1:7] -1 -1 -1 -1 1 1 -1
#> .. ..$ : int [1:7] 1 1 1 -1 -1 -1 1
#> .. ..$ : int [1:7] 1 1 -1 1 1 1 1
#> .. ..$ : int [1:7] -1 -1 1 1 -1 -1 -1
#> .. ..$ : int [1:7] 1 1 1 1 -1 1 1
#> .. ..$ : int [1:7] 1 1 1 -1 -1 -1 1
#> .. ..$ : int [1:7] -1 -1 -1 -1 -1 1 -1
#> .. ..$ : int [1:7] 1 -1 1 -1 -1 1 1
#> .. .. [list output truncated]
#> ..$ bag.fraction : num 0.5
#> ..$ distribution :List of 1
#> .. ..$ name: chr "gaussian"
#> ..$ interaction.depth: num 5
#> ..$ n.minobsinnode : num 10
#> ..$ num.classes : num 1
#> ..$ n.trees : num 1000
#> ..$ nTrain : num 3354
#> ..$ train.fraction : num 1
#> ..$ response.name : chr "no2"
#> ..$ shrinkage : num 0.1
#> ..$ var.levels :List of 7
#> .. ..$ : Named num [1:11] 9.78e+08 9.90e+08 1.00e+09 1.02e+09 1.03e+09 ...
#> .. .. ..- attr(*, "names")= chr [1:11] "0%" "10%" "20%" "30%" ...
#> .. ..$ : Named num [1:11] 0.333 1.367 2.1 2.767 3.267 ...
#> .. .. ..- attr(*, "names")= chr [1:11] "0%" "10%" "20%" "30%" ...
#> .. ..$ : Named num [1:11] 0.217 40 86.841 152.442 190 ...
#> .. .. ..- attr(*, "names")= chr [1:11] "0%" "10%" "20%" "30%" ...
#> .. ..$ : Named num [1:11] 0 2 4 7 9 12 14 17 19 21 ...
#> .. .. ..- attr(*, "names")= chr [1:11] "0%" "10%" "20%" "30%" ...
#> .. ..$ : chr [1:7] "Friday" "Monday" "Saturday" "Sunday" ...
#> .. ..$ : Named num [1:11] -4.35 3.95 6.35 8.3 9.9 ...
#> .. .. ..- attr(*, "names")= chr [1:11] "0%" "10%" "20%" "30%" ...
#> .. ..$ : Named num [1:11] 0 5 10 15 20 ...
#> .. .. ..- attr(*, "names")= chr [1:11] "0%" "10%" "20%" "30%" ...
#> ..$ var.monotone : num [1:7] 0 0 0 0 0 0 0
#> ..$ var.names : chr [1:7] "trend" "ws" "wd" "hour" ...
#> ..$ var.type : num [1:7] 0 0 0 0 7 0 0
#> ..$ verbose : logi FALSE
#> ..$ data :List of 6
#> .. ..$ y : Named num [1:3354] 130 122 94 55 115 86 73 109 59 46 ...
#> .. .. ..- attr(*, "names")= chr [1:3354] "1" "2" "3" "4" ...
#> .. ..$ x : num [1:23478] 1.03e+09 1.04e+09 1.07e+09 1.03e+09 1.07e+09 ...
#> .. ..$ x.order: num [1:3354, 1:7] 1046 1633 3300 3127 1161 ...
#> .. .. ..- attr(*, "dimnames")=List of 2
#> .. ..$ offset : logi NA
#> .. ..$ Misc : logi NA
#> .. ..$ w : num [1:3354] 1 1 1 1 1 1 1 1 1 1 ...
#> ..$ Terms :Classes 'terms', 'formula' language no2 ~ trend + ws + wd + hour + weekday + air_temp + week
#> .. .. ..- attr(*, "variables")= language list(no2, trend, ws, wd, hour, weekday, air_temp, week)
#> .. .. ..- attr(*, "factors")= int [1:8, 1:7] 0 1 0 0 0 0 0 0 0 0 ...
#> .. .. .. ..- attr(*, "dimnames")=List of 2
#> .. .. ..- attr(*, "term.labels")= chr [1:7] "trend" "ws" "wd" "hour" ...
#> .. .. ..- attr(*, "order")= int [1:7] 1 1 1 1 1 1 1
#> .. .. ..- attr(*, "intercept")= int 1
#> .. .. ..- attr(*, "response")= int 1
#> .. .. ..- attr(*, ".Environment")=<environment: 0x0000024762e4f320>
#> .. .. ..- attr(*, "predvars")= language list(no2, trend, ws, wd, hour, weekday, air_temp, week)
#> .. .. ..- attr(*, "dataClasses")= Named chr [1:8] "numeric" "numeric" "numeric" "numeric" ...
#> .. .. .. ..- attr(*, "names")= chr [1:8] "no2" "trend" "ws" "wd" ...
#> ..$ cv.folds : num 0
#> ..$ call : language gbm::gbm(formula = eq, distribution = "gaussian", data = dat, n.trees = n.trees, interaction.depth = interac| __truncated__ ...
#> ..$ m : language model.frame(formula = eq, data = dat, drop.unused.levels = TRUE, na.action = function (object, ...) ...
#> ..- attr(*, "class")= chr "gbm"
#> $ influence: tibble [7 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ var : Factor w/ 7 levels "week","ws","air_temp",..: 7 6 5 4 3 2 1
#> .. ..- attr(*, "scores")= num [1:7(1d)] 5.16 7.15 7.44 10.27 16.66 ...
#> .. .. ..- attr(*, "dimnames")=List of 1
#> ..$ mean : num [1:7] 33.98 19.35 16.66 10.27 7.44 ...
#> ..$ lower: Named num [1:7] 33.61 18.77 16.62 10.03 7.39 ...
#> .. ..- attr(*, "names")= chr [1:7] "2.5%" "2.5%" "2.5%" "2.5%" ...
#> ..$ upper: Named num [1:7] 34 19.39 17.06 10.29 8.47 ...
#> .. ..- attr(*, "names")= chr [1:7] "97.5%" "97.5%" "97.5%" "97.5%" ...
#> $ data : tibble [3,354 × 9] (S3: tbl_df/tbl/data.frame)
#> ..$ date : POSIXct[1:3354], format: "2002-07-10 16:00:00" "2002-11-22 22:00:00" ...
#> ..$ trend : num [1:3354] 1.03e+09 1.04e+09 1.07e+09 1.03e+09 1.07e+09 ...
#> ..$ ws : num [1:3354] 7.03 4.43 4.1 4.97 8.03 ...
#> ..$ wd : num [1:3354] 216.6 176.9 83.7 21.4 80 ...
#> ..$ hour : int [1:3354] 16 22 17 8 12 1 23 23 10 3 ...
#> ..$ weekday : Factor w/ 7 levels "Friday","Monday",..: 7 1 2 6 5 7 2 1 6 2 ...
#> ..$ air_temp: num [1:3354] 19 8.93 10.07 12.93 15.37 ...
#> ..$ week : num [1:3354] 27 46 48 38 41 15 11 30 31 34 ...
#> ..$ no2 : num [1:3354] 130 122 94 55 115 86 73 109 59 46 ...
#> ..- attr(*, "na.action")= 'omit' Named int [1:152] 1 28 72 78 103 138 168 179 181 196 ...
#> .. ..- attr(*, "names")= chr [1:152] "1" "28" "72" "78" ...
#> $ pd : tibble [7 × 3] (S3: tbl_df/tbl/data.frame)
#> ..$ var : chr [1:7] "air_temp" "hour" "trend" "wd" ...
#> ..$ numeric :List of 7
#> ..$ character:List of 7
#> - attr(*, "class")= chr "deweather"
Created on 2024-09-07 with reprex v2.1.1
Right. I'll leave it to @davidcarslaw to comment further but from what I can see, it's because gbm::gbm() uses parallel to parallelize over the cross validation folds. The error comes from there, and I'm not sure how the ports are determined or why they should fail on occasion.
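For context, the help page for parallel::makePSOCKcluster() documents that the port defaults to the environment variable R_PARALLEL_PORT, and otherwise to a randomly chosen port in the range 11000:11999, which is consistent with the 11257 in the error above. The sketch below probes whether a port can be opened before handing it to a cluster. The helpers `port_is_free()`, `find_free_port()` and `make_cluster_on_free_port()` are hypothetical names, and the sketch assumes R >= 4.0.0, where serverSocket() is available.

```r
# Hypothetical helper: TRUE if a listening socket can be opened on `port`.
port_is_free <- function(port) {
  sock <- tryCatch(serverSocket(port), error = function(e) NULL)
  if (is.null(sock)) return(FALSE)
  close(sock)
  TRUE
}

# Hypothetical helper: first port in `candidates` (tried in random order)
# that can currently be opened, or NA if none can.
find_free_port <- function(candidates = 11000:11999) {
  for (port in sample(candidates)) {
    if (port_is_free(port)) return(port)
  }
  NA_integer_
}

# Hypothetical helper: start a PSOCK cluster on a port that was free
# when probed. Another process could still grab it in between.
make_cluster_on_free_port <- function(n) {
  port <- find_free_port()
  if (is.na(port)) stop("no free port found in the candidate range")
  parallel::makeCluster(n, port = port)
}
```

Usage would look like `cl <- make_cluster_on_free_port(2)` followed by `parallel::stopCluster(cl)`. This does not change what gbm does internally; it only illustrates the kind of failure the error message describes.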
Thank you for raising this issue!
I could believe there's something unusual going on with gbm::gbm() trying to use {parallel} from within a {mirai} daemon, but curiously when I set n.cores = 1 on gbm::gbm() (which is meant to disable that behaviour when cv.folds > 1) I still get that socket error.
{gbm} (and the non-CRAN {gbm3}) seem to be stalled in development. I'll discuss with @davidcarslaw. It may be that we, sadly, revert to using {parallel} as it was working fine before (as much as I like {mirai}!)
@jack-davison @davidcarslaw I've included a fix in #26. It just sets the number of cores that gbm uses to 1 so that it doesn't try to spin up new nested child processes which fail (as they try to use the same port).
This should be the desired behaviour so that the number of cores set by this package is respected.
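As a rough illustration of the idea (not the actual code in #26), this is what passing n.cores = 1 to gbm::gbm() with cross-validation looks like on synthetic data. The data frame and variable names are made up for the example. An earlier comment in this thread reports that the socket error persisted inside deweather even with n.cores = 1, so this shows only the argument, not a confirmed fix.

```r
if (requireNamespace("gbm", quietly = TRUE)) {
  set.seed(1)
  # Synthetic data, purely for illustration
  dat <- data.frame(x1 = runif(500), x2 = runif(500))
  dat$y <- 2 * dat$x1 - dat$x2 + rnorm(500, sd = 0.1)

  fit <- gbm::gbm(
    y ~ x1 + x2,
    distribution = "gaussian",
    data = dat,
    n.trees = 100,
    cv.folds = 5,  # cross-validation, the setting that triggered the error
    n.cores = 1    # ask gbm to use a single core for the folds
  )
}
```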
Thanks for the discussions on this Denis & Charlie; I think there's something internal in {gbm} that's causing issues even if its parallelisation is turned off, so I'm closing this for now as of #27, but we'll circle back to {mirai} later down the line.
When running buildMod, occasionally the following error pops up:
Port xxxx is a port number that seems to vary.
I suspect that this has to do with the creation of parallel clusters for doing the calculations and this might not even be related to deweather itself. It could be related to the underlying packages needed for calculations. It seems to be tied to parallel::makeCluster(no_clusters) from what I have found on SO. I cannot find any references to parallel, so perhaps it is tied to the mirai::daemons() call?
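One way to check whether the mirai::daemons() call on its own is implicated is to start daemons outside deweather and run a trivial task. This is a sketch under the assumption that mirai is installed; the number of daemons and the task are arbitrary.

```r
if (requireNamespace("mirai", quietly = TRUE)) {
  mirai::daemons(2)                  # start two local daemons
  m <- mirai::mirai(Sys.getpid())    # trivial task evaluated on a daemon
  mirai::call_mirai(m)               # wait for the result
  worker_pid <- m$data
  mirai::daemons(0)                  # shut the daemons down again
  print(worker_pid)
}
```

If this runs cleanly while buildMod with cv.folds above 1 still fails, the problem is more likely in what runs inside the daemons than in starting them.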