mlr-org / ParamHelpers

Helpers for parameters in black-box optimization, tuning and machine learning.
https://paramhelpers.mlr-org.com
Other
25 stars 9 forks source link

Reverse Dependency checks fail with new version of the lhs package #210

Closed bertcarnell closed 5 years ago

bertcarnell commented 5 years ago

I am developing a new version of the lhs package at https://github.com/bertcarnell/lhs which I will submit to CRAN soon. ParamHelpers imports lhs, so I was checking your package to ensure it wasn't broken. The tests in ParamHelpers 1.12 fail on R CMD check with the new lhs package. The following comes from the output of testthat in R CMD check

> library(testthat)
> library(BBmisc)

Attaching package: 'BBmisc'

The following object is masked from 'package:base':

       isFALSE

> library(ParamHelpers)
>
> test_check("ParamHelpers")

 *** caught segfault ***
address 0x340000013, cause 'memory not mapped'

Traceback:
1: generateDesign(50, par.set = ps)

I was able to create two reproducible examples with lhs 1.0. The problem appears to be in the generateDesign .Call. Something about what randomLHS is returning may have changed, but I'm not sure what. The fault does not occur in randomLHS.

require(ParamHelpers)
require(lhs)

# > sessionInfo()
# R version 3.5.1 (2018-07-02)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows >= 8 x64 (build 9200)
#
# Matrix products: default
#
# locale:
#   [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
# [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
# [5] LC_TIME=English_United States.1252
#
# attached base packages:
#   [1] stats     graphics  grDevices utils     datasets  methods   base
#
# other attached packages:
#   [1] lhs_1.0           ParamHelpers_1.12
#
# loaded via a namespace (and not attached):
#   [1] compiler_3.5.1  backports_1.1.3 tools_3.5.1     BBmisc_1.11     yaml_2.2.0      Rcpp_1.0.0
# [7] fastmatch_1.1-0 checkmate_1.9.1

# reproducible example 1
ps = makeParamSet(
  makeDiscreteParam("x", values = c("a", "b")),
  makeNumericParam("y", lower = 1, upper = 2, requires = quote(x == "a")),
  makeDiscreteParam("z", values = 1:2, requires = quote(x == "b"))
)
# Segfault occurs in generateDesign
des = generateDesign(50, par.set = ps)

# reproducible example 2
newdes = matrix(1, nrow = 1, ncol = 3)
newres = data.frame(V1 = "", V2 = 0, V3 = "", stringsAsFactors = FALSE)
types.int = as.integer(c(3,1,3))
lower2 = c(x = NA, y = 1, z = NA)
upper2 = c(x = NA, y = 2, z = NA)
values = list(`x` = c("a","b"),
              y = NULL,
              z = c("1","2"))

dyn.load(system.file("libs/x64/ParamHelpers.dll", package = "ParamHelpers"))
# segfault occurs in .Call
.Call("c_generateDesign", newdes, newres, types.int,
      lower2, upper2, values)

# This is the relevant code inside generateDesign

for (iter in seq_len(augment)) {
  newdes = if (nmissing == n)
    do.call(fun, insert(list(n = nmissing, k = k), fun.args))
  else lhs::randomLHS(nmissing, k = k)
  newres = makeDataFrame(nmissing, k, col.types = types.df)
  newres = .Call(c_generateDesign, newdes, newres, types.int,
                 lower2, upper2, values)

# This is my debugging session that led me to the reproducible example 2

Browse[2]>
  debug: newres = .Call(c_generateDesign, newdes, newres, types.int, lower2,
                        upper2, values)
Browse[2]> newdes
[,1] [,2] [,3]
[1,]    1    1    1
Browse[2]> newres
V1 V2 V3
1     0
Browse[2]> str(newres)
'data.frame':   1 obs. of  3 variables:
  $ V1: chr ""
$ V2: num 0
$ V3: chr ""
Browse[2]> types.int
[1] 3 1 3
Browse[2]> str(types.int)
int [1:3] 3 1 3
Browse[2]> lower2
x  y  z
NA  1 NA
Browse[2]> str(lower2)
Named num [1:3] NA 1 NA
- attr(*, "names")= chr [1:3] "x" "y" "z"
Browse[2]> upper2
x  y  z
NA  2 NA
Browse[2]> values
$`x`
[1] "a" "b"

$y
NULL

$z
[1] "1" "2"

Browse[2]>

# Segfault occurs here
mb706 commented 5 years ago

This happens because the C code makes the implicit assumption that randomLHS always returns values less than 1.0 (here and here, although the second one wouldn't segfault) , which is a bug.

This is coupled with a bug in the new lhs packet, which always returns a matrix of 1s if n == 1:

> lhs::randomLHS(1, 3)
     [,1] [,2] [,3]
[1,]    1    1    1
bertcarnell commented 5 years ago

@mb706 Thanks! I don't know why I didn't see that last night. I will fix that in randomLHS and check the reverse dependencies again.

bertcarnell commented 5 years ago

This issue is fixed in the lhs project 4cab08d80d812f7f3f74e2c7c34a1b0bf4431c09 for upcoming CRAN version 1.0