paobranco / UBL

An R package for utility-based learning

Error: Can not compute Euclidean distance with nominal attributes #10

Open BobMuenchen opened 4 years ago

BobMuenchen commented 4 years ago

Thanks for all your work on this useful package! I was surprised to see that Euclidean distance could not be used on a formula that contained only numeric variables. The function seems to care if the dataset contains factors, even when they're not used in the formula. That may be as designed, so I'm just reporting this in case you view it as an error.

library("UBL")
Loading required package: MBA
Loading required package: gstat
Registered S3 method overwritten by 'xts':
  method     from
  as.zoo.xts zoo
Loading required package: automap
Loading required package: sp
Loading required package: randomForest
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
library(MASS)
data(cats)
head(cats)
  Sex Bwt Hwt
1   F 2.0 7.0
2   F 2.0 7.4
3   F 2.0 9.5
4   F 2.1 7.2
5   F 2.1 7.3
6   F 2.1 7.6
length(cats$Sex)
[1] 144

I'm adding a factor for color:

cats$color <- gl(n = 2, k = 1, length = 144, label = c("black", "white"))
head(cats)
  Sex Bwt Hwt color
1   F 2.0 7.0 black
2   F 2.0 7.4 white
3   F 2.0 9.5 black
4   F 2.1 7.2 white
5   F 2.1 7.3 black
6   F 2.1 7.6 white

The formula does not use color, but the call yields an error message anyway:

mysmote.cats <- SmoteClassif(Sex ~ Bwt + Hwt, cats, list(M = 0.8, F = 1.8))
Error in neighbours(tgt, dat, dist, p, k) :
  Can not compute Euclidean distance with nominal attributes!

Switching the distance to HEOM avoids the error:

mysmote.cats <- SmoteClassif(Sex ~ Bwt + Hwt, cats, list(M = 0.8, F = 1.8), dist = "HEOM")
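
Another possible workaround, if Euclidean distance is specifically wanted, is to pass only the columns named in the formula. This is a minimal sketch that assumes the check inspects every column of the data frame rather than only the formula's variables, which the error above suggests but which has not been confirmed against the UBL source. The names `form`, `cats_used`, `predictors`, and `all_numeric` are introduced here for illustration only.

```r
# Sketch: keep only the columns named in the formula before resampling.
library(MASS)  # provides the cats data set
data(cats)

# Recreate the unused factor column from the report.
cats$color <- gl(n = 2, k = 1, length = nrow(cats), labels = c("black", "white"))

form <- Sex ~ Bwt + Hwt

# all.vars() returns the variable names used in the formula, response first.
cats_used <- cats[, all.vars(form), drop = FALSE]

# Predictors are every retained column except the response.
predictors <- setdiff(names(cats_used), all.vars(form)[1])
all_numeric <- all(vapply(cats_used[predictors], is.numeric, logical(1)))

# Assumption: with only numeric predictors present, the default Euclidean
# distance is accepted. Guarded so the sketch also runs without UBL installed.
if (all_numeric && requireNamespace("UBL", quietly = TRUE)) {
  mysmote.cats <- UBL::SmoteClassif(form, cats_used, list(M = 0.8, F = 1.8))
}
```

Subsetting with `all.vars()` keeps the original `cats` data frame untouched, so color remains available for later analysis.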

sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] UBL_0.0.6           randomForest_4.6-14 automap_1.0-14      sp_1.3-2            gstat_2.0-4
[6] MBA_0.0-9           MASS_7.3-51.4       devtools_2.2.1      usethis_1.5.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3        plyr_1.8.5        compiler_3.6.2    prettyunits_1.1.1 remotes_2.1.0     tools_3.6.2
 [7] xts_0.11-2        testthat_2.3.1    digest_0.6.23     pkgbuild_1.0.6    pkgload_1.0.2     memoise_1.1.0
[13] lattice_0.20-38   rlang_0.4.4       cli_2.0.1.9000    rstudioapi_0.10   curl_4.3          withr_2.1.2
[19] desc_1.2.0        fs_1.3.1          rprojroot_1.3-2   grid_3.6.2        reshape_0.8.8     spacetime_1.2-2
[25] glue_1.3.1        R6_2.4.1          processx_3.4.1    fansi_0.4.1       sessioninfo_1.1.1 callr_3.4.0
[31] magrittr_1.5      intervals_0.15.1  backports_1.1.5   ps_1.3.0          ellipsis_0.3.0    assertthat_0.2.1
[37] FNN_1.1.3         crayon_1.3.4      zoo_1.8-7