rmarko / CORElearn

R package CORElearn

attrEval not using all cores when estimator = "RReliefFexpRank" #1

Open kianBlanchette opened 4 months ago

kianBlanchette commented 4 months ago

The issue is as described in the title and an example is given below. I'm not sure if this is expected due to algorithmic differences between the estimators, but it would be good to know.

library(data.table)
library(CORElearn)
library(microbenchmark)

n <- 2^12
dt1 <- data.table(y = sample(c(0,1), n, replace = TRUE), x = rnorm(n))
dt2 <- copy(dt1)
dt2[, y := as.factor(y)]

attrEval(y ~ x, data = dt1, estimator="RReliefFexpRank")
attrEval(y ~ x, data = dt2, estimator="ReliefFexpRank")

microbenchmark(attrEval(y ~ x, data = dt1, estimator="RReliefFexpRank"),
               attrEval(y ~ x, data = dt2, estimator="ReliefFexpRank"),
               times = 10)

When this is run (both on a virtual desktop with few cores and a GPU with 40 cores) the task manager is showing only ~3% CPU use for RReliefFexpRank and between 70 and 99% for ReliefFexpRank. The benchmark results have a noticeable difference as well:

Unit: milliseconds
                                                       expr      min       lq     mean   median       uq      max neval
 attrEval(y ~ x, data = dt1, estimator = "RReliefFexpRank") 675.2150 676.4049 677.3993 677.3904 678.7969 678.8873    10
  attrEval(y ~ x, data = dt2, estimator = "ReliefFexpRank") 169.4725 178.2430 187.5567 186.6181 191.1521 216.4635    10
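A check that does not depend on watching the task manager is to compare CPU time with wall-clock time for each call. The sketch below uses only base R; `cpu_ratio` is a hypothetical helper name, not part of CORElearn. It assumes that the CPU time reported by `system.time()` is summed over all threads of the R process, so a ratio near 1 points to a single busy thread and a ratio well above 1 points to several threads working at once.

```r
# Hypothetical helper (not part of CORElearn): ratio of CPU time to
# wall-clock time for an expression, measured with base R's system.time().
cpu_ratio <- function(expr) {
  # expr is a lazy argument, so it is evaluated inside system.time()
  t <- system.time(expr)
  # user + system CPU time of this R process (all of its threads)
  cpu <- sum(t[c("user.self", "sys.self")])
  # elapsed can be 0 for very fast expressions, giving NaN or Inf
  unname(cpu / t[["elapsed"]])
}

# Usage with the objects defined above (requires CORElearn):
# cpu_ratio(attrEval(y ~ x, data = dt1, estimator = "RReliefFexpRank"))
# cpu_ratio(attrEval(y ~ x, data = dt2, estimator = "ReliefFexpRank"))
```

The ratio is only a rough indicator: very short calls give unstable values, so it is more informative on a larger `n` or averaged over repeated runs.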

rmarko commented 4 months ago

Dear Kian,

RReliefF does not support multithreading, because there has been relatively little interest in this estimator.

Best regards, Marko


Prof Marko Robnik-Sikonja, Ph.D.
Head of Laboratory for Cognitive Modeling
University of Ljubljana, Faculty of Computer and Information Science
web: https://fri.uni-lj.si/en/employees/marko-robnik-sikonja


kianBlanchette commented 4 months ago

Would it be too much trouble to implement multithreading for RReliefF? If I'm the only person asking for it, I understand that it might not be worth it for you, but it would be really helpful.