altintasali closed this issue 3 years ago
Hello @altintasali,
Thanks for your detailed report and tests. I could reproduce the error on Linux and Windows.
It seems that once the thread number is set in an R session, it stays fixed, but when starting a new R session the number of threads can be set again (in my case I couldn't set it back to a single thread, and multiple threads were used during the session). I'm not sure whether this behavior is caused by a dependency update. Maybe there is another way besides RcppParallel::setThreadOptions that allows changing this value dynamically during a session.
Seems to be the same issue as mentioned here: https://github.com/RcppCore/RcppParallel/issues/110#issuecomment-699622188
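For reference, here is a minimal sketch of requesting a different thread count from within a running session. It assumes RcppParallel is installed; `setThreadOptions()` is exported by RcppParallel, while the wrapper name `set_threads` is hypothetical and used only for illustration. Whether the new value actually takes effect mid-session depends on the installed RcppParallel version, as described above.

```r
# Hypothetical helper: request a thread count via RcppParallel, if available.
set_threads <- function(n) {
  if (!requireNamespace("RcppParallel", quietly = TRUE)) {
    message("RcppParallel is not installed; nothing changed.")
    return(invisible(FALSE))
  }
  # Ask RcppParallel to use n threads for subsequent parallel calls.
  RcppParallel::setThreadOptions(numThreads = n)
  invisible(TRUE)
}

set_threads(1)  # request a single thread
set_threads(8)  # request eight threads later in the same session
```

Timing the same call after each request is one way to check whether the setting was picked up.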
Hi @alexeckert,
Thanks for your input. I have also tried some other approaches, but my attempts were unsuccessful. It seems we need to wait for the next release of RcppParallel. I hope it will be fixed properly.
RcppParallel 5.1.2 was just released to CRAN -- please let me know if you're still having any issues.
@kevinushey Thank you very much for your work. I've run some tests on Windows and Ubuntu, and threads can be adjusted again without creating a new session. 👍
> library(parallelDist)
> sample.matrix <- matrix(c(1:500000), ncol = 10)
> system.time(parDist(x = sample.matrix, method = "euclidean", threads = 1))
user system elapsed
91.619 3.191 94.809
> system.time(parDist(x = sample.matrix, method = "euclidean", threads = 8))
user system elapsed
196.357 1.510 26.506
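A quick way to read these numbers: when several threads are busy, the user CPU time exceeds the elapsed time, so their ratio roughly approximates how many cores were in use. The sketch below only reuses the timings printed above; the column names `cpu_ratio` and `speedup` are chosen here for illustration.

```r
# Timings copied from the system.time() output above (seconds).
timings <- data.frame(
  threads = c(1, 8),
  user    = c(91.619, 196.357),
  elapsed = c(94.809, 26.506)
)

# user / elapsed approximates the average number of busy cores.
timings$cpu_ratio <- timings$user / timings$elapsed

# Wall-clock speedup relative to the single-threaded run.
timings$speedup <- timings$elapsed[timings$threads == 1] / timings$elapsed

print(round(timings, 2))
```

With threads = 8 the ratio comes out at roughly 7.4 and the wall-clock speedup at about 3.6x, which is consistent with multiple threads being used.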
Great news; I'm glad to hear it! Sorry for the trouble in the interim.
Dear @alexeckert,
First of all, thank you so much for this amazing package. I have implemented parallelDist in most of my workflows.
While running parDist on both Ubuntu and MacOS, I noticed that it uses only 1 thread even when I set it to use multiple threads. I therefore ran a quick benchmark of the CPU times.
And here are the outputs from MacOS (2.8 GHz Quad-Core Intel Core i7)
and Ubuntu (Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz)
I have seen that you mentioned in this post that the "Intel TBB lib" is needed for multi-threading. Therefore, I made sure that I have it.
As the results show, there is no run time difference across different thread counts. May I ask for your kind help in guiding me through this issue? Even single-threaded operations are much faster than the other distance functions in R, so I am excited to see how multi-threading can improve these run times.
In case it helps, here is my sessionInfo() output for both MacOS and Ubuntu.
MacOS
Ubuntu