astamm / nevada

NEtwork-VAlued Data Analysis
https://astamm.github.io/nevada/
GNU General Public License v3.0
6 stars 2 forks source link

Make parallel test2_global, local and power #11

Open astamm opened 2 years ago

danielemule commented 2 years ago

I have parallelized power2() via futures (future.apply and progressr packages). Moreover, I have parallelized the distance matrix computation in dist_nvd() and, consequently, in test2_global() via futures (only for the "match-frobenius" distance).

You can find my version of these functions in my graphmatch branch of nevada (link).

astamm commented 2 years ago

I took a look at what you did. You learned the futureverse quite well but you have written code as an end-user would, not as a package developer. You should not have any call to future::plan() inside the code of the package. Furthermore, you chose to use the future.apply package which parallelises replicate() which is not optimal. Using furrr you parallelise purrr::map() which is already a speed improvement of replicate().

You should create a new branch on your fork specific to this issue and initiate a pull request. That way, I will be able to take over. Thanks for starting this!

danielemule commented 2 years ago

OK, thanks for your advice. As I understand it, it is not necessary to put any function argument related to parallelization tasks and so any call to future::plan() should be specified by the end-user before calling nevada functions, shouldn't it?

astamm commented 2 years ago

Correct. All you need to do in the code is calling furrr::future_map() or future.apply::replicate() instead of lapply() or replicate() and let the end-user decide whether (s)he wants to turn on parallelisation.