BRANCHlab / metasnf

Scalable subtyping with similarity network fusion
https://branchlab.github.io/metasnf/

parallelization? #3

Closed: apdlbalb closed this issue 10 months ago

apdlbalb commented 10 months ago

Have you considered parallelizing execute_design_matrix with the doParallel package to accommodate large datasets? (Maybe in conjunction with the data.table package, to make this function run faster.)

pvelayudhan commented 10 months ago

This is actually already a feature :)! Just not a well-documented one.

If you install off the main branch, execute_design_matrix has a parameter called "processes", which defaults to 1 (sequential). If you instead pass in processes = "max" (or a number up to, I believe, the number of cores available on your machine), the function will run with parallel processing. This speeds things up fantastically, but you can no longer see the % completion progress in the console.

Relevant line: https://github.com/BRANCHlab/metasnf/blob/9be600fa3923e91c09686b6dbfa51fd73dec0d78/R/execute.R#L15
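
For readers unfamiliar with this kind of switch, here is a minimal sketch of how a "processes"-style argument could work, using only base R's parallel package. It is not metasnf's implementation. The names resolve_processes and run_rows are hypothetical, and capping the worker count at the detected number of cores is an assumption. It also illustrates why per-row progress is easy to print in the sequential case but not once the work is handed to worker processes.

```r
library(parallel)

# Hypothetical helper (not part of metasnf): turn a `processes` argument
# into a worker count, capped at the number of detected cores (assumption).
resolve_processes <- function(processes = 1) {
  available <- detectCores()
  if (is.na(available)) available <- 1L  # detectCores() may return NA
  if (identical(processes, "max")) {
    return(as.integer(available))
  }
  if (!is.numeric(processes) || length(processes) != 1 || processes < 1) {
    stop("`processes` must be a positive number or \"max\"")
  }
  as.integer(min(processes, available))
}

# Hypothetical stand-in for running every row of a settings table.
run_rows <- function(n_rows, row_fn, processes = 1) {
  workers <- resolve_processes(processes)
  if (workers == 1L) {
    # Sequential path: progress can be reported as each row finishes.
    results <- vector("list", n_rows)
    for (i in seq_len(n_rows)) {
      results[[i]] <- row_fn(i)
      message(sprintf("%d%% complete", round(100 * i / n_rows)))
    }
    return(results)
  }
  # Parallel path: rows run in separate worker processes, so this sketch
  # prints no per-row progress.
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl))
  parLapply(cl, seq_len(n_rows), row_fn)
}

# Toy workload: square the row index, using up to two workers.
squares <- run_rows(4, function(i) i^2, processes = 2)
```

With the real package, the equivalent choice is made by passing processes = "max" (or a number) to execute_design_matrix, as described above; the function's other arguments are not shown here because the thread does not list them.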