Corentin-Gibert-Paleontology / DNCImper

R Package for PerSIMPER and DNCI analysis. Assembly process identification.

Multi-threads support #7

Open xphab opened 1 year ago

xphab commented 1 year ago

Dear Dr. Corentin Gibert,

Thank you very much for your great work. The DNCI method is a wonderful tool for evaluating community assembly processes between different groups, especially for taxa without phylogenetic information.

I wonder whether it would be possible to support multi-core operation. With a large dataset (e.g. ~300 sites in ~10 groups and 20k taxa/ASVs), it might take several months to get a result. Even on a 144-thread cluster, the code used only ~3% of the CPUs.

Thanks again for your time.

Best, Peng

Corentin-Gibert-Paleontology commented 1 year ago

Dear Peng,

Thank you very much for using PerSIMPER and DNCI. You are right that the DNCImper package uses only 1 CPU, which can be very problematic for (extra-)large datasets such as yours. I need to implement multi-core computation, but I have never found the time to do it. The main idea would be either to let each CPU work on one type of permutation (so 3 cores at a time), or to have each core handle part of the permutation loop (e.g. if you have 100 CPUs, each does 10 permutations to complete the default 1000 permutations), or finally a mix of both solutions. I can't promise that I will have time to implement this in the coming weeks: I moved to the US for a new postdoc position a few days ago.
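As a rough illustration of the second idea (splitting the permutation loop across cores), here is a minimal sketch using R's built-in `parallel` package. It is not DNCImper code: `one_permutation()` and `run_permutations()` are hypothetical names, and the shuffling inside `one_permutation()` is only a toy stand-in for a real null-model permutation.

```r
library(parallel)

# Hypothetical stand-in for ONE null-model permutation (not DNCImper code).
# It shuffles each column of a site x taxon presence/absence matrix and
# returns a single summary number for that permuted matrix.
one_permutation <- function(i, mat) {
  shuffled <- apply(mat, 2, sample)   # permute occurrences within each taxon
  sum(rowSums(shuffled) > 0)          # toy statistic: number of occupied sites
}

# Hypothetical wrapper: distribute n_perm permutations over n_cores workers.
run_permutations <- function(mat, n_perm = 1000, n_cores = 2) {
  cl <- makeCluster(n_cores)
  on.exit(stopCluster(cl), add = TRUE)   # always shut the workers down
  clusterSetRNGStream(cl, iseed = 42)    # independent, reproducible RNG streams
  # parLapply splits seq_len(n_perm) into chunks, one chunk per worker
  res <- parLapply(cl, seq_len(n_perm), one_permutation, mat = mat)
  unlist(res)
}

# Toy data: 20 sites x 50 taxa, presence/absence
set.seed(1)
toy <- matrix(rbinom(20 * 50, 1, 0.3), nrow = 20)
null_values <- run_permutations(toy, n_perm = 100, n_cores = 2)
```

With 100 workers and `n_perm = 1000`, each worker would handle about 10 permutations, as described above. On Unix-like systems, `mclapply()` from the same package is a fork-based alternative that avoids setting up a cluster. Whether either approach fits the actual permutation code in the package is an assumption that would need checking.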

If you know how to implement this kind of thing in R, please try and keep me posted. Otherwise, I will try to do it in January or February. Please ask again if you don't hear from me in the coming months. I am sorry that I can't give you a solution quickly.

Best regards, Corentin