azmfaridee / mothur

This is a GSoC 2012 fork of 'Mothur'. We are trying to implement a number of 'Feature Selection' algorithms for microbial ecology data and incorporate them into mothur's main codebase.
https://github.com/mothur/mothur
GNU General Public License v3.0

Investigate Mothur's Multithreading/Multiprocessing API #7

Open azmfaridee opened 12 years ago

azmfaridee commented 12 years ago

Mothur basically has four ways to run a single task where multithreading/multiprocessing is concerned: single process, fork(), CreateThread() and MPI.

We would like detailed information on how each of these works, and possibly a sketch of how we'd like to incorporate them into our new sub-system.

Reference from @mothur-westcott's mail post

We actually have four: single process, fork(), CreateThread() and MPI. fork() is only used on Linux/Unix/Mac machines. We also support parallelization on Windows platforms using CreateThread(). The Windows versions have shared memory, whereas fork() does not; this is something to consider in the design. A good, simple example of mothur's use of fork(), CreateThread() and MPI is the summary.seqs command in summarycommand.cpp. When we are processing fasta files containing sequences, we split the work by splitting the file, as you noticed. With the shared file we often split the work by dividing by groups. Deciding how to divide the work is an important first step when thinking about parallelizing the code.
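To make the fork() point concrete, here is a minimal sketch of the "divide the work, then fork" pattern on a POSIX system. It is not taken from summarycommand.cpp: the names `divideWork` and `totalLengthForked` are hypothetical, the in-memory vector of sequences stands in for a fasta file, and the pipe used to return results is an assumption made for this example. Because a forked child gets its own copy of memory, each partial result has to be sent back to the parent explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: split n items into at most numProcs contiguous
// [begin, end) chunks whose sizes differ by at most one.
std::vector<std::pair<std::size_t, std::size_t>> divideWork(std::size_t n, std::size_t numProcs) {
    assert(numProcs > 0);
    std::vector<std::pair<std::size_t, std::size_t>> chunks;
    std::size_t base = n / numProcs, extra = n % numProcs, begin = 0;
    for (std::size_t i = 0; i < numProcs && begin < n; ++i) {
        std::size_t len = base + (i < extra ? 1 : 0);
        chunks.emplace_back(begin, begin + len);
        begin += len;
    }
    return chunks;
}

// Work done on one chunk: total number of bases in seqs[begin, end).
static unsigned long chunkLength(const std::vector<std::string>& seqs,
                                 std::size_t begin, std::size_t end) {
    unsigned long total = 0;
    for (std::size_t i = begin; i < end; ++i) total += seqs[i].size();
    return total;
}

// Hypothetical example task: sum sequence lengths using one forked child per
// chunk. Children do not share memory with the parent, so each child writes
// its partial sum to a pipe. If pipe(), fork() or the read fails, the parent
// processes that chunk itself (the single-process path).
unsigned long totalLengthForked(const std::vector<std::string>& seqs, std::size_t numProcs) {
    struct Child { pid_t pid; int readFd; std::size_t begin, end; };
    std::vector<Child> children;
    unsigned long total = 0;

    for (const auto& chunk : divideWork(seqs.size(), numProcs)) {
        int fds[2];
        if (pipe(fds) != 0) {
            total += chunkLength(seqs, chunk.first, chunk.second);
            continue;
        }
        pid_t pid = fork();
        if (pid == 0) {
            // Child: compute the partial result, report it, exit immediately.
            close(fds[0]);
            unsigned long partial = chunkLength(seqs, chunk.first, chunk.second);
            ssize_t written = write(fds[1], &partial, sizeof partial);
            close(fds[1]);
            _exit(written == static_cast<ssize_t>(sizeof partial) ? 0 : 1);
        }
        close(fds[1]);
        if (pid < 0) {
            close(fds[0]);
            total += chunkLength(seqs, chunk.first, chunk.second);
            continue;
        }
        children.push_back({pid, fds[0], chunk.first, chunk.second});
    }

    // Parent: collect one partial sum per child, then reap the child.
    for (const auto& c : children) {
        unsigned long partial = 0;
        ssize_t got = read(c.readFd, &partial, sizeof partial);
        close(c.readFd);
        int status = 0;
        waitpid(c.pid, &status, 0);
        if (got != static_cast<ssize_t>(sizeof partial)) {
            partial = chunkLength(seqs, c.begin, c.end);
        }
        total += partial;
    }
    return total;
}
```

Calling `totalLengthForked(seqs, 3)` on a vector of sequences forks up to three children, each summing one contiguous slice. A CreateThread() version could write partial sums straight into a shared array instead of using pipes, which is the design difference the mail points out.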

Child Issues: #8