DerKevinRiehl / transposon_annotation_reasonaTE

Transposon annotation tool "reasonaTE" (part of TransposonUltimate)
GNU General Public License v3.0

Multiple Threads #7

Closed KKFiu closed 2 years ago

KKFiu commented 2 years ago

Hi, thanks a lot for this interesting software. I wonder if I can run reasonaTE in multiple threads?

Thank you in advance, Wendy

DerKevinRiehl commented 2 years ago

Dear Wendy, very happy to see your interest in our software. :-)

In general, step 2 of the pipeline takes most of the time, because it calls many different annotation tools. This step can be parallelized: in the reasonaTE documentation, under "How to use reasonaTE", Step 2, you will find the commands that run the individual annotation tools.

In Linux environments, you can run commands in parallel by appending & to the end of the line. The command is then started in the background, and the shell does not wait for it to finish. To wait until all commands started with & have finished, use wait. This way you can parallelize the calls to the different tools with the following code:

reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool helitronScanner &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool ltrHarvest &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool mitefind &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool mitetracker &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool must &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool repeatmodel &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool repMasker &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool sinefind &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool sinescan &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool tirvish &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool transposonPSI &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool NCBICDD1000 &
wait

Of course, depending on your computer or cluster, you may not be able to run all of them at the same time, because your RAM is limited.
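If RAM does not allow all twelve tools to run at once, one possible workaround is to cap how many run simultaneously. The following is only a sketch under assumptions: `run_annotations`, its third argument (the job limit), and the `REASONATE_CMD` variable are hypothetical helpers introduced here, not reasonaTE options. It relies on the `-P` option of `xargs` (available in GNU and BSD xargs) and reuses the tool names and flags from the commands above.

```shell
# Sketch: run the step-2 annotation tools with a cap on concurrent jobs.
# run_annotations and REASONATE_CMD are hypothetical helpers for this example;
# they are not part of reasonaTE itself.

# Tool names taken from the commands listed above, one per line.
TOOLS="helitronScanner
ltrHarvest
mitefind
mitetracker
must
repeatmodel
repMasker
sinefind
sinescan
tirvish
transposonPSI
NCBICDD1000"

# Usage: run_annotations <projectFolder> <projectName> [maxJobs]
run_annotations() {
    # xargs starts one reasonaTE call per tool name and keeps at most
    # maxJobs (default 4) of them running at the same time.
    printf '%s\n' "$TOOLS" |
        xargs -P "${3:-4}" -I{} "${REASONATE_CMD:-reasonaTE}" -mode annotate \
            -projectFolder "$1" -projectName "$2" -tool {}
}
```

For example, `run_annotations workspace testProject 3` would start the tools three at a time and return once all of them have finished. This only limits how many tools run together; it does not change the number of threads each tool uses internally.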

The other parts of the pipeline already run in parallel.

I hope this answer helps. Please let me know whether you managed to parallelize it. Best regards, Kevin Riehl

KKFiu commented 2 years ago

Dear @DerKevinRiehl, thanks for your reply, that's a good way to speed up the process. However, maybe I did not express myself clearly: I was actually wondering whether there is a way to specify the number of parallel search jobs to run, such as -pa 3 or -pa 8 in RepeatModeler and other annotation tools. Sincerely, Wendy

DerKevinRiehl commented 2 years ago

Dear Wendy, sorry for being imprecise here. Currently there is no such option, but it will be considered for the next big update.

Once again, thanks for your interest. Best regards, Kevin