Open natgiot opened 2 years ago
This issue is to replace the merging function with a new one that invokes the BBMap tool.
As a first step, a bbmap function needs to be added to the preprocess.bds script.
Then an if statement should be added in pema_latest.bds, along with a parameter in the parameter files asking the user which merging approach to use.
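To illustrate the branching described above, here is a rough sketch in Python rather than BDS. The parameter name `merging_tool`, the function names, and the output file names are hypothetical and not taken from PEMA. The `bbmerge.sh` arguments (`in1`, `in2`, `out`, `outu1`, `outu2`) follow the JGI bbmerge guide linked below, and the pandaseq branch assumes its `-f`, `-r` and `-w` options. PEMA's actual pandaseq call may differ.

```python
import shlex
import subprocess
from pathlib import Path


def build_merge_command(merging_tool, r1, r2, out_dir):
    """Return the command line (as a list) for the chosen merging tool.

    merging_tool is a hypothetical parameter that would be read from the
    parameter file; r1 and r2 are the forward and reverse FASTQ files.
    """
    out = Path(out_dir)
    if merging_tool == "bbmerge":
        # Argument names as documented in the JGI bbmerge guide.
        return [
            "bbmerge.sh",
            f"in1={r1}",
            f"in2={r2}",
            f"out={out / 'merged.fastq'}",
            f"outu1={out / 'unmerged_1.fastq'}",
            f"outu2={out / 'unmerged_2.fastq'}",
        ]
    if merging_tool == "pandaseq":
        # Assumed minimal pandaseq call; the existing PEMA call may use more options.
        return [
            "pandaseq",
            "-f", str(r1),
            "-r", str(r2),
            "-w", str(out / "merged.fasta"),
        ]
    raise ValueError(f"Unknown merging tool: {merging_tool!r}")


def run_merge(merging_tool, r1, r2, out_dir):
    """Create the output directory and run the selected tool (must be on PATH)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_merge_command(merging_tool, r1, r2, out_dir), check=True)


if __name__ == "__main__":
    # Print the command that would be executed, without running it.
    cmd = build_merge_command("bbmerge", "sample_1.fastq", "sample_2.fastq", "merged")
    print(shlex.join(cmd))
```

Keeping the command construction separate from execution means the if statement in the main script only has to pass along the user's choice, and an unrecognised value fails early with a clear error.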
BBMap is available here: https://sourceforge.net/projects/bbmap/
It has been published in PLoS One (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5657622/) and adopted by a wide community, including the JGI, whose website has a guide for bbmerge: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbmerge-guide/
Depending on how the suite of tools is integrated into PEMA, merging can be done with bbmerge, and additional steps (trimming, adapter removal) could also be handled quickly and efficiently by the same package.
Thank you for considering adding the tool. It appears to handle the merging of fully overlapping reads (cases where the insert size equals the read length) better than pandaseq!