Ecogenomics / BamM

Metagenomics-focused BAM file manipulation
http://ecogenomics.github.io/BamM/
GNU Lesser General Public License v3.0

bamm extract: bamm-specific header prefixes are not compatible with downstream analysis tools #49

Open jvollme opened 7 years ago

jvollme commented 7 years ago

When extracting mapped reads from a BAM file, bamm preserves information on read partners between groups by adding a prefix with that information to the extracted read ID. While this feature can often be very helpful, is there an option to turn it off? Some tools (such as the assembler SPAdes) have problems because the forward and reverse reads of an extracted pair now appear to have different names.

Alternatively, could this information be added as a suffix, separated from the rest of the read ID by a space? Since most tools simply use the first string in the ID line, up to the first whitespace, as the read ID, this would fix the problem for many downstream tools.
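
Until something like this exists in bamm itself, the suffix idea could be applied as a post-processing step on the extracted FASTQ files. The sketch below is only an illustration: the `;` separator, the example headers, and the function names are assumptions made for this example and do not reflect bamm's actual prefix format, so the separator would need adjusting to whatever the real headers look like. It also assumes plain four-line FASTQ records.

```python
from typing import Iterable, Iterator


def move_prefix_to_comment(header: str, sep: str = ";") -> str:
    """Turn '@<prefix><sep><read_id> [comment]' into '@<read_id> <prefix> [comment]'.

    `sep` is a hypothetical separator between the added prefix and the
    original read ID; everything before its last occurrence is the prefix.
    """
    marker, body = header[0], header[1:].rstrip("\n")
    name, _, comment = body.partition(" ")
    prefix, found, read_id = name.rpartition(sep)
    if not found:
        # No separator in the name: leave the header unchanged.
        return marker + body
    parts = [read_id, prefix]
    if comment:
        parts.append(comment)
    return marker + " ".join(parts)


def first_token(header: str) -> str:
    """Read ID as most tools see it: text after '@' up to the first whitespace."""
    return header[1:].split()[0]


def rewrite_fastq(lines: Iterable[str], sep: str = ";") -> Iterator[str]:
    """Rewrite only the header line of each four-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield move_prefix_to_comment(line, sep)
        else:
            yield line.rstrip("\n")


# Made-up headers for the two mates of one pair, differing only in the prefix.
fwd = move_prefix_to_comment("@b_0;g_0;read42")
rev = move_prefix_to_comment("@b_0;g_1;read42")
print(fwd)
print(rev)
print(first_token(fwd) == first_token(rev))
```

After the rewrite both mates share the same first whitespace-delimited token, while the group information is still available after the space.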