Ecogenomics / BamM

Metagenomics-focused BAM file manipulation
http://ecogenomics.github.io/BamM/
GNU Lesser General Public License v3.0

No file written when extracting reads from bin #23

Closed mdehollander closed 9 years ago

mdehollander commented 9 years ago

I am using GroopM to extract reads from bins, and I have realized that the problem I am seeing there is caused by BamM (https://github.com/minillinim/GroopM/issues/11), so I think it is better to ask a more specific question here.

When I run BamM directly no output is generated:

bamm extract -g binning/groopm/database_bin_5.fna -b spades/bwa/C1-s.bam

But the test data works:

bamm extract -g ~/install/BamM/bamm/tests/modelling/data/bin1 -b ~/install/BamM/bamm/tests/modelling/contigs.pe.1.bam

So it is not an installation issue. I guess it is caused by how the BAM file was created; I made the BAM files with samtools.

Is there a way to check if the bam file is ok?

wwood commented 9 years ago

Hi,

When you say no output is generated, what do you mean exactly? Are there any files created in $PWD?

I wonder if this might be due to differences in contig names between the fna and bam files. As a check, can you please run something like this?

$ head -n1 binning/groopm/database_bin_5.fna
>contig256

(contig256 is made up for the sake of example.) Then what is the output of the command below, replacing contig256 with the real contig name?

samtools view spades/bwa/C1-s.bam contig256: |head -n20

Thanks for narrowing the problem down from groopm to bamm and reporting as such, btw. ben
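The contig-name check suggested above can also be done for every contig in the bin at once. The following is only a minimal sketch, not part of BamM: it assumes the bin is a FASTA file and that the BAM header text has been obtained separately (for example with `samtools view -H`). The function names and the example data are hypothetical.

```python
def fasta_contig_names(fasta_text):
    """Return the set of sequence IDs (first word after '>') in FASTA text."""
    names = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def sam_header_contig_names(header_text):
    """Return the set of SN: values found on @SQ lines of a SAM/BAM header."""
    names = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def missing_from_bam(fasta_text, header_text):
    """List contigs named in the FASTA that the BAM header does not know about."""
    return sorted(fasta_contig_names(fasta_text) - sam_header_contig_names(header_text))


# Made-up example data (contig256 and contig999 are not real contig names).
example_fasta = ">contig256 some description\nACGT\n>contig999\nGGCC\n"
example_header = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:contig256\tLN:4\n"
print(missing_from_bam(example_fasta, example_header))  # ['contig999']
```

An empty list means every contig in the bin is a reference sequence in the BAM file, so a name mismatch can be ruled out.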

mdehollander commented 9 years ago

Hi,

What I mean is that there are no fq.gz files written to disk. There is output on the screen.

The sequences are found with samtools:

samtools view spades/bwa/C1-s.bam NODE_100233_length_617_cov_1.47782_ID_200467: | head -n 20
M01910:32:000000000-ADVL1:1:2113:8121:24184     2131    NODE_100233_length_617_cov_1.47782_ID_200467    2       0       33M268H NODE_152861_length_494_cov_1.80161_ID_305721    42      0       TCGGAGATGTGTATAAGAGACAGGTAATAATAT    .B<EF@@;3ECFFC<C3;8C8F9FFFFC=FFFB       NM:i:0  MD:Z:33 AS:i:33 XS:i:33 SA:Z:NODE_152861_length_494_cov_1.80161_ID_305721,42,-,23S66M3I16M10I25M2D51M107S,1,24; XA:Z:NODE_26574_length_1343_cov_1.03928_ID_53155,+1258,268S33M,0;NODE_152359_length_495_cov_2.08556_ID_304717,-44,32M269S,0;NODE_85792_length_672_cov_1.83485_ID_171585,+590,269S32M,0;

When I use the bam files generated by bamm, fq.gz files are written to disk. I guess I used samtools with incorrect parameters. I will look into that, but for now bamm works :)

wwood commented 9 years ago

Actually, I think in this case bamm is working as expected. From your output, there is only 1 read that maps to that contig, and its flag (2131) has the supplementary-alignment bit (0x800) set, so it is not meant to be output.
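To see why that record counts as a supplementary alignment, the flag in the second column of the samtools output (2131) can be decoded bit by bit. This is a small illustrative sketch using the flag bits defined in the SAM specification; the short names in the table and the `decode_flag` helper are chosen here for readability and are not BamM or samtools identifiers.

```python
# SAM flag bits as defined in the SAM specification; the labels are informal.
SAM_FLAGS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "read1"),
    (0x80, "read2"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM flag value."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


print(decode_flag(2131))
# ['paired', 'proper_pair', 'reverse', 'read1', 'supplementary']
```

Since 2131 = 2048 + 64 + 16 + 2 + 1, the 0x800 (2048) bit is set, which marks the record as supplementary.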

As it currently stands, no output is written to stdout/stderr even when running successfully.