centre-for-microbiome-research / GroopM

Metagenomic binning suite
GNU General Public License v3.0

No file written when extracting reads from bin #11

Closed mdehollander closed 9 years ago

mdehollander commented 9 years ago

I am running this command:

 groopm extract binning/groopm/database.gm -b 6 -m reads --threads 16 spades/bwa/*-s.bam -v
*******************************************************************************
 [[GroopM 0.3.4]] Running in 'reads' extraction mode...
*******************************************************************************
Loading data from: binning/groopm/database.gm
    GroopM DB version (5) up to date
    Loaded indices with condition: ((length >= 0) & (bid == 6))
    Working with: 619 contigs
    Loading coverage profiles
    Loading PCA kmer sigs (42 dimensional space)
    Loading PCA kmer variance (total variance: 0.80)
    Loading contig names
    Loading contig lengths (Total: 5893597 BP)
    Loading contig GC ratios (Average GC: 0.335)
    Creating color map
    Loading bin assignments
    Reticulating splines
    Making bin objects
    Loaded 1 bins from database
    { THIS: 0:00:01.000 || TOTAL: 0:00:01.001 }
Extracting reads
Thread_0 Preparing to extract reads from file: C1-s
Thread_1 Preparing to extract reads from file: C2-s
Thread_2 Preparing to extract reads from file: C3-s
Thread_3 Preparing to extract reads from file: C4-s
Thread_4 Preparing to extract reads from file: S1-s
Thread_6 Preparing to extract reads from file: S2-s
Thread_5 Preparing to extract reads from file: S3-s
Thread_7 Preparing to extract reads from file: SR1-s
Thread_8 Preparing to extract reads from file: SR2-s
Thread_10 Preparing to extract reads from file: SR3-s

but no files are written to disk.

The target bin is not empty:

groopm print binning/groopm/database.gm -b 6
"bin id"        "Likely chimeric"       "length (bp)"   "# seqs"        "GC mean"       "GC std"        "Coverage 1 mean"       "Coverage 1 std"        "Coverage 2 mean"       "Coverage 2 std"        "Coverage 3 mean"    "Coverage 3 std"        "Coverage 4 mean"       "Coverage 4 std"        "Coverage 5 mean"       "Coverage 5 std"        "Coverage 6 mean"       "Coverage 6 std"        "Coverage 7 mean"       "Coverage 7 std"     "Coverage 8 mean"       "Coverage 8 std"        "Coverage 9 mean"       "Coverage 9 std"        "Coverage 10 mean"      "Coverage 10 std"
6       False   5893597 619     0.3332  0.0222  0.0107  0.0790  0.0282  0.1230  0.4013  0.2556  1.0898  0.5213  0.0692  0.3295  1.6939  0.6566  1.6240  0.6854  3.5212  1.2434  0.0977  0.3936  0.0315  0.1601

When I look at the dump output for the contigs in this bin and check whether any reads map to one of them in the SAM files, there is a match:

groopm print binning/groopm/database.gm -b 6 -f contigs
#"bid"  "cid"   "length"        "GC"
6       NODE_10007_length_2698_cov_3.04851_ID_20013     2698    0.3047

grep 'NODE_10007_length_2698_cov_3.04851_ID_20013' spades/bwa/*.sam
spades/bwa/SR2.sam:M01910:32:000000000-ADVL1:1:2116:6494:11842  163     NODE_10007_length_2698_cov_3.04851_ID_20013     584     60      214M    =       1045    713     GTAAAGTCGGCAATTGTAACCGATTTTATACTTTCTGTAGAAATTGTAATTATCGCTTTAGGAACCGTAATCGGAAAACCATTGGTTTCTCAAATTATCACAGTTTCAATCATCGCATTAATTGCAACAATTGGCGTTTACGGAATCGTAGCGCTTATTGTTCGTATGGATGAAGTAGGTTTCAAAATGATTAAAAGCAGTAAAAAAGAAAACA       CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGF@FGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGGGGGGGGGGGFFFFCFGGGGGGGGGGDDGGGGFCGGEEGCGGGGGGGGGGFGD8=FGGGGGGGGGGGCFGGGGGGFDGGGFGGCFFGFFF5FFFA       NM:i:0  MD:Z:214        AS:i:214     XS:i:0

Is there a way to get the individual reads written to disk?
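As a stopgap while the extraction problem is open, the manual check above (list the contigs of a bin, then look for them in the SAM files) can be extended to every contig in the bin and written out as FASTQ. The following is only a minimal sketch in plain Python, not what `groopm extract` or BamM does internally. The function names `load_contig_ids`, `extract_reads` and `write_fastq` are made up for illustration. It assumes the contig list has the layout printed by `groopm print ... -f contigs` shown above (contig ID in the second column) and that the alignments are available as uncompressed SAM text.

```python
import io

# Translation table used to reverse-complement reads stored on the reverse strand.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def load_contig_ids(lines):
    """Collect contig IDs (second column) from a contig listing.

    Assumes the whitespace-separated layout shown above:
    bid, cid, length, GC, with a header line starting with '#'.
    """
    ids = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        ids.add(line.split()[1])
    return ids


def extract_reads(sam_lines, contig_ids):
    """Yield (name, sequence, quality) for reads aligned to the given contigs."""
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11 or fields[2] not in contig_ids:
            continue
        qname, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        if flag & 0x10:
            # SAM stores reverse-strand reads reverse-complemented;
            # undo that to recover the read as it was sequenced.
            seq = seq.translate(COMPLEMENT)[::-1]
            qual = qual[::-1]
        yield qname, seq, qual


def write_fastq(records, handle):
    """Write records as FASTQ to an open text handle; return the record count."""
    count = 0
    for qname, seq, qual in records:
        handle.write(f"@{qname}\n{seq}\n+\n{qual}\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory demo with made-up records (not real data from this bin).
    contigs = load_contig_ids(['#"bid"\t"cid"\n', "6\tNODE_A\t2698\t0.3047\n"])
    sam = ["r1\t0\tNODE_A\t1\t60\t4M\t*\t0\t0\tACGT\tFFFF\n"]
    out = io.StringIO()
    write_fastq(extract_reads(sam, contigs), out)
    print(out.getvalue(), end="")
```

For real files, the contig listing and each SAM file would be opened with `open(...)` and passed in place of the in-memory lists. The sketch does not re-pair mates and does not skip secondary or supplementary alignments, so a read can appear more than once.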

mdehollander commented 9 years ago

I realised that this is a BamM issue, so I will continue there.