VDBWRAIR / ngs_mapper

Genome Mapping Pipeline
GNU General Public License v2.0
8 stars 10 forks source link

graphsample.py did not exit successfully #96

Closed mmelendrez closed 9 years ago

mmelendrez commented 9 years ago

So I'm testing your pipeline on some new stuff. I got it to run successfully with Chik a full genome virus like Dengue. I am trying to run it against another segmented virus and it failed:

my command and screen output:

(ngs_mapper) [melanie.melendrez@amedwrair15380 ngs_mapper]$ runsample.py ../FVM00003/ ../References/Andes__Chile__2002.fasta -od FVM00003_Andes FVM00003_Andes
2015-02-05 10:00:53,853 -- INFO -- runsample       --- Starting FVM00003_Andes --- 
2015-02-05 10:01:09,775 -- CRITICAL -- runsample       graphsample.py /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam -od /tmp/FVM00003_AndesZYDTDDrunsample did not exit sucessfully
2015-02-05 10:01:13,288 -- CRITICAL -- runsample       !!! There was an error running part of the pipeline !!!
2015-02-05 10:01:13,288 -- CRITICAL -- runsample       Please check the logfile /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.log

FVM00003_Andes.log

2015-02-05 10:00:53,853 -- DEBUG -- runsample       Initialized empty Git repository in /tmp/FVM00003_AndesZYDTDDrunsample/.git/

2015-02-05 10:00:53,853 -- INFO -- runsample       --- Starting FVM00003_Andes --- 
2015-02-05 10:00:53,853 -- DEBUG -- runsample       Copying reference file ../References/Andes__Chile__2002.fasta to /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta
2015-02-05 10:00:53,875 -- DEBUG -- runsample       Running trim_reads.py ../FVM00003/ -q 20 -o /tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads --head-crop 0
2015-02-05 10:00:57,834 -- DEBUG -- runsample       Running run_bwa_on_samplename.py /tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta -o /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam
2015-02-05 10:01:04,132 -- DEBUG -- runsample       Running tagreads.py /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam -CN WRAIR
2015-02-05 10:01:08,163 -- DEBUG -- runsample       Running base_caller.py /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam.vcf -minth 0.8
2015-02-05 10:01:09,210 -- DEBUG -- runsample       Running samtools flagstat /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam
2015-02-05 10:01:09,293 -- DEBUG -- runsample       Running graphsample.py /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam -od /tmp/FVM00003_AndesZYDTDDrunsample
2015-02-05 10:01:09,775 -- CRITICAL -- runsample       graphsample.py /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam -od /tmp/FVM00003_AndesZYDTDDrunsample did not exit sucessfully
2015-02-05 10:01:09,776 -- DEBUG -- runsample       Running fqstats.py -o /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.reads.png /tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq
2015-02-05 10:01:12,645 -- DEBUG -- runsample       Running vcf_consensus.py /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam.vcf -i FVM00003_Andes -o /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam.consensus.fasta
2015-02-05 10:01:13,288 -- CRITICAL -- runsample       !!! There was an error running part of the pipeline !!!
2015-02-05 10:01:13,288 -- CRITICAL -- runsample       Please check the logfile /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.log

pipeline.log

2015-02-05 09:52:36,905 -- DEBUG -- trim_reads      Using /tmp/FVM00003_ChikhMiksCrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq as the output path
2015-02-05 09:52:36,908 -- DEBUG -- trim_reads      Running java -jar /media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/lib/Trimmomatic-0.32/trimmomatic-0.32.jar SE -threads 1 -trimlog /tmp/FVM00003_ChikhMiksCrunsample/trim_stats/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq.trim_stats ../FVM00003/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq /tmp/FVM00003_ChikhMiksCrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq LEADING:20 TRAILING:20 HEADCROP:0
2015-02-05 09:52:54,128 -- INFO -- tagreads        Gathering existing header for /tmp/FVM00003_ChikhMiksCrunsample/FVM00003_Chik.bam
2015-02-05 09:52:54,133 -- INFO -- tagreads        Tagging reads for /tmp/FVM00003_ChikhMiksCrunsample/FVM00003_Chik.bam
2015-02-05 09:52:54,155 -- DEBUG -- tagreads        Skipping read BNB87:00825:01014 because it is supplementary
2015-02-05 09:52:56,868 -- INFO -- tagreads        Finished tagging reads for /tmp/FVM00003_ChikhMiksCrunsample/FVM00003_Chik.bam
2015-02-05 09:52:56,868 -- INFO -- tagreads        Sorting /tmp/FVM00003_ChikhMiksCrunsample/FVM00003_Chik.bam
2015-02-05 09:52:59,342 -- INFO -- tagreads        Indexing /tmp/FVM00003_ChikhMiksCrunsample/FVM00003_Chik.bam

bwa.log

DEBUG:ngs_mapper.data:Reading first read from /tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq
DEBUG:ngs_mapper.data:Reading first read from /tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq
DEBUG:ngs_mapper.data:Reading first read from /tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq
DEBUG:ngs_mapper.data:Reading first read from /tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq
DEBUG:ngs_mapper.data:Attempting to pair readlist ['/tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq']
INFO:ngs_mapper.data:paired_reads: []
DEBUG:ngs_mapper.data:i: 0
DEBUG:ngs_mapper.data:Cannot find _R1_ or _R2_ in /tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq
DEBUG:ngs_mapper.data:Index in readlist for /tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq is -1
DEBUG:ngs_mapper.data:Index was -1 so appending readfile
DEBUG:ngs_mapper.run_bwa:Reads parsed by platform: {'Sanger': ['/tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq']}
INFO:ngs_mapper.reads:Compiling reads from ['/tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq'] into /tmp/FVM00003_AndesZYDTDDrunsample/bwa/reads/NP.fq
DEBUG:ngs_mapper.reads:Concating all files ['/tmp/FVM00003_AndesZYDTDDrunsample/trimmed_reads/FVM00003.IonXpress_011.Auto_user_SN2-13-02022015_47_019.fastq'] to /tmp/FVM00003_AndesZYDTDDrunsample/bwa/reads/NP.fq
DEBUG:ngs_mapper.run_bwa:Ensuring /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta is indexed
DEBUG:bwa.bwa:Indexes found for /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta: []
DEBUG:bwa.bwa:BWA path not specified so using default  path /media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/bin/bwa
INFO:bwa.bwa:Indexing /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta
INFO:bwa.bwa:Running /media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/bin/bwa index /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta
DEBUG:bwa.bwa:STDERR: [bwa_index] Pack FASTA... 0.00 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.00 seconds elapse.
[bwa_index] Update BWT... 0.00 sec
[bwa_index] Pack forward-only FASTA... 0.00 sec
[bwa_index] Construct SA from BWT and Occ... 0.00 sec
[main] Version: 0.7.5-r404
[main] CMD: /media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/bin/bwa index /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta
[main] Real time: 0.021 sec; CPU: 0.004 sec

INFO:bwa.bwa:bwa index ran on /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta
DEBUG:bwa.bwa:Indexes found for /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta: ['/tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta.ann', '/tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta.bwt', '/tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta.sa', '/tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta.pac', '/tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta.amb']
INFO:bwa.bwa:Running /media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/bin/bwa mem -t 1 /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta /tmp/FVM00003_AndesZYDTDDrunsample/bwa/reads/NP.fq
DEBUG:bwa.bwa:STDERR: [M::main_mem] read 81329 sequences (7888392 bp)...
[main] Version: 0.7.5-r404
[main] CMD: /media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/bin/bwa mem -t 1 /tmp/FVM00003_AndesZYDTDDrunsample/Andes__Chile__2002.fasta /tmp/FVM00003_AndesZYDTDDrunsample/bwa/reads/NP.fq
[main] Real time: 2.257 sec; CPU: 2.180 sec

INFO:ngs_mapper.bam:Running samtools view -Sb -
DEBUG:ngs_mapper.bam:CMD: ['samtools', 'view', '-Sb', '-'] STDIN: <open file '/tmp/FVM00003_AndesZYDTDDrunsample/bwa/nonpaired.sai', mode 'r' at 0x7ffb0ddbf9c0> STDOUT: -1
DEBUG:ngs_mapper.bam:Returning processes stdout value
INFO:ngs_mapper.bam:Running samtools sort -f -
DEBUG:ngs_mapper.bam:CMD: ['samtools', 'sort', '-f', '-', '/tmp/FVM00003_AndesZYDTDDrunsample/bwa/nonpaired.bam'] STDIN: <open file '<fdopen>', mode 'rb' at 0x7ffb0dd129c0>
[samopen] SAM header is present: 3 sequences.
[bam_header_read] EOF marker is absent. The input is probably truncated.
DEBUG:ngs_mapper.run_bwa:Paired only. Moving result file /tmp/FVM00003_AndesZYDTDDrunsample/bwa/nonpaired.bam to /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam
INFO:ngs_mapper.bam:Running samtools index /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam
Ret: 0

std.log

2015-02-05 10:01:04,412 -- INFO -- tagreads        Gathering existing header for /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam
2015-02-05 10:01:04,416 -- INFO -- tagreads        Tagging reads for /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam
2015-02-05 10:01:06,160 -- INFO -- tagreads        Finished tagging reads for /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam
2015-02-05 10:01:06,160 -- INFO -- tagreads        Sorting /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam
[samopen] SAM header is present: 3 sequences.
[bam_header_read] EOF marker is absent. The input is probably truncated.
2015-02-05 10:01:08,069 -- INFO -- tagreads        Indexing /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam
[mpileup] 1 samples in 1 input files
montage: missing an image filename `/tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam.qualdepth.png' @ montage.c/MontageImageCommand/1649.
Traceback (most recent call last):
  File "/media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/bin/graphsample.py", line 4, in <module>
    __import__('pkg_resources').run_script('ngs-mapper==1.1.0', 'graphsample.py')
  File "build/bdist.linux-x86_64/egg/pkg_resources.py", line 517, in run_script
    distributions in the working set, otherwise only ones matching
  File "build/bdist.linux-x86_64/egg/pkg_resources.py", line 1423, in run_script
    return []
  File "/media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/lib/python2.7/site-packages/ngs_mapper-1.1.0-py2.7.egg/EGG-INFO/scripts/graphsample.py", line 6, in <module>
    main(parse_args())
  File "/media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/lib/python2.7/site-packages/ngs_mapper-1.1.0-py2.7.egg/ngs_mapper/graphsample.py", line 21, in main
    pngfile = make_image( jfile, args.outpath )
  File "/media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/lib/python2.7/site-packages/ngs_mapper-1.1.0-py2.7.egg/ngs_mapper/graphsample.py", line 48, in make_image
    run_montage( *imagelist, compress='JPEG', quality=25, geometry='+1+1' )
  File "/media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/lib/python2.7/site-packages/ngs_mapper-1.1.0-py2.7.egg/ngs_mapper/graphsample.py", line 81, in run_montage
    subprocess.check_call( cmd )
  File "/home/AMED/melanie.melendrez/lib/python2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['montage', '-geometry', '+1+1', '-quality', '25', '-compress', 'JPEG', '/tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam.qualdepth.png']' returned non-zero exit status 1

graphsample.log

2015-02-05 10:01:09,749 -- DEBUG -- graphsample.py  Running montage -geometry +1+1 -quality 25 -compress JPEG /tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam.qualdepth.png

All output is currently located on my computer under /tmp/FVM00003_AndesZYDTDDrunsample/

I wouldn't be too surprised if it failed because nothing matched to this virus I just wanted to make sure it wasn't a pipeline issue. They told me this virus serology matched Andes virus, so I pulled the segmented genome - formatted the reference to look like our flu genomes and ran the pipeline on it. Pathogen discovery said this was Chikungunya - so I also ran Chik and that seemed to finish succesffully. Just didn't know what this error meant.

necrolyte2 commented 9 years ago

I'm not really 100% sure what this error means, but graphsample.py is failing because when it tries to run the montage command in the shell it is getting an error

Can you look to see if any .png files are anywhere in your project in /tmp/FVM00003_*

montage: missing an image filename `/tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam.qualdepth.png' @ montage.c/MontageImageCommand/1649.
Traceback (most recent call last):
  File "/media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/bin/graphsample.py", line 4, in <module>
    __import__('pkg_resources').run_script('ngs-mapper==1.1.0', 'graphsample.py')
  File "build/bdist.linux-x86_64/egg/pkg_resources.py", line 517, in run_script
    distributions in the working set, otherwise only ones matching
  File "build/bdist.linux-x86_64/egg/pkg_resources.py", line 1423, in run_script
    return []
  File "/media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/lib/python2.7/site-packages/ngs_mapper-1.1.0-py2.7.egg/EGG-INFO/scripts/graphsample.py", line 6, in <module>
    main(parse_args())
  File "/media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/lib/python2.7/site-packages/ngs_mapper-1.1.0-py2.7.egg/ngs_mapper/graphsample.py", line 21, in main
    pngfile = make_image( jfile, args.outpath )
  File "/media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/lib/python2.7/site-packages/ngs_mapper-1.1.0-py2.7.egg/ngs_mapper/graphsample.py", line 48, in make_image
    run_montage( *imagelist, compress='JPEG', quality=25, geometry='+1+1' )
  File "/media/VD_Research/People/Melanie.Melendrez/.ngs_mapper/lib/python2.7/site-packages/ngs_mapper-1.1.0-py2.7.egg/ngs_mapper/graphsample.py", line 81, in run_montage
    subprocess.check_call( cmd )
  File "/home/AMED/melanie.melendrez/lib/python2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['montage', '-geometry', '+1+1', '-quality', '25', '-compress', 'JPEG', '/tmp/FVM00003_AndesZYDTDDrunsample/FVM00003_Andes.bam.qualdepth.png']' returned non-zero exit status 1
mmelendrez commented 9 years ago

no pngs anywhere all that outputted was graphsample.log and pipeline.log.

There is one functioning png in the /tmp directory - the *.reads.png. I couldn't find any other pngs

(ngs_mapper) [melanie.melendrez@amedwrair15380 FVM00003]$ ls /tmp/FVM00003_AndesZYDTDDrunsample/
Andes__Chile__2002.fasta      Andes__Chile__2002.fasta.pac  FVM00003_Andes.bam                  FVM00003_Andes.bam.vcf    qualdepth
Andes__Chile__2002.fasta.amb  Andes__Chile__2002.fasta.sa   FVM00003_Andes.bam.bai              FVM00003_Andes.log        trimmed_reads
Andes__Chile__2002.fasta.ann  bwa.log                       FVM00003_Andes.bam.consensus.fasta  FVM00003_Andes.reads.png  trim_stats
Andes__Chile__2002.fasta.bwt  flagstats.txt                 FVM00003_Andes.bam.qualdepth.json   FVM00003_Andes.std.log
necrolyte2 commented 9 years ago

Can you check in that qualdepth directory as well

mmelendrez commented 9 years ago

I did, no pngs - the directory is empty

(ngs_mapper) [melanie.melendrez@amedwrair15380 FVM00003]$ ls /tmp/FVM00003_AndesZYDTDDrunsample/qualdepth/
(ngs_mapper) [melanie.melendrez@amedwrair15380 FVM00003]$ ls /tmp/FVM00003_AndesZYDTDDrunsample/qualdepth/
necrolyte2 commented 9 years ago

Can you post the contents of flagstats.txt, I highly suspect that you are right that maybe no reads actually mapped or something. Once we figure out the issue, i'll make a bug report because the pipeline needs to check for these things

mmelendrez commented 9 years ago

ok - ya I looked in flagstats but couldn't figure what was going on with it: flagstats

81329 + 0 in total (QC-passed reads + QC-failed reads)                                                                                     
0 + 0 duplicates
0 + 0 mapped (0.00%:-nan%)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (-nan%:-nan%)
0 + 0 with itself and mate mapped
0 + 0 singletons (-nan%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
necrolyte2 commented 9 years ago

Haha, well basically there were 81,329 reads and 0 of them mapped

81329 + 0 in total (QC-passed reads + QC-failed reads)                                                                                     
0 + 0 duplicates
0 + 0 mapped (0.00%:-nan%)
mmelendrez commented 9 years ago

ahhhh

ok - ya i figured. but at least we know an area of the pipeline we can improve for clarity.