hall-lab / speedseq

A flexible framework for rapid genome analysis and interpretation
MIT License

structural variant calling (lumpy) error TypeError: %d format: a number is required, not numpy.float64 #91

Closed: crazyhottommy closed this issue 7 years ago

crazyhottommy commented 7 years ago

I am using speedseq v0.1.0 and got an error in the structural variant calling step:

Calculating insert distributions... 
sambamba-view:  (Broken pipe)
Library read groups: 140517_SN1440_0189_BC41CUACXX_4_CAGATCTG
Library read length: 51
sambamba-view: unable to write to stream
/scratch/genomic_med/apps/python/anaconda/default/lib/python2.7/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of
  warnings.warn("Mean of empty slice.", RuntimeWarning)
/scratch/genomic_med/apps/python/anaconda/default/lib/python2.7/site-packages/numpy/core/_methods.py:70: RuntimeWarning: invalid
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "/scratch/genomic_med/apps/lumpy/default//scripts/pairend_distro.py", line 106, in <module>
    (removed, upper_cutoff))
TypeError: %d format: a number is required, not numpy.float64
END at Thu Sep 22 16:45:11 CDT 2016

Thanks, Ming

cc2qe commented 7 years ago

Hi Ming,

Can you download the latest dev svtyper and run the following?

svtyper -B my.bam -wl my.diagnostic.json

Then please either post the resulting JSON file here or email it to me.

Thanks!

cc2qe commented 7 years ago

It would also be helpful if you could post the BAM header.

Usually in these scenarios there is a problem with the BAM file (missing libraries, single-end reads, no mapped reads, etc.).
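
One way to check the "missing libraries" case is to look at the @RG lines of the header and see whether each read group carries an LB tag. The following is a minimal Python sketch, not part of speedseq or svtyper; the function name `read_group_libraries` and the sample header lines are hypothetical and only there for illustration.

```python
def read_group_libraries(header_lines):
    """Map each @RG ID to its LB (library) tag, or None if LB is absent.

    Fields are split on any whitespace so that headers pasted into a forum
    (where tabs often become spaces) still parse. Tag values that contain
    spaces are therefore not handled by this sketch.
    """
    libraries = {}
    for line in header_lines:
        if not line.startswith("@RG"):
            continue
        # Each field after "@RG" looks like TAG:value
        tags = dict(
            field.split(":", 1) for field in line.split()[1:] if ":" in field
        )
        if "ID" in tags:
            libraries[tags["ID"]] = tags.get("LB")
    return libraries


# Hypothetical header lines, for illustration only
sample_header = [
    "@HD\tVN:1.3\tSO:coordinate",
    "@RG\tID:rg1\tLB:lib1\tPL:ILLUMINA",
    "@RG\tID:rg2\tPL:ILLUMINA",
]

for rg_id, library in read_group_libraries(sample_header).items():
    status = library if library is not None else "NO LB TAG"
    print(rg_id, status)
```

A read group reported without an LB tag would be a candidate for the "missing libraries" problem; the header text itself can be taken from whatever tool you normally use to print BAM headers.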

crazyhottommy commented 7 years ago

./svtyper -B my_realigned.bam -wl my.diagnostic.json

Warning: VCF not found.

Calculating library statistics...
Error: failed to build insert size histogram for paired-end reads.
Please ensure BAM file (my_realigned.bam) has inward facing, paired-end reads.

Header of the BAM:

@HD     VN:1.3  SO:coordinate
@SQ     SN:1    LN:249250621
@SQ     SN:2    LN:243199373
@SQ     SN:3    LN:198022430
@SQ     SN:4    LN:191154276
@SQ     SN:5    LN:180915260
@SQ     SN:6    LN:171115067
@SQ     SN:7    LN:159138663
@SQ     SN:8    LN:146364022
@SQ     SN:9    LN:141213431
@SQ     SN:10   LN:135534747
@SQ     SN:11   LN:135006516
@SQ     SN:12   LN:133851895
@SQ     SN:13   LN:115169878
@SQ     SN:14   LN:107349540
@SQ     SN:15   LN:102531392
@SQ     SN:16   LN:90354753
@SQ     SN:17   LN:81195210
@SQ     SN:18   LN:78077248
@SQ     SN:19   LN:59128983
@SQ     SN:20   LN:63025520
@SQ     SN:21   LN:48129895
@SQ     SN:22   LN:51304566
@SQ     SN:X    LN:155270560
@SQ     SN:Y    LN:59373566
@SQ     SN:MT   LN:16569
@SQ     SN:GL000207.1   LN:4262
@SQ     SN:GL000226.1   LN:15008
@SQ     SN:GL000229.1   LN:19913
@SQ     SN:GL000231.1   LN:27386
@SQ     SN:GL000210.1   LN:27682
@SQ     SN:GL000239.1   LN:33824
@SQ     SN:GL000235.1   LN:34474
@SQ     SN:GL000201.1   LN:36148
@SQ     SN:GL000247.1   LN:36422
@SQ     SN:GL000245.1   LN:36651
@SQ     SN:GL000197.1   LN:37175
@SQ     SN:GL000203.1   LN:37498
@SQ     SN:GL000246.1   LN:38154
@SQ     SN:GL000249.1   LN:38502
@SQ     SN:GL000196.1   LN:38914
@SQ     SN:GL000248.1   LN:39786
@SQ     SN:GL000244.1   LN:39929
@SQ     SN:GL000238.1   LN:39939
@SQ     SN:GL000202.1   LN:40103
@SQ     SN:GL000234.1   LN:40531
@SQ     SN:GL000232.1   LN:40652
@SQ     SN:GL000206.1   LN:41001
@SQ     SN:GL000240.1   LN:41933
@SQ     SN:GL000236.1   LN:41934
@SQ     SN:GL000241.1   LN:42152
@SQ     SN:GL000243.1   LN:43341
@SQ     SN:GL000242.1   LN:43523
@SQ     SN:GL000230.1   LN:43691
@SQ     SN:GL000237.1   LN:45867
@SQ     SN:GL000233.1   LN:45941
@SQ     SN:GL000204.1   LN:81310
@SQ     SN:GL000198.1   LN:90085
@SQ     SN:GL000208.1   LN:92689
@SQ     SN:GL000191.1   LN:106433
@SQ     SN:GL000227.1   LN:128374
@SQ     SN:GL000228.1   LN:129120
@SQ     SN:GL000214.1   LN:137718
@SQ     SN:GL000221.1   LN:155397
@SQ     SN:GL000209.1   LN:159169
@SQ     SN:GL000218.1   LN:161147
@SQ     SN:GL000220.1   LN:161802
@SQ     SN:GL000213.1   LN:164239
@SQ     SN:GL000211.1   LN:166566
@SQ     SN:GL000199.1   LN:169874
@SQ     SN:GL000217.1   LN:172149
@SQ     SN:GL000216.1   LN:172294
@SQ     SN:GL000215.1   LN:172545
@SQ     SN:GL000205.1   LN:174588
@SQ     SN:GL000219.1   LN:179198
@SQ     SN:GL000224.1   LN:179693
@SQ     SN:GL000223.1   LN:180455
@SQ     SN:GL000195.1   LN:182896
@SQ     SN:GL000212.1   LN:186858
@SQ     SN:GL000222.1   LN:186861
@SQ     SN:GL000200.1   LN:187035
@SQ     SN:GL000193.1   LN:189789
@SQ     SN:GL000194.1   LN:191469
@SQ     SN:GL000225.1   LN:211173
@SQ     SN:GL000192.1   LN:547496
@RG     ID:140517_SN1222_0252_BC3KYPACXX_5_CGACTGGA     CN:GCC_02       LB:mylib     PL:Illumina_HiSeq200
@PG     ID:bwa  PN:bwa  CL:/scratch/genomic_med/apps/spseq/speedseq//bin/bwa mem -t 12 -C -p /risapps/reference/bwa-indexed/human_g1
@PG     ID:SAMBLASTER   CL:samblaster -i stdin -o stdout --excludeDups --addMateTags -d my_tmp_realn/disc_pipe

Thanks!

cc2qe commented 7 years ago

Thanks, Ming. In the output of the first command, SVTyper reports that it cannot find any paired-end, inward-facing reads in your BAM file. Have a look at the SAM flags in your BAM file and make sure it was aligned properly.
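
For anyone who wants to do that check programmatically, here is a rough Python sketch that tallies SAM flags from SAM-format text lines. It is not how SVTyper or LUMPY decide this internally; the function name `summarize_flags`, the counting rules, and the sample records are assumptions made for illustration. The flag bit values themselves come from the SAM format (0x1 paired, 0x4 unmapped, 0x8 mate unmapped, 0x10 reverse strand, 0x20 mate reverse strand, 0x100 secondary, 0x800 supplementary).

```python
# SAM flag bits used below
PAIRED, UNMAPPED, MATE_UNMAPPED = 0x1, 0x4, 0x8
REVERSE, MATE_REVERSE = 0x10, 0x20
SECONDARY, SUPPLEMENTARY = 0x100, 0x800


def summarize_flags(sam_lines):
    """Count paired, fully mapped, and inward-facing records in SAM text."""
    counts = {"total": 0, "paired": 0, "both_mapped": 0, "inward": 0}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
        rnext, pnext = fields[6], int(fields[7])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # only primary alignments are counted
        counts["total"] += 1
        if not flag & PAIRED:
            continue
        counts["paired"] += 1
        if flag & (UNMAPPED | MATE_UNMAPPED):
            continue
        counts["both_mapped"] += 1
        same_chrom = rnext == "=" or rnext == rname
        read_rev = bool(flag & REVERSE)
        mate_rev = bool(flag & MATE_REVERSE)
        if same_chrom and read_rev != mate_rev:
            # Inward facing: the forward-strand read starts at or before
            # the reverse-strand read.
            fwd_pos, rev_pos = (pnext, pos) if read_rev else (pos, pnext)
            if fwd_pos <= rev_pos:
                counts["inward"] += 1
    return counts


# Hypothetical records, for illustration only
sample = [
    "@HD\tVN:1.3\tSO:coordinate",
    "r1\t99\t1\t100\t60\t51M\t=\t300\t251\t*\t*",
    "r1\t147\t1\t300\t60\t51M\t=\t100\t-251\t*\t*",
    "r2\t0\t1\t500\t60\t51M\t*\t0\t0\t*\t*",
]
print(summarize_flags(sample))
```

If `paired` or `inward` comes out at or near zero on a sample of records from the real BAM (for example, the text output of `samtools view`), that would be consistent with the error SVTyper printed.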