katholt / srst2

Short Read Sequence Typing for Bacterial Pathogens

Error #78

Closed lskatz closed 7 years ago

lskatz commented 7 years ago

Hi, I am getting an error using SRST2 v0.2.0 but am unclear on the cause. Is the issues page the appropriate way to ask for help on SRST2?

(
  mkdir -p mlst_databases/cjejuni && \
  cd mlst_databases/cjejuni && getmlst.py --species "Campylobacter jejuni" && \
  bowtie2-build Campylobacter_jejuni.fasta Campylobacter_jejuni.fasta && \
  samtools faidx Campylobacter_jejuni.fasta
)
find ../datasets/Campylobacter_jejuni_0810PADBR-1.xlsx -name '*_1.fastq.gz' | \
xargs -P 42 -n 1 bash -c '
  b=$(basename $0 _1.fastq.gz); 
  d=$(dirname $0); 
  out=`srst2 --input_pe $d/$b*.fastq.gz --mlst_db mlst_databases/cjejuni/Campylobacter_jejuni.fasta --mlst_definitions mlst_databases/cjejuni/campylobacter.txt --threads 5 --output campy.out/$b --mlst_delimiter _ 2>&1`; 
  echo "$out";
'

This is the output from one of the samples. Some samples fail with this error and others pass. This is the sample behind this error: https://www.ncbi.nlm.nih.gov/sra/?term=PNUSAC000194

03/06/2017 11:32:03 program started
03/06/2017 11:32:03 command line: /scicomp/home/gzu2/.local/bin/srst2 --input_pe ../datasets/Campylobacter_jejuni_0810PADBR-1.xlsx/PNUSA000194_1.fastq.gz ../datasets/Campylobacter_jejuni_0810PADBR-1.xlsx/PNUSA000194_2.fastq.gz --mlst_db mlst_databases/cjejuni/Campylobacter_jejuni.fasta --mlst_definitions mlst_databases/cjejuni/campylobacter.txt --threads 5 --output campy.out/PNUSA000194 --mlst_delimiter _
03/06/2017 11:32:03 Total paired readsets found:1
03/06/2017 11:32:03 Index for mlst_databases/cjejuni/Campylobacter_jejuni.fasta is already built...
03/06/2017 11:32:03 Processing database mlst_databases/cjejuni/Campylobacter_jejuni.fasta
03/06/2017 11:32:05 Processing sample PNUSA000194
03/06/2017 11:32:05 Starting mapping with bowtie2
03/06/2017 11:32:05 Output prefix set to: campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni
03/06/2017 11:32:06 Aligning reads to index mlst_databases/cjejuni/Campylobacter_jejuni.fasta using bowtie2...
03/06/2017 11:32:06 Running: bowtie2 -1 ../datasets/Campylobacter_jejuni_0810PADBR-1.xlsx/PNUSA000194_1.fastq.gz -2 ../datasets/Campylobacter_jejuni_0810PADBR-1.xlsx/PNUSA000194_2.fastq.gz -S campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni.sam -q --very-sensitive-local --no-unal -a -x mlst_databases/cjejuni/Campylobacter_jejuni.fasta --threads 5
416076 reads; of these:
  416076 (100.00%) were paired; of these:
    414522 (99.63%) aligned concordantly 0 times
    2 (0.00%) aligned concordantly exactly 1 time
    1552 (0.37%) aligned concordantly >1 times
    ----
    414522 pairs aligned concordantly 0 times; of these:
      0 (0.00%) aligned discordantly 1 time
    ----
    414522 pairs aligned 0 times concordantly or discordantly; of these:
      829044 mates make up the pairs; of these:
        828490 (99.93%) aligned 0 times
        0 (0.00%) aligned exactly 1 time
        554 (0.07%) aligned >1 times
0.44% overall alignment rate
03/06/2017 11:50:38 Processing Bowtie2 output with SAMtools...
03/06/2017 11:50:38 Generate and sort BAM file...
03/06/2017 11:50:38 Running: samtools view -@ 5 -b -o campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni.unsorted.bam -q 1 -S campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni.sam.mod
03/06/2017 11:50:50 Running: samtools sort -@ 5 -o campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni.sorted.bam -O bam -T campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni.sort_temp campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni.unsorted.bam
03/06/2017 11:51:09 Deleting sam and bam files that are not longer needed...
03/06/2017 11:51:09 Deleting campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni.sam
03/06/2017 11:51:10 Deleting campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni.sam.mod
03/06/2017 11:51:10 Deleting campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni.unsorted.bam
03/06/2017 11:51:10 Generate pileup...
03/06/2017 11:51:10 Running: samtools mpileup -L 1000 -f mlst_databases/cjejuni/Campylobacter_jejuni.fasta -Q 20 -q 1 -B campy.out/PNUSA000194__PNUSA000194.Campylobacter_jejuni.sorted.bam
[mpileup] 1 samples in 1 input files
<mpileup> Set max per-file depth to 8000
03/06/2017 11:51:33  Processing SAMtools pileup...
03/06/2017 11:52:29  Scoring alleles...
/scicomp/home/gzu2/.local/lib/python2.7/site-packages/scipy/stats/_discrete_distns.py:57: RuntimeWarning: floating point number truncated to an integer
  vals = special.bdtr(k, n, p)
Attempting to read 7 loci from ST database mlst_databases/cjejuni/campylobacter.txt
Read ST database mlst_databases/cjejuni/campylobacter.txt successfully
Traceback (most recent call last):
  File "/scicomp/home/gzu2/.local/bin/srst2", line 9, in <module>
    load_entry_point('srst2==0.2.0', 'console_scripts', 'srst2')()
  File "/scicomp/home/gzu2/.local/lib/python2.7/site-packages/srst2/srst2.py", line 1717, in main
    mlst_report, mlst_results = run_srst2(args, fileSets, args.mlst_db, "mlst")
  File "/scicomp/home/gzu2/.local/lib/python2.7/site-packages/srst2/srst2.py", line 1264, in run_srst2
    db_results_list, fasta)
  File "/scicomp/home/gzu2/.local/lib/python2.7/site-packages/srst2/srst2.py", line 1327, in process_fasta_db
    results,gene_list, db_report, cluster_symbols, max_mismatch)
  File "/scicomp/home/gzu2/.local/lib/python2.7/site-packages/srst2/srst2.py", line 1429, in map_fileSet_to_db
    size_allele, next_to_del_depth_allele, run_type,unique_gene_symbols, unique_allele_symbols)
  File "/scicomp/home/gzu2/.local/lib/python2.7/site-packages/srst2/srst2.py", line 559, in score_alleles
    slope, _intercept, _r_value, _p_value, _std_err = linregress(exp_pvals2, pvals)
  File "/scicomp/home/gzu2/.local/lib/python2.7/site-packages/scipy/stats/_stats_mstats_common.py", line 72, in linregress
    raise ValueError("Inputs must not be empty.")
ValueError: Inputs must not be empty.

I also always get this warning: RuntimeWarning: floating point number truncated to an integer (reported at vals = special.bdtr(k, n, p)).

lskatz commented 7 years ago

After looking at #74, I updated my PATH to prioritize samtools v0.1.18, and that might have fixed it.
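
For anyone trying the same workaround, here is a minimal sketch of putting an older samtools first on PATH. The install directory and the SAMTOOLS_0118_DIR variable name are hypothetical and not from the report above; point it at wherever samtools 0.1.18 actually lives on your system.

```shell
# Hypothetical install location for samtools 0.1.18 (an assumption, adjust as needed).
SAMTOOLS_0118_DIR="${SAMTOOLS_0118_DIR:-/opt/samtools-0.1.18}"

# Prepend the directory so its samtools is found before any newer install.
PATH="$SAMTOOLS_0118_DIR:$PATH"
export PATH

# Report which samtools the shell now resolves, if any.
resolved_samtools=$(command -v samtools || true)
if [ -n "$resolved_samtools" ]; then
  echo "samtools resolves to: $resolved_samtools"
  # Run with no arguments, samtools prints usage text that includes a Version line.
  samtools 2>&1 | grep -i 'version' || true
else
  echo "samtools not found on PATH"
fi
```

Run this in the same shell session (or add the PATH lines to your shell profile) before calling srst2, so that srst2 picks up the older samtools.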