Input: Streptococcus pyogenes reads (a negative control in our re-validation dataset).
Error:
Error executing process > 'results'
Caused by:
Process `results` terminated with an error exit status (1)
Command executed:
#!/usr/bin/env python3
import os, sys
import glob, csv
class result_values:
def __init__(self,id):
self.id = id
self.comments = []
self.coverage = "NotRun"
self.quality = "NotRun"
self.pass_cov_quality = False
self.percent_strep = "NotRun"
self.percent_spn = "NotRun"
self.secondgenus = "NotRun"
self.percent_secondgenus = "NotRun"
self.pass_kraken = False
self.pred = "NotRun"
#get list of result files
cg_result_list = glob.glob("*_qual.tsv")
kraken_list = glob.glob("*_kraken.txt")
seroba_list = glob.glob("*_pred.tsv")
results = {}
#collect all cg_pipeline results
for file in cg_result_list:
id = file.split("_qual")[0]
result = result_values(id)
with open(file,'r') as csvfile:
dialect = csv.Sniffer().sniff(csvfile.read(1024))
csvfile.seek(0)
reader = csv.reader(csvfile,dialect)
for row in reader:
if "File" not in row:
result.coverage = float(row[8])
result.quality = float(row[5])
if result.coverage > 70.0 and result.quality > 30.0:
result.pass_cov_quality = True
if result.coverage <= 70.0:
result.comments.append("Lower than 70x coverage")
if result.quality <= 30.0:
result.comments.append("Lower than 30.0 average base quality")
results[id] = result
#collect all kraken results
for file in kraken_list:
id = file.split("_kraken")[0]
result = results[id]
with open(file,'r') as csvfile:
dialect = csv.Sniffer().sniff(csvfile.read(1024))
csvfile.seek(0)
reader = csv.reader(csvfile,dialect)
secondgenus = ""
percent_secondgenus = 0.0
for row in reader:
if row[4] == "1301":
result.percent_strep = float(row[0])
continue
if row[4] == "1313":
result.percent_spn = float(row[0])
continue
if row[3] == "G" and float(row[0]) > percent_secondgenus:
secondgenus = row[5].strip()
percent_secondgenus = float(row[0])
result.secondgenus = secondgenus
result.percent_secondgenus = percent_secondgenus
if result.percent_strep >= 82.0 and result.percent_spn >= 62.0 and result.percent_secondgenus < 1.0:
result.pass_kraken = True
if result.percent_strep < 82.0:
result.comments.append("Less than 82.0% of reads are Strep")
if result.percent_spn < 62.0:
result.comments.append("Less than 62.0% of reads are SPN")
if result.percent_secondgenus >= 1.0:
result.comments.append("More than 1.0% of reads are from "+secondgenus)
results[id] = result
#collect all seroba results
for file in seroba_list:
id = file.split("_pred")[0]
result = results[id]
with open(file,'r') as csvfile:
dialect = csv.Sniffer().sniff(csvfile.read(1024))
csvfile.seek(0)
reader = csv.reader(csvfile,dialect)
types = []
for row in reader:
types.append(row[1])
try:
result.comments.append(row[2])
except IndexError:
pass
result.pred = " ".join(types)
results[id] = result
#create output file
with open("SPNtypeID_results.csv",'w') as csvout:
writer = csv.writer(csvout,delimiter=',')
writer.writerow(["ID","Coverage","Avg Quality","Pass Cov/Qual","Percent Strep","Percent SPN","SecondGenus","Percent SecondGenus","Pass Kraken","Serotype","Comments"])
for id in results:
result = results[id]
comments = "; ".join(result.comments)
writer.writerow([result.id,result.coverage,result.quality,result.pass_cov_quality,result.percent_strep,result.percent_spn,result.secondgenus,result.percent_secondgenus,result.pass_kraken,result.pred,comments])
Command exit status:
1
Command output:
(empty)
Command error:
Traceback (most recent call last):
File ".command.sh", line 69, in <module>
if result.percent_strep >= 82.0 and result.percent_spn >= 62.0 and result.percent_secondgenus < 1.0:
TypeError: '>=' not supported between instances of 'str' and 'float'
Work dir:
/panfs/roc/groups/7/mdh/shared/software_modules/spntypeid/beta_201113/test2/work/e4/b6aeb5716f66b6cc7d9c81d9818c97
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
Cause:
For this sample there are no reads assigned to the SPN taxid (1313) so result.percent_spn remains set to NotRun, this causes a type error when it is compared to 62.0.
You can see a possible solution on my fork, basically just setting all the numeric values to 0.0 instead of NotRun, which so far has worked to fix this bug.
Input: Streptococcus pyogenes reads (a negative control in our re-validation dataset).
Error:
Cause: For this sample there are no reads assigned to the SPN taxid (1313) so result.percent_spn remains set to
NotRun
, this causes a type error when it is compared to 62.0.https://github.com/k-florek/SPNtypeID/blob/ec7d93219a8b11fefeff84eb73ff92cec8dc3a40/SPNtypeID.nf#L192
You can see a possible solution on my fork, basically just setting all the numeric values to
0.0
instead ofNotRun
, which so far has worked to fix this bug.https://github.com/garfinjm/SPNtypeID/blob/b4117cef54897bffa4445dddef7778c7e698aba5/SPNtypeID.nf#L132