wyang17 / SQuIRE

Software for Quantifying Interspersed Repeat Expression

create_subfamily_dict count assignment #20

Open yavorska opened 5 years ago

yavorska commented 5 years ago

I'm not an expert in Python, so forgive me if I'm wrong, but I've tried to work through Call.py to understand how the count tables used in the DESeq analysis are generated.

Within the create_subfamily_dict function, the header check reads line[6] (tot_counts), but the count actually stored in the dictionary comes from line[5], i.e. uniq_counts, rather than the total counts used elsewhere:

    def create_subfamily_dict(infilepath,count_dict):
        TE_classes=["LTR","LINE","SINE","Retroposon","DNA","RC"]
        with open(infilepath,'r') as infile:
            for line in infile:
                line = line.rstrip()
                line = line.split("\t")
                taxo = line[2]
                count=line[6]
                if any(x in taxo for x in TE_classes):
                    if count=="tot_counts":
                        continue
                    else:
                        count = str(int(round(float(line[5]))))
                        sample = line[0]
                        if taxo not in count_dict:
                            count_dict[taxo] = {sample:count}
                        else:
                            count_dict[taxo][sample]=count

Can you explain why this is the case?
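
To make the difference concrete, here is a minimal sketch of the same logic with the stored column made a parameter. This is my own illustration, not code from SQuIRE: build_counts, the sample rows, and the placeholder column names are hypothetical, and I'm assuming column 5 holds uniq_counts and column 6 holds tot_counts, as the snippet above suggests.

```python
def build_counts(rows, count_index):
    """Collect rounded counts per taxo and sample from tab-separated rows.

    count_index selects the stored column: 5 mirrors the snippet above,
    6 is the column whose header is tot_counts (assumed layout).
    """
    te_classes = ["LTR", "LINE", "SINE", "Retroposon", "DNA", "RC"]
    count_dict = {}
    for line in rows:
        fields = line.rstrip().split("\t")
        taxo = fields[2]
        if fields[6] == "tot_counts":  # skip the header row
            continue
        if not any(x in taxo for x in te_classes):
            continue  # keep only entries matching a TE class
        count = str(int(round(float(fields[count_index]))))
        count_dict.setdefault(taxo, {})[fields[0]] = count
    return count_dict


# Hypothetical rows: only columns 0, 2, 5 and 6 matter for this sketch.
rows = [
    "col0\tcol1\tcol2\tcol3\tcol4\tuniq_counts\ttot_counts",
    "s1\t.\tL1HS:L1:LINE\t.\t.\t10\t42.6",
    "s1\t.\tAluY:Alu:SINE\t.\t.\t3\t7.2",
    "s1\t.\tgeneX\t.\t.\t5\t5",
]

uniq_based = build_counts(rows, 5)   # what the current code stores
total_based = build_counts(rows, 6)  # what I expected it to store
print(uniq_based)
print(total_based)
```

With rows like these, the two dictionaries differ for every TE entry, so the choice of column would directly change the count table passed on to DESeq.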