gaarangoa / deeparg

A deep learning based approach to predict Antibiotic Resistance Genes (ARGs) from metagenomes. It provides two models, deepARG-SS and deepARG-LS.
MIT License

Is the ARG-category length correct when normalizing? #4

Closed wangyang1749 closed 9 months ago

wangyang1749 commented 10 months ago

Hi, I found that when ARG-category is normalized here, the length of ARG-category is always the length of the first gene.

try:
    Atype[gtype][0] += int(count)
except:
    Atype[gtype] = [int(count), geneLen]
Xtype[itype] = (Atype[itype][0]/float(Atype[itype][1])) / \
               (float(N16s)/L16s)
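
To make the behaviour concrete, here is a minimal, self-contained sketch. It is not the deepARG code itself: the function names, the tuple layout of `hits`, and the sample numbers are hypothetical and chosen only for illustration. The first function mirrors the quoted snippets (counts accumulate, the length stays at the first gene's value); the second shows one possible alternative, normalizing each gene by its own length before summing.

```python
def normalize_by_first_length(hits, n16s, l16s):
    """Mirror of the quoted logic: hits is a list of
    (category, read_count, gene_length) tuples (hypothetical layout)."""
    atype = {}
    for gtype, count, gene_len in hits:
        if gtype in atype:
            atype[gtype][0] += int(count)          # only the count accumulates
        else:
            atype[gtype] = [int(count), gene_len]  # length fixed by the first gene seen
    ref = float(n16s) / l16s                       # 16S reference coverage
    return {g: (c / float(length)) / ref for g, (c, length) in atype.items()}


def normalize_per_gene(hits, n16s, l16s):
    """Illustrative alternative: normalize each gene by its own length, then sum."""
    ref = float(n16s) / l16s
    out = {}
    for gtype, count, gene_len in hits:
        out[gtype] = out.get(gtype, 0.0) + (int(count) / float(gene_len)) / ref
    return out


# Two genes in the same category with different lengths (made-up numbers).
hits = [("beta-lactam", 10, 1000), ("beta-lactam", 10, 500)]
first = normalize_by_first_length(hits, n16s=100, l16s=1000)
per_gene = normalize_per_gene(hits, n16s=100, l16s=1000)
print(first["beta-lactam"], per_gene["beta-lactam"])
```

With these made-up inputs the two approaches give different category values, and the first one also changes if the order of the two genes is swapped, since the category length is taken from whichever gene is seen first.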

The sorting here is only based on gene names.

cmd = "sort -k1,1 -k2,2n "+inputFile + "  | bedtools merge -c 12,5 -o sum,distinct >"+inputFile+".merged"

gaarangoa commented 9 months ago

The normalization is done that way for simplicity. We just need a reference to normalize to.