Closed phingchian closed 2 months ago
Just to add, the *.depth.txt file is generated following the below example in the user guide:
seqkit fx2tab -l -n -i ../prokka_Wet2014/Wet2014.ffn | awk '{print $1"\t"$2}' > Wet2014.length
seqkit fx2tab -l -n -i ../prokka_Wet2014/Wet2014.ffn | awk '{print $1"\t"0"\t"$2}' > Wet2014.bed
bedtools coverage -g Wet2014.length -sorted -a Wet2014.bed -counts -b ../samfiles/Wet2014.CDS.bam > Wet2014.depth.txt
def ReadBedtoos(filename): lines = open(filename).readlines() seqid2info = {line.split()[0]:bedtools_read_count(line.split()) for line in lines[1:]} normalized_tpm = 0. for seqid in seqid2info: seqid_depth = seqid2info[seqid] normalized_tpm += seqid_depth.read_count/seqid_depth.length return seqid2info,normalized_tpm
The *.depth.txt generated has no header - slicing lines[1:] will omit the first entry. Suppose the first entry is a valid CAZyme, it will fail the check at get_length_readcount() and exit prematurely. Kindly check.
Thank you for identifying this bug. Regardless of whether the first line is a CAZyme or not, it should be counted in the total count of mapped reads.
The *.depth.txt generated has no header - slicing lines[1:] will omit the first entry. Suppose the first entry is a valid CAZyme, it will fail the check at get_length_readcount() and exit prematurely. Kindly check.