jiwoongbio / FMAP

Functional Mapping and Analysis Pipeline for metagenomics and metatranscriptomics studies

Abundances #3

Open davidvilanova opened 7 years ago

davidvilanova commented 7 years ago

Hi, when computing abundances I see strange behaviour (see the example below). Each file is processed individually, and then the two are processed together.

I would expect the combined count to be the sum of the two files, 11 + 7 = 18 genes, but it returns 17. I am not sure whether an extra filter is applied beyond the 80% identity cutoff.

perl FMAP_quantification.pl 36.uniprot.blast.txt  | grep K00001
K00001  E1.1.1.1, adh; alcohol dehydrogenase [EC:1.1.1.1]       7     997.386527521675

perl FMAP_quantification.pl 35.uniprot.blast.txt  | grep K00001
K00001  E1.1.1.1, adh; alcohol dehydrogenase [EC:1.1.1.1]       11      416.346777952545

perl FMAP_quantification.pl 35.uniprot.blast.txt  36.uniprot.blast.txt | grep K00001
K00001  E1.1.1.1, adh; alcohol dehydrogenase [EC:1.1.1.1]       17      537.487924074303

jiwoongbio commented 7 years ago

For paired-end reads, the RPKM calculation is effectively FPKM: it counts read names rather than individual reads. I think there are K00001-mapped reads with an identical name in both 35.uniprot.blast.txt and 36.uniprot.blast.txt, which would be counted only once when the two files are given together. The recent update to FMAP fixes this problem by distinguishing reads from separate input files. Please try the latest version.
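To illustrate the effect, here is a minimal sketch in Python. It is not FMAP's actual code, and the read names and the `hits_by_file` structure are hypothetical toy data chosen to reproduce the 11 / 7 / 17 pattern above. It assumes that exactly one read name occurs in both files.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data: names of reads mapped to K00001 in each input file.
# One name ("read1") is assumed to occur in both files.
hits_by_file: Dict[str, List[str]] = {
    "35.uniprot.blast.txt": [f"read{i}" for i in range(1, 12)],              # 11 reads
    "36.uniprot.blast.txt": ["read1"] + [f"other{i}" for i in range(1, 7)],  # 7 reads
}


def count_by_name(hits: Dict[str, List[str]]) -> int:
    """Count distinct read names pooled over all files.

    Identical names in different files collapse into a single entry.
    """
    names: Set[str] = set()
    for reads in hits.values():
        names.update(reads)
    return len(names)


def count_by_file_and_name(hits: Dict[str, List[str]]) -> int:
    """Count distinct (file, read name) pairs.

    Identical names in different files remain separate entries.
    """
    keys: Set[Tuple[str, str]] = set()
    for filename, reads in hits.items():
        keys.update((filename, read) for read in reads)
    return len(keys)


print(count_by_name(hits_by_file))           # 17
print(count_by_file_and_name(hits_by_file))  # 18
```

Keying on the file as well as the read name is one simple way to keep reads from separate inputs distinct. The actual fix in FMAP may be implemented differently.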

davidvilanova commented 7 years ago

Ok great. I have updated. Fixed !!!

zhanxw commented 7 years ago

Thanks @jiwoongbio