tseemann / barrnap

:microscope: :leo: Bacterial ribosomal RNA predictor
GNU General Public License v3.0

archaea and bacteria 16S duplicate #35

Open chloelulu opened 5 years ago

chloelulu commented 5 years ago

Hi developer, thanks for creating such efficient software. I have used it to find 16S rRNA hits in my de novo assembled genome bins. I want to search for both archaea and bacteria, so I ran it separately with -k bac and -k arc. However, the results are confusing. For example, in one bin it found two 16S hits for archaea and also two hits for bacteria.

The headers of the hits in the bacteria output are:

>16S_rRNA::NODE_2_length_100533_cov_5.789665:250-1687(-)
>16S_rRNA::NODE_8_length_10807_cov_5.393508:10362-10807(-)

The headers of the hits in the archaea output are:

>16S_rRNA::NODE_2_length_100533_cov_5.789665:251-1678(-)
>16S_rRNA::NODE_8_length_10807_cov_5.393508:10363-10803(-)

I then ran both FASTA outputs through the RDP classifier. The results for the archaea hits are:

16S_rRNA::NODE_2_length_100533_cov_5.789665:251-1678(-);+;Bacteria;100%;"Bacteroidetes";98%;"Bacteroidia";96%;"Bacteroidales";96%;"Rikenellaceae";38%;Mucinivorans;33%
16S_rRNA::NODE_8_length_10807_cov_5.393508:10363-10803(-);+;Bacteria;99%;Firmicutes;70%;Clostridia;61%;Clostridiales;61%;Ruminococcaceae;43%;Hydrogenoanaerobacterium;14%

The results for the bacteria hits are:

16S_rRNA::NODE_2_length_100533_cov_5.789665:250-1687(-);+;Bacteria;100%;"Bacteroidetes";98%;"Bacteroidia";94%;"Bacteroidales";94%;"Rikenellaceae";34%;Mucinivorans;24%
16S_rRNA::NODE_8_length_10807_cov_5.393508:10362-10807(-);+;Bacteria;99%;Firmicutes;78%;Clostridia;53%;Clostridiales;53%;Ruminococcaceae;40%;Hydrogenoanaerobacterium;14%

So my questions are:

(1) The bacteria and archaea results are the same: both sets are classified as Bacteria. Why are the hits split into two sets, bacteria and archaea?
(2) The two hits come from one genome bin. Why are two 16S genes predicted, and why do they have different taxonomic classifications?

Thanks so much for your patience! Best.

tseemann commented 5 years ago

Yes, both bacteria and archaea share a 16S model from RFAM.

NAME  16S_rRNA
ACC   RF00177

Barrnap is designed for bacterial isolates. It was not designed to predict kingdom of MAGs.

I do not know why "And I blast both fasta hits to RDP classifier" gives different answers for the identical (?) sequences.
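Whether the two runs really returned identical sequences can be checked from the FASTA headers quoted in the question. The following is a minimal Python sketch, not part of barrnap: the `Hit` type and the helpers `parse_header` and `overlap_fraction` are hypothetical names introduced here, and it assumes the headers follow the `gene::contig:start-end(strand)` pattern seen above and that the two coordinates can be treated as a half-open interval.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    """One predicted rRNA hit, as encoded in a FASTA header (hypothetical type)."""
    gene: str
    contig: str
    start: int
    end: int
    strand: str


# Assumed header layout: >gene::contig:start-end(strand)
HEADER_RE = re.compile(
    r"^>?(?P<gene>[^:]+)::(?P<contig>.+):(?P<start>\d+)-(?P<end>\d+)\((?P<strand>[+-])\)$"
)


def parse_header(header: str) -> Hit:
    """Split a header such as '>16S_rRNA::NODE_2_...:250-1687(-)' into its fields."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError(f"unrecognised header: {header!r}")
    return Hit(
        match["gene"],
        match["contig"],
        int(match["start"]),
        int(match["end"]),
        match["strand"],
    )


def overlap_fraction(a: Hit, b: Hit) -> float:
    """Shared span divided by combined span; 0.0 if contig or strand differ."""
    if a.contig != b.contig or a.strand != b.strand:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    combined = max(a.end, b.end) - min(a.start, b.start)
    return shared / combined


# Headers copied from the question: bacteria run vs. archaea run.
bac = parse_header(">16S_rRNA::NODE_2_length_100533_cov_5.789665:250-1687(-)")
arc = parse_header(">16S_rRNA::NODE_2_length_100533_cov_5.789665:251-1678(-)")

print(bac == arc)                  # are the coordinates exactly the same?
print(overlap_fraction(bac, arc))  # how much of the combined span is shared?
```

For the headers in this thread, the two runs report the same contig and strand with slightly different boundaries, so the extracted sequences overlap almost completely but are not byte-for-byte identical. That may be enough to shift the RDP confidence values slightly.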