tseemann / barrnap

:microscope: :leo: Bacterial ribosomal RNA predictor
GNU General Public License v3.0

Overlapping 5.8S and 28S annotations in Eukaryotic rDNAs #47

Open BartvdVossenberg opened 3 years ago

BartvdVossenberg commented 3 years ago

Dear Torsten Seemann,

I am an enthusiastic user of the barrnap tool. I work with both plant-parasitic prokaryotes and eukaryotes. I noticed that when annotating eukaryotic rDNA sequences, the 18S and 5.8S predictions are accurate, but the 28S gene is consistently predicted to start just before the 5.8S gene (see image attached). As a result, the ITS2 and the correct start of the 28S gene have to be determined manually. I can imagine the prediction is difficult, as the starts of the 5.8S and 28S genes have high sequence similarity and share some key conserved sequences.

Is there some way to overcome this inconvenience?

Hope to hear from you. Best wishes, Bart

[Attached image: prediction 28S]

ramiroricardo commented 1 year ago

Hi all,

we are having a similar issue. Did you ever come up with a solution to this?

Thanks,

Ramiro

BartvdVossenberg commented 1 year ago

No, unfortunately we have not.

Best wishes Bart


gbdias commented 4 months ago

Hi @BartvdVossenberg, I am curious how you determine the correct start of the 28S as well as the correct end of the 5.8S.

moshi4 commented 4 months ago

I recently developed pybarrnap, which reimplements the equivalent functionality of barrnap in Python and updates the Rfam model profiles to the latest version. With pybarrnap, the eukaryotic 5.8S and 28S predictions do not overlap, so I think the older model profiles in barrnap are the reason it cannot accurately distinguish between 5.8S and 28S.

Example (barrnap vs pybarrnap)

Saccharomyces_cerevisiae.fna.gz

$ barrnap Saccharomyces_cerevisiae.fna -k euk -q # Overlap 5.8S and 28S
##gff-version 3
NC_001144.5 barrnap:0.9 rRNA    451796  455574  0   -   .   Name=28S_rRNA;product=28S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    455418  455569  1.6e-36 -   .   Name=5_8S_rRNA;product=5.8S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    455934  457731  0   -   .   Name=18S_rRNA;product=18S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    459678  459795  3.1e-13 +   .   Name=5S_rRNA;product=5S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    460933  464711  0   -   .   Name=28S_rRNA;product=28S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    464555  464706  1.6e-36 -   .   Name=5_8S_rRNA;product=5.8S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    465071  466868  0   -   .   Name=18S_rRNA;product=18S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    468815  468930  5.4e-13 +   .   Name=5S_rRNA;product=5S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    472467  472582  5.4e-13 +   .   Name=5S_rRNA;product=5S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    482047  482162  5.4e-13 +   .   Name=5S_rRNA;product=5S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    485699  485814  5.4e-13 +   .   Name=5S_rRNA;product=5S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    489351  489468  3.1e-13 +   .   Name=5S_rRNA;product=5S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA    489766  490546  5.2e-189    -   .   Name=28S_rRNA;product=28S ribosomal RNA (partial);note=aligned only 26 percent of the 28S ribosomal RNA

$ pybarrnap Saccharomyces_cerevisiae.fna -k euk -q # No overlap 5.8S and 28S
##gff-version 3
NC_001144.5 pybarrnap:0.5.0 rRNA    451787  455177  0   -   .   Name=28S_rRNA;product=28S ribosomal RNA;Dbxref=RFAM:RF02543
NC_001144.5 pybarrnap:0.5.0 rRNA    455418  455570  8.0e-45 -   .   Name=5_8S_rRNA;product=5.8S ribosomal RNA;Dbxref=RFAM:RF00002
NC_001144.5 pybarrnap:0.5.0 rRNA    455934  457731  0   -   .   Name=18S_rRNA;product=18S ribosomal RNA;Dbxref=RFAM:RF01960
NC_001144.5 pybarrnap:0.5.0 rRNA    459678  459794  5.5e-16 +   .   Name=5S_rRNA;product=5S ribosomal RNA;Dbxref=RFAM:RF00001
NC_001144.5 pybarrnap:0.5.0 rRNA    460924  464314  0   -   .   Name=28S_rRNA;product=28S ribosomal RNA;Dbxref=RFAM:RF02543
NC_001144.5 pybarrnap:0.5.0 rRNA    464555  464707  8.0e-45 -   .   Name=5_8S_rRNA;product=5.8S ribosomal RNA;Dbxref=RFAM:RF00002
NC_001144.5 pybarrnap:0.5.0 rRNA    465071  466868  0   -   .   Name=18S_rRNA;product=18S ribosomal RNA;Dbxref=RFAM:RF01960
NC_001144.5 pybarrnap:0.5.0 rRNA    468815  468930  9.5e-16 +   .   Name=5S_rRNA;product=5S ribosomal RNA;Dbxref=RFAM:RF00001
NC_001144.5 pybarrnap:0.5.0 rRNA    472467  472582  9.5e-16 +   .   Name=5S_rRNA;product=5S ribosomal RNA;Dbxref=RFAM:RF00001
NC_001144.5 pybarrnap:0.5.0 rRNA    482047  482162  9.5e-16 +   .   Name=5S_rRNA;product=5S ribosomal RNA;Dbxref=RFAM:RF00001
NC_001144.5 pybarrnap:0.5.0 rRNA    485699  485814  9.5e-16 +   .   Name=5S_rRNA;product=5S ribosomal RNA;Dbxref=RFAM:RF00001
NC_001144.5 pybarrnap:0.5.0 rRNA    489351  489467  5.5e-16 +   .   Name=5S_rRNA;product=5S ribosomal RNA;Dbxref=RFAM:RF00001
NC_001144.5 pybarrnap:0.5.0 rRNA    489755  490547  1.7e-240    -   .   Name=28S_rRNA;product=28S ribosomal RNA (partial);Dbxref=RFAM:RF02543;Note=aligned only 27.23 percent of the 28S ribosomal RNA
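
For anyone who wants to check their own output for this problem, here is a minimal sketch of how the overlap could be detected from the GFF3 text. It is not part of barrnap or pybarrnap: the names `Feature`, `parse_gff` and `find_overlaps` are made up for illustration, and the sample rows are copied from the two outputs above. It assumes the first eight columns contain no internal whitespace, so it works on tab-separated files as well as the space-separated text pasted in this thread.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    seqid: str
    start: int
    end: int
    strand: str
    name: str


def parse_gff(text: str) -> list[Feature]:
    """Parse features from GFF3 text (hypothetical helper, not a barrnap API)."""
    features = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split off the first 8 columns only, so spaces inside the
        # attributes column (e.g. "product=28S ribosomal RNA") survive.
        cols = line.split(None, 8)
        if len(cols) < 9:
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        features.append(
            Feature(cols[0], int(cols[3]), int(cols[4]), cols[6], attrs.get("Name", ""))
        )
    return features


def find_overlaps(features, a="5_8S_rRNA", b="28S_rRNA"):
    """Return (a_feature, b_feature, overlap_bp) for pairs that overlap
    on the same sequence and strand. GFF3 coordinates are 1-based, inclusive."""
    hits = []
    for fa in features:
        if fa.name != a:
            continue
        for fb in features:
            if fb.name != b or fb.seqid != fa.seqid or fb.strand != fa.strand:
                continue
            overlap = min(fa.end, fb.end) - max(fa.start, fb.start) + 1
            if overlap > 0:
                hits.append((fa, fb, overlap))
    return hits


# Rows copied from the barrnap and pybarrnap outputs in this thread.
BARRNAP_SAMPLE = """\
##gff-version 3
NC_001144.5 barrnap:0.9 rRNA 451796 455574 0 - . Name=28S_rRNA;product=28S ribosomal RNA
NC_001144.5 barrnap:0.9 rRNA 455418 455569 1.6e-36 - . Name=5_8S_rRNA;product=5.8S ribosomal RNA
"""

PYBARRNAP_SAMPLE = """\
##gff-version 3
NC_001144.5 pybarrnap:0.5.0 rRNA 451787 455177 0 - . Name=28S_rRNA;product=28S ribosomal RNA;Dbxref=RFAM:RF02543
NC_001144.5 pybarrnap:0.5.0 rRNA 455418 455570 8.0e-45 - . Name=5_8S_rRNA;product=5.8S ribosomal RNA;Dbxref=RFAM:RF00002
"""

for label, sample in (("barrnap", BARRNAP_SAMPLE), ("pybarrnap", PYBARRNAP_SAMPLE)):
    for fa, fb, bp in find_overlaps(parse_gff(sample)):
        print(f"{label}: {fa.name} {fa.start}-{fa.end} overlaps "
              f"{fb.name} {fb.start}-{fb.end} by {bp} bp")
```

On these sample rows the barrnap 5.8S hit (455418-455569) lies entirely inside the 28S hit, so the reported overlap is 152 bp, while the pybarrnap rows produce no overlap. Note that this only flags the overlap. It does not tell you where the true 28S start is.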