Open Jeepee8820 opened 4 years ago
Bacterial genomes should not have mRNA features anyway; they only use gene and CDS.
Don't use the --mrna switch.
The chance of a Prokka-annotated genome being accepted by ENA is close to 0%. They became very strict a few years back. The preferred process is to use PGAP and submit to NCBI, or just submit contigs to NCBI and tick :heavy_check_mark: to let them annotate with PGAP.
But you are right that those RNA features should not get an mRNA when using --mrna. This is a bug.
Hello,
First, I do not agree: prokaryotes have mRNA. If the mRNA feature is not present, processing the data with agat_sp_fix_features_locations_duplicated.pl to remove the duplicated locations will add the mRNA features.
Secondly, EMBLmyGFF3 is perfectly suitable for converting a Prokka annotation into an EMBL file, which is the file that can be submitted to ENA.
I second Juke34 here for two reasons:
TL;DR: not including mRNAs is bad for file format standardization across kingdoms and for delineating biological information.
Hi tseemann,
Thanks for the great tool. I am having trouble validating files for submission to ENA after conversion with EMBLmyGFF3, and it seems to be related to the output generated by Prokka. The misc_RNA, tRNA and rRNA records (perhaps others too) are also being assigned an mRNA at the exact same location, which generates duplicated features that the ENA validation tool complains about. Please see the last posts in this thread for more details: https://github.com/NBISweden/EMBLmyGFF3/issues/33 Do you see any possibility to fix this issue? Thanks in advance
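In case it helps anyone check their own files, here is a rough sketch of how the duplicated features described above could be spotted in a GFF3 before conversion. It is not part of Prokka or EMBLmyGFF3. The function name, the list of non-coding feature types, and the example records (IDs, coordinates, source column) are all made up for illustration, and it assumes standard 9-column tab-separated GFF3 with an optional ##FASTA section at the end.

```python
# Feature types treated as non-coding RNA here (an assumption; adjust as needed).
NON_CODING = {"tRNA", "rRNA", "tmRNA", "misc_RNA", "ncRNA"}


def find_duplicated_mrnas(gff_lines):
    """Return mRNA rows that share seqid, start, end and strand with a non-coding RNA row."""
    rows = []
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # everything after this directive is sequence, not features
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and other directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        rows.append(cols)

    # Location key: (seqid, start, end, strand)
    rna_locations = {
        (c[0], c[3], c[4], c[6]) for c in rows if c[2] in NON_CODING
    }
    return [
        c for c in rows
        if c[2] == "mRNA" and (c[0], c[3], c[4], c[6]) in rna_locations
    ]


# Hypothetical records: one tRNA with a redundant mRNA, one ordinary CDS.
example = [
    "##gff-version 3",
    "contig1\tprokka\tgene\t100\t175\t.\t+\t.\tID=g1_gene",
    "contig1\tprokka\tmRNA\t100\t175\t.\t+\t.\tID=g1_mRNA",
    "contig1\tprokka\ttRNA\t100\t175\t.\t+\t.\tID=g1",
    "contig1\tprokka\tgene\t300\t900\t.\t-\t.\tID=g2_gene",
    "contig1\tprokka\tmRNA\t300\t900\t.\t-\t.\tID=g2_mRNA",
    "contig1\tprokka\tCDS\t300\t900\t.\t-\t0\tID=g2",
]

dups = find_duplicated_mrnas(example)
print([row[8] for row in dups])
```

The mRNA over the CDS is left alone; only the mRNA sitting on top of the tRNA is reported. Dropping the reported rows is a workaround only, and any child features pointing at them through Parent= would need fixing as well.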