tseemann / prokka

Rapid prokaryotic genome annotation

How to create a local (custom) database with a multifasta DNA sequence file in Prokka? #610

Open Felipedb02 opened 2 years ago

Felipedb02 commented 2 years ago

Hi everyone, I'm trying to create a local (custom) database for Prokka from a multi-FASTA file that contains DNA sequences of different bacterial genes. It looks like this:

>mecI:1:D86934
TTATTTTTTATTCAATATATTTCTCAATTCTTCTATTTCATCTTGTGATAGATCTTCTTTTTCTACAAAGTTTAAGACAAGTGAATTGAAACCGCCTTTGTATACTTTATTGATAAAGTTTTTAGATGTTTTATATTTTATATCACTTTCTTCTACAAGAGAGTAATATTGAAAAATTTTATTGTCTTTTTTACGATTA
>mecI:2:AB037671
TTAAAAAATTTTATTGTCTTTTTTACGATCTATAAATCCCTTTTTATACAATCTCGTTATAAGTGTACGAATGGTTTTTGGACTCCAGTCCTTTTGCATTTGTATTTCTTCTATTATATTATTCGCACTTGCATATTTTTCATCCAAATGATATTCATAACTTCCCATTCTGCAGATGATATTTCATACGTTTTATTATCCAT
>mecI:3:FJ670542
TTATTTTTTATTCAATATATTTCTCAATTCTTCTATTTCATCTTGTGATAGATCTTCTTTTTCTACAAAGTTTAAGACAAGTGAATTGAAACCGCCTTTGTATACTTTATTGATAAAGTTTTTAGATGTTTTATATTTTATATCACTTTCTTCTACAAGAGAGTAATATTGAAAAATTTTATTGTCTTTTTTACGATC
>mecI:4:FJ390057
ATGGATAATAAAACGTATGAAATATCATCTGCAGAATGGGAAGTTATGAATATCATTTGGATGAAAAAATATGCAAGTGCGAATAATATAATAGAAGAAATACAAATGCAAAAGGACTGGAGTCCAAAAACCATTCGTACACTTATAACGAGATTGTATAAAAAGGGATTTATAGATCGTAAAAAAGACAAT........

I've been trying to do this in three steps: first, converting the initial DNA multi-FASTA file into a protein multi-FASTA file with the EMBOSS transeq tool; then creating a BLAST database from it and indexing it in the Prokka database directory; and finally typing a bacterial genome against this custom database to get the GBK output. However, I'm losing some sequence information in the EMBOSS transeq step, so I'd like to ask whether there is an easier way to do this. Maybe there is a way to set up a DNA database instead of a protein database?
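One possible cause of the lost information, judging only from the sample above: three of the four entries begin with `TTA`, and the second entry ends with the reverse complement of the start of the fourth, so some genes appear to be stored on the reverse strand. A translation that only reads frame 1 of the forward strand would garble those entries. The following is a minimal Python sketch of the DNA-to-protein step that tries all six frames and keeps the longest open reading frame starting with `ATG`. It is not part of Prokka or EMBOSS. The function names are hypothetical, the gene name is assumed to be the text before the first `:` in each header, the product text is a placeholder, and the header layout `>ID EC~~~gene~~~product` follows the annotated-protein format described in the Prokka README (EC number left empty here).

```python
import re
from itertools import product

# Codon table in TCAG order. Table 11 (bacterial) assigns the same amino
# acids as the standard table; only the allowed start codons differ.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product(BASES, repeat=3)), AMINO))
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    # Unknown codons (e.g. containing N) become X; trailing bases are dropped.
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def best_orf(dna):
    """Longest M-initiated, stop-free stretch over all six reading frames.

    Assumption: genes start with ATG. Alternative bacterial start codons
    (GTG, TTG) and truncated entries are not handled in this sketch.
    """
    dna = dna.upper()
    best = ""
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            for orf in re.findall(r"M[^*]*", translate(strand[frame:])):
                if len(orf) > len(best):
                    best = orf
    return best


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def dna_fasta_to_prokka_faa(text):
    """Convert DNA FASTA text to protein FASTA text with Prokka-style headers."""
    out = []
    for header, dna in read_fasta(text):
        protein = best_orf(dna)
        if not protein:
            continue  # no ATG-initiated ORF found; entry is skipped
        seq_id = header.split()[0]
        gene = seq_id.split(":")[0]  # assumed: gene name precedes first ':'
        # Empty EC number, then gene, then a placeholder product description.
        out.append(f">{seq_id} ~~~{gene}~~~{gene} protein\n{protein}\n")
    return "".join(out)


if __name__ == "__main__":
    demo = ">mecI:4:FJ390057\nATGGATAATAAAACGTAT\n"
    print(dna_fasta_to_prokka_faa(demo), end="")
```

If the output is saved to a file such as `custom.faa` (a hypothetical name), it can be given to Prokka directly with the `--proteins` option, which treats the file as a first-priority annotation source, so building and indexing a BLAST database by hand inside the Prokka database directory may not be needed.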

Thank you guys.