geronimp / enrichM

Toolbox for comparative genomics of MAGs

Nucleotide sequences of genes in 'genome_genes' directory all have identical sequences #96

Open rhysnewell opened 4 years ago

rhysnewell commented 4 years ago

Hey Joel,

Hope you are doing well! Found a funky bug in the nucleotide sequence output from enrichm annotate. Here's an example:

>contig_112_pilon_1
TATTTAGTTAATATGTCATTTATATCTTTTGCATTTAGAGAAGAGTATGAGAAGGTAAAGCTTTTGGGAGACAAATTGAACGAGATTGACTCATTGATCAACTGGGAATCATTTAGACCGATAGTGAAAGATATGTTTGACAACAAAAGTGAAAAGGGTGGACGTCCTAATATCGATGAAGTTGTAATGATCAAAACCCTGATTTTACAGGAGTGGCATGGTCTTTCTGATCCAGAACTTGAGCGACAAATCACCGACAGGATATCCTTCCGCAAGTTTTTAGGTTTTCCTGAAAACATACCTGATTTCACAACAGTCTGGACTTTTCGAGAGCGGTTAAGCAAAAAAGGTAAGGACAAAGAAATCTGGAAAGAATTACAGAGACAGCTTGATTCAAAGGGATTGAAGGTAAAAAAGGGGGTTATACAGGATGCAACATTTATCACATCTGATCCAGGACATGCAAAAGCAGATAAACCAAGAGGTGATGAGGCAAAAACACGAAGAAGTAAAGATGGTACCTGGGTAAAAAAGAACAGTAAGTCATACTTCGGGTATAAGTTTCACTCAAAGGAAGATGTTGATTACGGTCTTATAAGGAAGATCGAGACTACAACGGCATCAGTACACGATAGTCAGATTGATCTCTCTGAACCAGGAGAAGTCGTGTACAAGGATAAAGGATATTTTGGAGCGTCATCAAAAGGATACAGTGCGACTATGAGAAGATCTGTTCGTGGTCATCCGATTGGTATCAAAGATATTCTGCGTAACAAACGAATTAGCAAGAAAAGAGCACCTGGAGAAAGACCCTATGCAGTGATTAAAAATGTATTCAAATCAGGGCATATTATGGTTACAACCGTTGCCAGGGCAGCAGTCAAAACGGTATTTACAGCATTTGGATTCAATCTATATCAACTCTTAACTTTGAAGAAACAAGGAATTGTATAG
>contig_112_pilon_2 K20155
TATTTAGTTAATATGTCATTTATATCTTTTGCATTTAGAGAAGAGTATGAGAAGGTAAAGCTTTTGGGAGACAAATTGAACGAGATTGACTCATTGATCAACTGGGAATCATTTAGACCGATAGTGAAAGATATGTTTGACAACAAAAGTGAAAAGGGTGGACGTCCTAATATCGATGAAGTTGTAATGATCAAAACCCTGATTTTACAGGAGTGGCATGGTCTTTCTGATCCAGAACTTGAGCGACAAATCACCGACAGGATATCCTTCCGCAAGTTTTTAGGTTTTCCTGAAAACATACCTGATTTCACAACAGTCTGGACTTTTCGAGAGCGGTTAAGCAAAAAAGGTAAGGACAAAGAAATCTGGAAAGAATTACAGAGACAGCTTGATTCAAAGGGATTGAAGGTAAAAAAGGGGGTTATACAGGATGCAACATTTATCACATCTGATCCAGGACATGCAAAAGCAGATAAACCAAGAGGTGATGAGGCAAAAACACGAAGAAGTAAAGATGGTACCTGGGTAAAAAAGAACAGTAAGTCATACTTCGGGTATAAGTTTCACTCAAAGGAAGATGTTGATTACGGTCTTATAAGGAAGATCGAGACTACAACGGCATCAGTACACGATAGTCAGATTGATCTCTCTGAACCAGGAGAAGTCGTGTACAAGGATAAAGGATATTTTGGAGCGTCATCAAAAGGATACAGTGCGACTATGAGAAGATCTGTTCGTGGTCATCCGATTGGTATCAAAGATATTCTGCGTAACAAACGAATTAGCAAGAAAAGAGCACCTGGAGAAAGACCCTATGCAGTGATTAAAAATGTATTCAAATCAGGGCATATTATGGTTACAACCGTTGCCAGGGCAGCAGTCAAAACGGTATTTACAGCATTTGGATTCAATCTATATCAACTCTTAACTTTGAAGAAACAAGGAATTGTATAG
>contig_112_pilon_3
TATTTAGTTAATATGTCATTTATATCTTTTGCATTTAGAGAAGAGTATGAGAAGGTAAAGCTTTTGGGAGACAAATTGAACGAGATTGACTCATTGATCAACTGGGAATCATTTAGACCGATAGTGAAAGATATGTTTGACAACAAAAGTGAAAAGGGTGGACGTCCTAATATCGATGAAGTTGTAATGATCAAAACCCTGATTTTACAGGAGTGGCATGGTCTTTCTGATCCAGAACTTGAGCGACAAATCACCGACAGGATATCCTTCCGCAAGTTTTTAGGTTTTCCTGAAAACATACCTGATTTCACAACAGTCTGGACTTTTCGAGAGCGGTTAAGCAAAAAAGGTAAGGACAAAGAAATCTGGAAAGAATTACAGAGACAGCTTGATTCAAAGGGATTGAAGGTAAAAAAGGGGGTTATACAGGATGCAACATTTATCACATCTGATCCAGGACATGCAAAAGCAGATAAACCAAGAGGTGATGAGGCAAAAACACGAAGAAGTAAAGATGGTACCTGGGTAAAAAAGAACAGTAAGTCATACTTCGGGTATAAGTTTCACTCAAAGGAAGATGTTGATTACGGTCTTATAAGGAAGATCGAGACTACAACGGCATCAGTACACGATAGTCAGATTGATCTCTCTGAACCAGGAGAAGTCGTGTACAAGGATAAAGGATATTTTGGAGCGTCATCAAAAGGATACAGTGCGACTATGAGAAGATCTGTTCGTGGTCATCCGATTGGTATCAAAGATATTCTGCGTAACAAACGAATTAGCAAGAAAAGAGCACCTGGAGAAAGACCCTATGCAGTGATTAAAAATGTATTCAAATCAGGGCATATTATGGTTACAACCGTTGCCAGGGCAGCAGTCAAAACGGTATTTACAGCATTTGGATTCAATCTATATCAACTCTTAACTTTGAAGAAACAAGGAATTGTATAG

As you can see, these are all the same sequence.
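For anyone who wants to check their own output for the same symptom, here is a minimal sketch that groups FASTA records by sequence and lists the groups sharing one. It is not part of enrichM; the function names `read_fasta` and `duplicated_sequences` are made up for illustration, and the file paths are whatever you pass on the command line (for example, the per-genome files in the 'genome_genes' directory).

```python
import sys
from collections import defaultdict


def read_fasta(path):
    """Yield (record_id, sequence) pairs from a FASTA file.

    Hypothetical helper for illustration; not an enrichM function.
    """
    record_id, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if record_id is not None:
                    yield record_id, "".join(chunks)
                # Keep only the first token of the header; annotations
                # such as a KO identifier may follow it.
                fields = line[1:].split()
                record_id = fields[0] if fields else ""
                chunks = []
            elif record_id is not None:
                chunks.append(line.upper())
    if record_id is not None:
        yield record_id, "".join(chunks)


def duplicated_sequences(path):
    """Return lists of record IDs that share an identical sequence."""
    groups = defaultdict(list)
    for record_id, sequence in read_fasta(path):
        groups[sequence].append(record_id)
    return [ids for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    # Usage: python check_duplicates.py <fasta> [<fasta> ...]
    for fasta_path in sys.argv[1:]:
        for ids in duplicated_sequences(fasta_path):
            print(f"{fasta_path}: {len(ids)} records share one sequence: {', '.join(ids)}")
```

A few identical genes in a genome can be legitimate (e.g. multi-copy transposases), so a hit is only suspicious when consecutive genes on a contig, or every gene in the file, end up in the same group.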

Thanks,

Rhys

geronimp commented 4 years ago

Hey Rhys,

Thanks for the bug report! Is this just running the default enrichm enrichment pipeline?

rhysnewell commented 4 years ago

This is from running enrichm annotate on two genomes, using ko_hmm and leaving everything else at the defaults. The release version is 0.5.0rc1.

danielkim617 commented 3 years ago

I got the same error