Closed Wanli-HE closed 2 years ago
Hello, the sequences are in the rep.dna.fas file, which is inside the database directory where mob-suite is installed (for example, /usr/local/lib/python3.9/dist-packages/mob_suite-3.0.3-py3.9.egg/mob_suite/databases).
You can also get the databases by downloading the archive from https://zenodo.org/record/3786915/files/data.tar.gz?download=1
For example, here are the sequences for Col(BS512) and rep_cluster_816:
>NC_010656|Col(BS512)
ATGAATGCGGCGTTTAAGCGAATGGAAAAGCGAAAGGAGCTATCACCTGTTCAGGGGTGGATCAGGGCTACGGAGGTGACGCGAGGTAAGGATGGCAGCGCACATCCGCATTTTCACTGTCTGCTGATGGTGCAACCTTCTTGGTTTAAAGGGAAGAACTACGTTAAGCACGAACGTTGGGTAGAACTCTGGCGCGATTGCTTGCGGGTGAACTATGAGCCGAATATCGATAT
>002299__NC_021722_00001|rep_cluster_816
ATGATGACACATTCAAAGCACAAATTCACTTTTATTGAAAAATCTTCTGCGTATCAAAAAAAATACTTCCAATTTCCACAAGTTTTGCTATACGGAGAAAAATATAAGTCCCTTAGCGATAGTGCCAAAATTGCCTATATGGTTCTTCAAAGCAGGCTCGACTACTCGTTAAAAAACAATTGGATTGATGAATCAAATCATGTGTATTTCATTTTTACAAACCAAGAGCTGAAATCGCTAATGCATTGGTCAAACGATAAACTTCGTAAGGTTAAATCAGATCTCATAAATGCAAATTTACTGTATCAAGAAGTAGTCGGGTTTAATCCTAAAACGGGAAAAAATGAGCCAAATCGGCTATATTTATCCGAACTGGATGTTAGTGCAACTGATGTTTATCTCAAGGCTTTTGAACCTAATGAAGACGTAAAAACCCATACACAGTACGGGAAACCGAAAATCGGTCGCCCGCAAGAGACCGTTCAAACTACCGAAAACAGCGGGAAACCGAAAATCGGTCGCCCGCGACATAAGAACTCAAGTGAAGCCGGACCCCTTGAAAATAGCGGGAAACCGAAAATCGGTCACGATCTATATAAGACTTTAGATACAAATACTAGAGACAATAAAGAGACAGAAAAACTGGACTTTTCCACAAATCGATATTCACCTGAGATCATTAAAAAGCAAAATCAAGATCTCGTAAAAAATGCCAGAAACTATCTGCCTGAATCAACAACAGGTGGCCTCTTTCTCAACAAAGAAGGCGTTGAACTGCTAGGCCTTTGGTGCCGCTCACCTAAACAATTGCATCGGTTCCTCGGCATTATCCTAAATGCCAAAAAGGCTGTAGAAAGGGAACATGAAGGAACGGCGATTGTACTTGACGATCCGCTATGCCAAGAAATGATAAACAAGACCATGCGCCGTTTTTTCAATATTCTGCGCTCTGACAGTAAAAAAATTAACAATGTTGAAAATTACTTGTTTGGTGCTATGAAAGAAACATTGGTGGCATACTGGAATAAGACACTGACAACTGCTAACAGAGGTGATCCTAATGAGCTCTAA
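The lookup described above can be scripted. The following is a minimal sketch, not part of mob-suite, that splits rep.dna.fas into one FASTA file per replicon type. It assumes every header follows the `>accession|rep_type` layout seen in the two records above, which may not hold for all entries. The function names and output file names are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file (header without '>')."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def group_by_rep_type(fasta_path):
    """Group records by the text after the last '|' in the header.

    Assumption: headers look like 'accession|rep_type', as in the two
    example records from rep.dna.fas shown in this thread.
    """
    groups = defaultdict(list)
    for header, seq in read_fasta(fasta_path):
        rep_type = header.rsplit("|", 1)[-1]
        groups[rep_type].append((header, seq))
    return groups


def write_groups(groups, out_dir):
    """Write one FASTA file per rep type into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for rep_type, records in groups.items():
        # Replace characters such as '(' and ')' that are awkward in file names.
        safe_name = re.sub(r"[^A-Za-z0-9_.-]", "_", rep_type)
        with open(out_dir / f"{safe_name}.fasta", "w") as out:
            for header, seq in records:
                out.write(f">{header}\n{seq}\n")
```

For example, `write_groups(group_by_rep_type("rep.dna.fas"), "rep_fastas")` would write files such as `rep_cluster_816.fasta` into a `rep_fastas` directory. Combined labels reported by mob_typer, such as 'Col156,IncFIB', would need to be split on the comma and looked up one type at a time.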
Hi!
Here are the rep_type values annotated by mob_typer. The IDs look like the following: 'Col(BS512)', 'Col(KPHS6)', 'Col(MG828)', 'Col(MG828),rep_cluster_2392', 'Col(MP18)', 'Col(VCM04)', 'Col(Ye4449)', 'Col156', 'Col156,Col156', 'Col156,IncFIB', ... 'rep_cluster_816', 'rep_cluster_850', 'rep_cluster_870', 'rep_cluster_889', 'rep_cluster_893', 'rep_cluster_910', 'rep_cluster_943', 'rep_cluster_974', 'rep_cluster_980', 'rep_cluster_992'
I am wondering how I can get the FASTA sequence for each rep_cluster.
Thanks!
best, wanli