Open Rikkiff opened 5 months ago
Hi,
The replication protein ID you quoted (002374__CP004064_00076) is found in the annotated genome at https://www.ncbi.nlm.nih.gov/nuccore/CP004064.1?report=genbank, at range 61201..62241, with protein_id accession AGE31365.1 (putative replication protein repa).
Thanks Kirill! The nucleotide record has three replication proteins, so how did you deduce which one is 002374__CP004064_00076?
Hi, you can do a pairwise alignment with BLASTn, using either the web GUI or the command line. I ran BLASTn between the CP004064 annotated genome and the 002374__CP004064_00076 sequence:
>002374__CP004064_00076|rep_cluster_893
ATGACTGATTTTAAATTTTTTAAGGCGGATAGAGTTTACAACGAATTATTTTATCAATTTCCAAAAGTCT
TTATTGTTTCTGACGAATACAAAAAAATGAAAGATTCAACTAAGATTGCCTATATGCTTTTAAAAGCAAG
ATTAGAGATCGCAATCAGCAAACGGCAAATCGATGAAGAAGGTAATGTTTATTTTACTTATACGACAAAT
GAACTATGTAGAGTATTAAACTGCCAAAAACAAAAAGCGATAGCAATCAAAAAAGAGTTGGAATCCTTTG
GTTTATTATTACAAAAGCAGATGGGATTTAACAAACAGTTAGGGAAAAATAATCCTAATAGACTATATCT
AGCAGAATTAAAAGTCTCAGAAAATGATATCTACTTACTCGAAAAATTTGATAGAGAGAATAGGGAAAAC
GTTGATAAATCAGAGGGTATGAAAATCATACCCACCCTCGACGAAAAATCAGACGCTGAATCCCTTGGGG
CTCAAGAGGGTATGAAAATCATACCGTGCCAAAACGTTGATAAATCAGAGGGTATGAAAATCATACCAGA
ACTTAATAATAATATATTAGACACTAATAGACACAATATAGACACTGAAAAAGACCGCCTACAAGATCAA
TTGTTGTTAGACAATTTTGAGACAATTATGACAAACGACAGCATTGCTACGTTTGTCCCTGAACGAGTAT
TAAATTTGATAAAAACATTTTCTTCAAGTTACAGTGAAGCTCAAAAAACCGTCCAGACTATTCATAATGC
AAAGAAAAAAGCTGAAATAGAAAGTGGTATTTCGATAGTTTTTGAAGAACTCGATAGTTATTATGTCAAT
GCAGAACAAGAATTATACACGACACTGTTAAAAGCCTATCAAAAATTAAAAACCGAAAAAGTCGAAAATA
TCCAGAACCTGATTTTTGTCTATGTAAAAAATTGGTTTATCGAAAAACCAATAGCTGCTAAAGTATCAAG
TGAAAAACGTTTGAATTATGAAAGCTCCCCAAGCACTATTACGAAAGACTGGTTAGAGTGA
BLASTn reports the alignment range, which in this case is 61201 to 62241. Then look at the annotated genome at https://www.ncbi.nlm.nih.gov/nuccore/CP004064.1?report=genbank and find the entry that best matches that range, which in this case is a protein with accession AGE31365.1.
gene 61201..62241
/locus_tag="M7W_65"
CDS 61201..62241
/locus_tag="M7W_65"
/codon_start=1
/transl_table=11
/product="putative replication protein repa"
/protein_id="[AGE31365.1](https://www.ncbi.nlm.nih.gov/protein/445194258)"
/translation="MTDFKFFKADRVYNELFYQFPKVFIVSDEYKKMKDSTKIAYMLL
KARLEIAISKRQIDEEGNVYFTYTTNELCRVLNCQKQKAIAIKKELESFGLLLQKQMG
FNKQLGKNNPNRLYLAELKVSENDIYLLEKFDRENRENVDKSEGMKIIPTLDEKSDAE
SLGAQEGMKIIPCQNVDKSEGMKIIPELNNNILDTNRHNIDTEKDRLQDQLLLDNFET
IMTNDSIATFVPERVLNLIKTFSSSYSEAQKTVQTIHNAKKKAEIESGISIVFEELDS
YYVNAEQELYTTLLKAYQKLKTEKVENIQNLIFVYVKNWFIEKPIAAKVSSEKRLNYE
SSPSTITKDWLE"
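The lookup described above (take the BLASTn alignment range, then find the annotated feature that best matches it) can be sketched in a few lines of Python. This is only an illustration: the helper names `parse_outfmt6`, `overlap` and `best_feature` are made up for this example, the feature list is typed in by hand from the GenBank excerpt above rather than fetched from NCBI, and the parser assumes BLAST's standard 12-column tabular output (`-outfmt 6`).

```python
# Hypothetical helpers for illustration; not part of mob-typer or BLAST.

def parse_outfmt6(line):
    """Return (start, end) on the subject from one BLAST tabular (-outfmt 6) line.

    Assumes the default column order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    fields = line.rstrip("\n").split("\t")
    sstart, send = int(fields[8]), int(fields[9])
    # For minus-strand hits BLAST reports send < sstart, so normalise the order.
    return min(sstart, send), max(sstart, send)


def overlap(a_start, a_end, b_start, b_end):
    """Number of positions shared by two inclusive, 1-based ranges."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def best_feature(hit, features):
    """Return the feature sharing the most positions with the hit, or None."""
    best = max(
        features,
        key=lambda f: overlap(hit[0], hit[1], f["start"], f["end"]),
        default=None,
    )
    if best is None or overlap(hit[0], hit[1], best["start"], best["end"]) == 0:
        return None
    return best


# The one CDS quoted in this thread. In practice this list would hold every
# CDS feature of CP004064.1, including the other replication proteins.
features = [
    {"start": 61201, "end": 62241, "locus_tag": "M7W_65", "protein_id": "AGE31365.1"},
]

hit = (61201, 62241)  # alignment range reported by BLASTn in this thread
match = best_feature(hit, features)
print(match["protein_id"])  # AGE31365.1
```

Ranking by overlap rather than requiring identical coordinates is a design choice for the sketch: it still picks a sensible feature when the alignment is slightly shorter than the annotated CDS.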
Thanks again Kirill!
It would be a lot easier if the accession AGE31365.1 was given directly instead of 002374__CP004064_00076. Also, where did you find the sequence of 002374__CP004064_00076?
I am trying to find the NCBI accessions for the Rep, MOB and MPF genes identified by mob-typer. An example is an identified Rep protein, 002374__CP004064_00076. When I search for this identifier in NCBI, I do not get any results.