ndierckx / NOVOPlasty

NOVOPlasty - The organelle assembler and heteroplasmy caller

The genome has been circularized, but 2 different contigs #224

Open LiviaT93 opened 5 months ago

LiviaT93 commented 5 months ago

I am assembling turtle mitogenomes, and in some cases the output I obtain is a "Circularized genome" that is actually made up of 2 non-overlapping contigs. In all these cases, the interruption corresponds to the Val tRNA. In the end, I managed to circularize all my samples into a single contig by increasing the k-mer and the genome range, but I would like to understand why this interruption in the assembly occurs and whether it might indicate some notable feature of the mitogenome at that point. Do you have any suggestions?

ndierckx commented 5 months ago

Could you send me your results? It is hard to tell otherwise.

LiviaT93 commented 5 months ago

Sure, here is an example of what I meant in my question. Thank you so much


9324992 ATATATACAATGTTATTGTAGCTTATCACAAAGCACGGCACTGAAGATGCCAAGATGGGTAAACACACACCCCAAAAACACAAAGATTTGGTCCTAACCTTACTGTTAATTTTTGCTAAACTTATACATGCAAGTATCCGCATACCAGTGAAAACGCCCTAACATATCTAATCAGATAAAAGGAGCCGGTATCAGGCACACCATGACAGCCCAAGACACCTAGCTCTGCCACACCCCCAAGGGTATACTCAGCAGTAATAAAAATTAAGCAATAAGCACAAGCTTGACTTAGTTATAGCAAAGAGAGCTGGTCAACCTCGTGCCAGCCACCGCGGTTATACAAGAAGCCCAAACTAACAACCAATCGGCGTAAAATGTGGCTAAACAGCTCTATCAATAAAATTAAGATGAACCAAGTACCAAACTGTCATACGTAAAAGTACGGATTAACACACTATGAAAATAACCTTAATACAATGAAATAATTTGAACCCACGATCGCTAAGACACAAACTAGGATTAGATACCCTACTATGCTTAGCCCTAAACTTAGATATTTTCCACACAAAAATATCCGCCAGAGAACTACGAGCATAAACGCTTAAAACTCTAAGGACTTGGCGGTACCTCAAACCCTCCTAGAGGAGCCTGTTCTATAATCGATAATCCACGATCTACCTCACCATCCCTTGCCAAACCCGCCTATATACCACCGTCACCAGCTTACCCCATGAGGGCCACAAAAAGTAAGCAAAATAACCTAAACAGTTAATAAGTCAGGTCAAGGTGTAGCTAACTGAGATGGAAGAAATGGGCTACATTTTCTACTTTAGAAATAACCACGGAAGAATACCATGAAATAGGTTCCACAAGCAGGATTTAGCAGTAAACTGGGAACAGAGAGCCCAATTTAAGCCGGTCCTGAGGTACGCACACACCGCCCGTCACCCTCCTCAAACAATCTAACAACCATAAATTAAACCACAACAAACATAATAGATGAGGTAAGTCGTAACAAGGTAAGTACACCGGAAGGTGTACTTGGAACATCAAAACATAGCTTAATCAAAAGCATTCAGCTTACACCTGAAAGACGTCCATTAAACCGGACTATTTTGAGCAAAAATCTAGCCCAACCAACAAATATAAATTCAACAGAAAAAATAATCCTACCAACACCAAACTAAAACATTTTTTCCATCTCAGTATTAGGCGATAGAAAAGACTGTTGGAGCAATAAAGACAGTACCGCAAGGGAAAGATGAAAAACATGAAAGATCTGACTAAGCCCTAAAAAGCAAAGATTAACTCTTATACCTCTTGCATCATGATTTAACTAGTACACCCAAGCAAAGCGAACTAAAGTCTGAACCCCCGAAACCAAGTGAGCTACTTAAGGGCAGCCAACATCACCAGGCTTAAATCCGTCTCTGTGGCAAAAGAGTGGAAAGACCTATAAGTAGAGGTGAAAAGCCTAACGCACTTGGTGATAGCTGGTTGCTCAATAAAAGAATATAAGTTCAACCTTAAATATCCTAAAAACAATATAAAGTATTTAAGAGAAATTTAAGATATATTCAATTGAGGTACAGCTCAATTGAAAAAGGACACAACCTAAAACGGAGGACAAAACATTTAAACATATTACCGTAGGCTTCAAAGCAGCCACCACTAAGGACAGCGTCAAAGCTCATCAACAACAGATATCAACACCAATTTTTTCCCCAAACAATATTGAGCTATTCTACCTAAATAGAAGAACTAATGTTAAAATAAGTAACAAGAAGACAAAACTTCTCTAACGCGCTAGCTTAAATCATAACAGATAAACTAATGATTATTAACAACTAATATTATAAAATCAACAATACTAAACATACCATATAAACTAAACTGTTAACCCAACACAGGAGCGCACACAAGAAAGATTAAAATTTGTAAAAGGAACTAGGCAAACAACTGAGCCCGACTGTTTACCAAAAACATAGCCTC
TAGCAGCAACAAGTATTAGAGGTAATGCCTGCCCAGTGACACTGTTAAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTTTAAATAAAGACTAGAATGAATGGCCAAACGAGGTTCCACCTGTCTCTTACAAACAATCAGTGAAATTGGTCTTCCCGTGCAAAAGCGGGAATAACACTATAAGACGAGAAGACCCTGTGGAACTTTAAATATAAATCAACTATTATATTTACCACCCTAAAGACTTATAATTAACTAGTTCTGATCCATATTTTTGGTTGGGGTGACCTCGGAGTAAAACAAAACCTCCGAAAAAAGAACATATTTTCTTAACCTAGATTTACAACTCAAAGTGCCAACGGCAAAATGATCCAATATATTTGATCAACGAACCAAGCTACCCCAGGGATAACAGCGCAATCCCATCCTAGAGTTCCTATCGACGATGGGGTTTACGACCTCGATGTTGGATCAGGACATCCTGATGGTGCAACCGCTATCAAGGGTTCGTTTGTTCAACGATTAATAGTCCTACGTGATCTGAGTTCAGACCGGAGTAATCCAGGTCGGTTTCTATCTATAAAATTTGAGCTTTTTCCAGTACGAAAGGACCGAAAAACCAAGGCCCATATTAAAAACAAGCCTTACCTTATATTAATGAAACCAACTTAACTAATAATAAGGACAAAACCATACATCACAAGCCCAAGAAAAGGGCAAAGTTCGGGTGGCAGAGCCAGGTAAAAATGCAGAAGGCCTAAACCCTTTATTCAGGGGTTCAACTCCCCTCCCAAACTATGAAAACCCTACTATCCAACCTAATATCACCACTCATATATATAGTCCCAATTCTAATTGCCGTTGCCTTCTTCACCTTAATTGAACGAAAAATCCTAGGATATATACAACTCCGAAAAGGCCCAAACATCGTTGGCCCCTATGGACTACTACAACCAGTAGCAGACGGCGTAAAACTATTTATTAAAGAACCAATTTACCCATCAAATTCATCAATCATATTATTCACAATATCCCCAATACTAGCCCTGCTACTAGCCCTGTCAATCTGACTCCCACTACCACTACCCTTCCCACTAGCTGATCTCAATTTAGGACTGCTATTCCTAATTTCCATATCCAGCTTTATAGTTTACTCCATTCTATGATCAGGCTGAGCCTCAAACTCCAAATACGCCCTAATAGGAGCCCTACGAGCAGTCGCCCAAACAATCTCATATGAAGTAACCCTAGGAATTATCCTACTCTCACTAACCCTATTTTCAGGTGGATTCAACATACAAACATTTATAACAACACAAGAACCAATATACCTAATATTCTCCTCCTGACCACTAATAATAATATGATATATCTCCACATTAGCAGAAACTAACCGAGCACCATTCGACCTTACCGAAGGAGAATCTGAACTAGTATCAGGATTTAACGTTGAATACGCCGCCGGACCATTCGCCCTATTCTTTCTAGCAGAATATGCCAACATCCTAATAATAAACACCCTAACCACCATTCTATTCCTAAATTCAACTTACACCAACAACCCCGAACTATTCTCCGTATTACTAATATCAAAAGTAATACTACTATCAGGAGGCTTCCTATGAATTCGAGCCTCCTATCCACGATTCCGATATGACCAATTAATACATCTCCTATGAAAAAACTTCCTCCCAATCACCCTTGCATTATGCCTATGACATACTTCCATACCAATCACCTTCTCAGGCCTACCACCTATACCCTAGGACACGTGCCTGAACAAAGGGTTACCTTGATAGGGTAAATAATAGAGGATAAAACCCTCTCGTCTCCTTAGAAAAATAGGACTTGAACCTACACCAGAGAGATCAAAACTCCCCATACTCCCATTATACTATATCCTAGTAAAGTCAGCTAATTAAGCTTTCGGGCCCATACCCCGAAAATGTCGGTTAAAATCCCTCCTATACTAATGAATCCCTAC
GCAAATACAATCATCATCTCAAGCCTAATTATAGGGCCCCTATTAACAATTTCCAGTAATCACTGAATCCTGGCATGAACTGGCCTAGAAATCAGCACGCTAGCCATTATTCCCCTAATCGCTAAACAACACCACCCACGAGCAACTGAAGCCGCCACTAAATACTTTCTAACACAAGCAACTGCCTCAACACTAATCCTATTTTCCAGCATTATTAATGCCTGAACACTAGGCCAATGAGACATTACACAAATATCCAACAATACCTCATGCACAATCCTCACTACAGCCTTAGCCATTAAACTAGGACTAGCTCCCTTCCACTTCTGGTTACCAGAAGTAATACAAGGAACCTCCACAATAACAGCCCTAATCTTAGCTACCTGACAAAAACTAGCCCCACTATCACTACTAACAATAACCGCCCAATCCCTAAACACACCACTACTATTAATACTAGGACTAACATCCGCTCTAATCGGAGGATGAAATGGACTAAATCAAACCCAACTACGAAAAATCATAGCATTCTCCTCCATCGCCCATCTAGGATGAATAGCTACAATCCTTACTCTATCCCCCAAACTTATACTACTTACATTTTACACCTACACCATCATAACCTCAACAATATTTTTAATAATAAAACTTCTAAAAACAAACAAAATCTCTATGATAATAACCTCATGAACAAAACTCCCAACCATAAACACCCTAATAATACTCACCCTTATATCACTCGCAGGCATACCACCACTAACAGGATTTATACCAAAATGACTAATCCTCCAAGAATTAACTAAACAACATATACTCATTATAGCCACTATAATAGCCATACTCTCACTCCTAACCCTATTCTTCTACCTACGAATTTCATACTATGCAACCATCACATTACCACCAAACTCAACCAACTATTCACAACAATGACGCCACAAAATAAACCAAAAACCCCCCTACCTAGCCCTATTAACCACACTATCAACCATTATACTTCCAATTATACCAACCTTATTAACCATCCCATAGAAACTTAGGATCAGACCTATTTTAAACCAGAGGCCTTCAAAGCCTCAAACAAGAGATAAAACCTCTTAGTTTCTGCTAAGACCTACAGGACTTTATCCCATATCTCATGAATGCAACTCAAACACTTTAATTAAGCTAAGGCCTTACTAGACAAATGGGCCTCGATCCCATAACAATTTAGTTAACAGCTAAATACCCAATCCAGCGGGCTTTTGCCTACTTTTCCCGCTCTATAAAAAGCGGGAAAACCCAGACACCAATAAAGGTGTATCTTCAAATTTGCAATTTAACATGAACTTCACTACAAGGTCTGATAGGAAGAGGAATTAAACCTCTGTAAAAGGGACTACAGCCCAACGCTAATACACTCAGCCACCCTACCTGTGTTTTTAACCCGTTGATTCTTTTCCACCAACCATAAAGACATCGGCACCCTATACCTAATCTTCGGGGCATGAGCAGGAATAGTAGGCACAGCACTCAGCCTATTAATCCGCGCAGAACTAAGCCAACCAGGTACTCTTCTAGGAGATGACCAAATTTACAACGTTATCGTCACAGCCCATGCTTTCATCATAATCTTCTTTATAGTTATACCAATCATAATCGGTGGCTTCGGAAACTGACTTGTTCCATTAATAATTGGAGCACCAGATATAGCATTCCCACGTATAAACAACATAAGCTTTTGACTCTTACCTCCTTCATTATTACTACTTCTAGCATCATCAGGAATTGAAGCAGGCGCAGGCACAGGCTGAACAGTGTACCCCCCATTAGCTGGAAACCTAGCCCACGCCGGTGCTTCTGTAGATCTAACCATCTTCTCCCTCCACCTAGCCGGTGTGTCTTCAATTTTAGGTGCTATCAACTTCATCACCACAGCAATCAACATAAAATCCCCTGCCATATCACAATACCAAACACCCTTATTCGTATGATCCGTACTAATTACAGCTGTCCTATTA
CTACTTTCTCTACCAGTACTCGCTGCAGGTATTACCATACTACTCACAGACCGAAATCTAAATACAACCTTCTTTGATCCTTCAGGGGGAGGAGATCCAATCCTATATCAACACCTATTCTGATTCTTTGGACACCCTGAAGTATACATCCTAATCCTCCCAGGATTCGGCATAATCTCCCACATTGTCACCTATTATGCCGGCAAAAAAGAACCATTCGGCTACATAGGAATAGTTTGAGCAATAATATCCATTGGTTTCCTAGGCTTCATTGTATGAGCTCACCACATATTCACCGTTGGAATAGACGTGGATACACGAGCTTACTTCACATCTGCAACAATAATCATTGCCATTCCAACAGGAGTAAAAGTATTCAGCTGATTAGCCACCCTACATGGTGGAATAATTAAATGAGATGCCGCCATACTCTGAGCCCTAGGCTTTATCTTTCTTTTCACTATTGGAGGATTAACAGGTATTGTATTAGCCAATTCATCATTAGACATTGTACTACACGATACCTACTACGTAGTAGCACATTTCCACTATGTTCTTTCAATAGGGGCCGTATTTGCTATCATAGCAGGATTCACTCATTGATTCCCCCTTTTCACAGGATATTCACTACACCAAACCTGAACAAAAGTACATTTCGGAGTAATATTTACAGGCGTTAACATAACCTTCTTCCCCCAACATTTCTTAGGACTAGCTGGAATACCACGACGCTACTCAGATTATCCAGATGCATACACCCTATGAAACTCCATCTCATCAATCGGATCTTTAATTTCTATAGTAGCAGTAGTTATAATAATATTTATTATTTGAGAAGCATTTTCCTCAAAACGAAAAGTATCAACAGTAGAACTCACAACCACCAACGTAGAATGACTACACGGCTGCCCTCCCCCATATCACACCTACGAAGAACCAGCCCATGTACAAACCCAAGAAAGGAGGGAATCGAACCCCCTTAAATTAGTTTCAAGCCAACCACATAGCCTCTATGTTTCCTTCTTTAAAGACGTTAGTAAAACATATTACCTAACCTTGTCAAGGTTAAATTATAGGTGAAATCCCTTTACGACTTAATGGCACATCCCCTTCAATTAGGATTCCAAGATGCAATATCACCAATTATAGAAGAACTCCTTCATTTCCACGACCACACTTTAATAATTGTATTCTTAATTAGCACCCTAGTACTCTATATCATCACACTAATAATAACAACAAAACTAACATACACCAACACTATAAATGCTCAAGAGGTAGAAATAATTTGAACCATTTTACCAGCTATTGTCTTAATTACTATTGCACTCCCATCGCTACGAGTACTATACCTAATAGACGAAATCAATAACCCACATTTAACCATCAAAGCCATAGGACATCAATGGTATTGAACATACGAATATACTGACTACGAAAACCTTGAATTCGACTCTTACATAATTCCAACCCAAGATCTACCAAACGGACACTTCCGATTACTAGAAGTAGATCACCGCATAGTAATACCAATAGAATCACCAATCCGAATATTAATCTCAGCTGAAGACGTCTTACACTCATGAGCAGTACCATCACTAGGCGTAAAAACAGATGCAATCCCAGGACGATTAAATCAAGCAACATTCATCATTACCCGACCAGGAGTATTCTTTGGACAATGTTCAGAAATTTGTGGAGCCAACCACAGCTTCATACCAATCGTAGTAGAATCCGTACCCTTATCACACTTTGAAGACTGATCTTCATTAATGCTTTCCTAACACTATAGAAGCTAAACAGGATAGCGCTAGCCTTTTAAGCTAGAAAAAGAGAACTCCCCACTCTCCTTAGTGACATGCCTCAACTAAACCCAAATCCGTGACTTATAATCTTACTCTCCGCATGACTAATTTACACCATTATTTTACAACCAAAAATCTCATCTTACTTACCTACAAATAACCCAACCAACAAAAACAATAAAACCAACACAAACC
CCTGAACCTGACCATGAACCTAACATTCTTCGACCAATTCATAAGCCCACAAATCCTAGGAATCCCATTAACTACCCTAGCCCTACTAATACCATCAACCCTCTTCCCCACCCAAAACACCCGATGATTAACTAACCGTCTTTCAACACTCCAATCATGAACAATCAACTCATTCACAAAACAACTAATACTTCCAATTAATAAAACAGGCCACCAATGATCCATTATCCTAACATCATTAATAACCATACTATTAATAATCAATTTACTAGGCCTTCTACCATACACCTTCACCCCTACTACACAACTTTCCTTAAACATAGGACTAGCCATCCCAATATGACTAGCCACAGTACTCACAGGACTTCGAAATCAACCAACAACATCACTAGGACACCTATTACCAGAAGGCACCCCAATCCTATTAACCCCTATTCTCATTATCATTGAAACAATCAGCTTATTCATCCGACCATTAGCTTTAGGCGTACGACTCACAGCCAACCTAACAGCCGGACATCTACTAATTCAACTTACCTCAACCGCAGTACTAACCCTACTTCCAATAATACCTACTCTATCATTACTAACTATAGTTATTCTTTTATTACTAACAATTCTAGAATTAGCCGTAGCCATAATTCAAGCTTACGTATTTGTTCTTCTATTAAGCTTATATTTACAAGAAAACACTTAATGGCCCACCAAATACACGCCTACCACATAGTTGACCCAAGCCCATGACCACTAACAGGGGCAGCAGCAGCACTACTAATAACCTCAGGACTCGCCACATGATTCCACTACAACTCAACACTATTAATAACCCTAGGCTTACTAACTATACTTCTAACCATATTTCAATGATGACGAGATATCATCCGAGAAGGAACCTTCCAAGGACATCACACACCCCCAGTACAAAAAGGCTTACGATATGGTATAATCTTATTTATTACATCAGAAGTATTTTTCTTCATCGGATTTTTCTGAGCTTTTTACCATTCAAGTCTAGCCCCTACCCCAGAACTAGGAGGATGTTGACCACCTACAGGAATCACACCACTAAACCCATTTGAGGTCCCCTTATTAAACACAGCAGTATTATTAGCATCAGGAGTAACAATCACCTGAGCTCACCATAGCCTAATAGAAATAAACCGAAATCAAACCACCCAAGCCCTAACTATCACAATTTTACTAGGACTATATTTCACAGCATTACAAGCTATAGAATATTACGAAGCCCCATTCACAATCGCCGATGGAGTGTACGGCTCAACATTTTTTGTCGCAACAGGCTTTCACGGACTCCACGTAATTATTGGCTCAACATTCCTAATTGTTTGCTTACTACGACAAATTAAATTCCATTTTACCTCCACTCACCACTTCGGATTTGAAGCAGCCGCCTGATACTGACACTTCGTAGACGTTGTATGACTCTTCCTCTACGTTTCAATCTACTGATGAGGCTCATGCTCCCCTAGTATAACAGTACAAGTGACTTCCAATCACTAAGTTTTAGTTCAACCCTAAAGAAGAGCAATGAACGTAACAACATCCATCATCACAATAGCCTCCATTCTCTCCATAATCCTAATAATATTAAACTACCAATTAACATTAACAAAACCAGATAACGAAAAATTATCCCCATATGAATGTGGCTTTGACCCACTAGAATCAGCCCGTCTACCATTCTCAATTCGATTTTTCCTTAGTAGCCATCTTATTCCTGCTATTCGACTTAGAAATTGCACTACTCCTACCACTACCATGAGCCACCCAACTCCCATCTCCAACCTCAACCCTTACCTGAACCATTATTATTTTACTTCTCCTAACACTAGGCCTTATTTATGAATGAATCCAAGGAGGCTTAGAATGAGCAGAATAGGCAACTAGTCTAACACAAGACAACTAATTTCGACTTAGTTAATCATGATTAAACTTCATGGTTTCCCAATGACACCATTACACTTCAGCTACCACTCCGC
CTTTATTATTAGCATTATAGGCCTCTCATTACACCGAACCCACTTAATCTCAACTCTACTATGTCTAGAAAGCATAATATTATCCCTATTCATTGCTCTAGCAATATGACCCACCCAATTACAAACCCCATCACTTATAATCACCCCAATACTAATACTATCCTTCTCAGCCTGTGAAGCTGGCATAGGCCTATCTTTATTAGTAGCATCCTCACGAACCCATGGTTCAGACCAACTACAAAACCTAAACCTTTTACAATGCTAAAAATCCTACTCCCTACAATTATATTATTACCAACAATCACACTATGTAAACCAAAACAACTATGACCTTCCACATTAATCCATAGCCTAACAATTGCTACTCTAAGCTTACAATGATTTAAACCTTCCATAGAACCAACCATAAACTTTTCTAATAGCTATCTAGGAATAGACCAAACCTCAGCCCCACTATTAATCCTATCATGCTGACTTACCCCTATAATAATCTTGGCTAGCCAAAACCACTTAGCTACCGAACCAACCTCACGAAAACGAACCTTCACCTTCACTATCATCTCACTACAAATTTCACTAATACTAGCTTTCTCAACCATAGAACTAATTATATTTTTTATTGCATTTGAAACTACACTTATCCCAACACTAATAATTATCACACGATGAGGCAACCAAATAGAACGACTAAATGCAGGAACTTACTTTCTATTTTATACCCTCATTGGATCTCTTCCACTACTAATCGCTCTACTATCCCTAAACACTGAAAACGGTTCCTCATCAATATACACAATACAACTAAATCAACCTATCATACCAAACTCATGAACCCATACAACATGATGATTCGCACTACTAATAGCTTTTATAATCAAAATACCACTATACGGTTTACACTTATGACTACCAAAAGCACATGTAGAAGCCCCAATTGCAGGCTCAATAATTCTAGCTGCAGTATTACTAAAACTAGGAGGATATGGTATCATCCGCATTACAATAATACTAAATCCCCTGTCAAAAACACTTTCCTACCCTTTTATGGTACTCGCATTATGAGGAGTAATCATAACTAGCTCTATCTGCTTACGACAAACAGATCTAAAATCACTAATTGCCTATTCATCAGTAAGCCATATAGGCCTTGTCATCGCCGCAACACTAACACAAACTCAATGAGCCTACACCGGCGCAATTACACTTATAATCGCTCATGGCTTAACATCATCAATACTTTTTTGCCTAGCTAATACAAATTACGAACGAACTCACAGCCGAACACTACTATTAGCCCGAAATATACAACTCCTACTCCCCCTAATAAGCCTATGATGACTACTTGCTAGTTTAACTAACATAGCCCTTCCACCAACCATTAACCTAATAGGAGAATTAACCATTATTACTTCACTATTTAACTGATCCAACATTACAATCCTAATAACAGGATTAGGAACCCTAATCACCGCTACTTACACCCTATACATATTATCTACAACACAATGAGGGGAAACACCTTCATACATCAAAACTATCCCCCCAACCCACACACGAGAACATCTCTTAATATCATTACACATCCTACCAATAATTCTACTAATAACAAAACCAGAACTAATCTGAGGCTCCTTCTACTGTTAATATAGTTTCAAAACAAACATTAGACTGTGGCTCTAAAAATAGGAGTTAAAATCTCCTTATAAACCGAGAGAGGTATAATACAATAAGAACTGCTAACTCCTATATCTGAGATTAATCCCTCAGCTCCCTCACTTTTAAAGGATAGAAGTAATCCACTGGTTTTAGGAACCACGAACCCTTGGTGCAATTCCAAGTAAAAGTAATGACCACACTAATAAATTCAACCTTCCTCTTAGCCCTAATCACCCTAATATTTCCACTAACAACAACTTACCCAAAAACATGAACACCTCTAAAAACAAAAACAGCTGTAAAAATAGCATTCTTCATCACCCTAATCCCACTAATCGCCTTCA
TTTATACAGACATTGAATCTGTTATCACCAACCTACACTGATCAACCACATCCACATTCACCATAAACATAAGCTTTAAACTTGACAAGTACTCCATCATGTTCGTCCCAATCGCCTTATACGTCACATGATCCATCCTAGAATTTACACACTGATACATAGCTACTGACCCTTATATCACAAAATTTTTCAAATACCTACTAATTTTCCTAGTAGCCATAATAATCCTAGTAACAGCCAACAACATATTTCAATTCTTTATTGGCTGAGAAGGAGTAGGAATCATATCCTTCCTCTTAATCGGATGATGATCCGGCCGAACAGAAGCAAACTCATCAGCCCTACAAGCCATTATTTACAACCGTATCGGAGACATCGGACTAATCCTCAGTATAGCCTGACTATCAATAAACCTAAACACATGAGAACTCCAACAAATCTTTACCCACACCAATCTCACCCCACTACTTCCACTCCTAGGATTAATCCTAGCCGCAACAGGAAAATCAGCCCAATTCGGCCTCCACCCCTGACTACCAGCAGCTATAGAAGGCCCCACCCCAGTTTCAGCATTACTACACTCAAGCACTATAGTAATCGCTGGAATCTTCCTACTAATCCGAATACACCCCATCTTAACCACCAACAACACAGCCCTCTCAACCTGCCTTTGCCTAGGAGCCATCACCACACTATTTACAGCTTTTTGCGCCCTCACCCAAAATGATATCAAAAAAATTATTGCCTTCTCCACATCAAGCCAACTAGGCCTTATAATAGTAACTATCGGCCTAAACCAACCACAACTAGCCTTCCTACATATCTCCATACACGCATTCTTCAAAGCCATACTATTCTTATGCTCAGGTTCCATTATTCATAATCTAAACAACGAACAAGATATTCGAAAAATAGGAGGACTACACAAATCCCTACCAATTACCTCCTCATGCCTAACTATTGGTAGCATAGCACTCACAGGCATACCATTTATAACTGGATTCTATTCTAAAGACATCATTATCGAAACCATAAATACATCATACATAAACGCCTGAGCCCTACTCCTAACACTAACCGCAACCTCATTCACTGCAATTTACAGCTTTCGCATCTTAATCTTCGTACAAACAGGACACCCACGATACCCCTCCACACTCCTACTAAACGAAAACAACCCAACAATTATCAACCCAATCACCCGCCTCGCAATAGGAAGCATCGTCGCAGGCTTACTCATTTCACTAAATATCACACCATTAAAAACTCCCCCAACAACTATACCAACATACATTAAAATCACGGCACTAACAGTAACAATCCTAGGCCTACTACTAGCCTTAGAACTAATCACAATAGTAAACAAAACCCAAAAACCCTCCAACACCCATAATTTCTCAAATTCACTAGCATACTTCAACACCCTAATACACCGTTCACTACCAATAGTGAACTTAAAATTTAGCCAAAACATCGCAACACATCTAATCGACCAGTCCTGATATGAAAACGTTGGCCCAAAAGGACTAAGCAAATCACAAATCACCCCAATTACAACTTCATCCGCGTCACAAAAAGGCCTCATTAAAATCTATATAACTTCATTCATCCTATCAATAACATTACTACTTCTCATCACCTAATCGGACGAAGTACCCCACGAGATAAACCACGAACCAACTCCATCACAACAAATAACGTCAACAATAACCCTCAACCAGCAATTAAAAACAACCAACTACCAAAATAATAAAACCATGCCACCCCACTAAAATCTAATCGAACAACAAACAACCCACCAGCATCAATTGTAACACTACCATACCCTTCCATACTCCACAGTCAATAACACATTGCCCCATACCCCACAAGAACTAAAATATAACTCACCACATAAATTATCCCACCTCGATTACCTCAAGCAACAGGATAAGGCTCTTCCACCAACGCAGATGAATAAGCAAAAATAACTAATATACCACCTAAATAAATTAAAAACA
ATACTACAGAAACAAAAGACCCCCCTATACTAACTAACAACCCACAACCAAAAGCCGCCCCAAGAACCAAACTTAAAACTCCATAATATGGGGACGGATTACAAGACACACCAACCATCCAAAAAACAAAACAAAACCCAAATAAAAATGCAAAATACATCATAATTCTTGCCTGGACTCTAACCAAGACCAATGATTTGAAAAACCACCGTTGTATTCAACTACAAAAACCTAATGGCCACAAACCTACGAAAAACCCACCCAATAATAAAAATCATCAACAATTCACTCATCGACTTACCAAGCCCCTCCAACATCTCTGCATGATGAAACTTCGGATCACTACTAGCCACCTGTCTAGCACTACAAATCATTACCGGAATCTTCCTAGCAATACATTACTCACCAGACATCTCCATAGCCTTTTCATCAATTACCCACATCACCCGAGATGTACAATACGGATGACTCATCCGCAACATGCACGCCAACGGAGCCTCCCTATTTTTCATCTGCATCTACCTCCACATCGGACGAGGAATCTACTACGGTTCCTATCTATACAAAGAAACCTGAAATACCGGAATCATCCTCTTACTACTAGTAATAGCCACCGCATTCGTAGGCTACGTCCTACCATGAGGGCAAATATCATTCTGAGGAGCTACCGTCATCACCAACCTACTATCAGCCATCCCATATATTGGTAACACATTAGTACAATGAATCTGAGGAGGATTCTCAGTAGACAACGCAACCCTAACTCGATTCTTCACCCTCCACTTCCTACTACCATTCGCCATTACCGGCCTTACAATAGTACATTTACTATTCCTACACGAAACAGGCTCAAACAACCCAACAGGACTAAACTCAAACACTGACAAAATCCCCTTCCACCCCTACTTCTCCTACAAAGACTTACTAGGACTTATCTTAATACTAACCTTTCTCCTAACCCTAACACTTTTCTCCCCATACCTACTAGGAGACCCAGACAACTTCACCCCAGCTAACCCCCTATCCACCCCTCCCCACATCAAACCAGAATGATACTTCCTATTCGCCTACGCAATCCTACGATCAATCCCAAACAAACTAGGCGGAGTACTAGCCCTACTATTCTCCATCCTAGTATTATTACTAATACCCACCTTACACACATCAAAACAACGAACAGCCTCATTCCGACCACTCACCCAAATCTTATTCTGATCCCTAGTAGCTGACCTACTAGTACTAACATGAATTGGAGGCCAACCAGTCGAAGATCCATTCATTACCATTGGTCAAATAGCCTCTAGCCTTTACTTCTTAATCTTACTTCTTCTAATACCCACAGCGGGCATAATCGAAAACAAAATACTAAAACTAAAATATTCTAGTAGCTTAACCCCAAAGCATTGGTCTTGTAAACCAAAGATTGAAAACTACAACTTTCCTAGAATAATCAAAAGAGAAGGGTTCAAACCTTCATCTCCGGTCCCCAAAACCGGAATCTTCCAATTAAACTACCCTTTGACGCAAAAGAAGCGCCAACATGTAAATTTACCTATATTCTCTGCCGTGCCCAACAGAATAATATCCATAATACCTATCTATGTATTATCGTACATCAACTTATTTACCACTAGCATATGATCAGTAATGTTGTCGATTAATCTGACCTTAAACATAAAAACTATTAATTTTGCATAAACTGTTTTAGTTACATGACTATTATACAGGTAATAGGAATGAAATGATATAGGACATAAAATTAAACCATTATTCTCAACCATGAATATCGTTACAGTAATAGGTTATTTCTTAGTTCAGCTCATCACGAGAAATAAGCAATCCTTGTTAGTAAGATACAATATTACCAGTTTCAAGTCCATTAAGTCATGTCGTACATAACTGATCTATTCTGGCCTCTGGTTGGTTTTTTCAGGCACATTAAGGCAGTAAAGTTCATTCGTTCCTCTTTAAAAGGCCTCTGGTTGCAAGTAAATGAGTTC
TATACATTAAATTTATAACCTGGCATACGGTGGTTTTACTTGCATGTGGTAGTCTTTTTTTTCTCTTTGTGTTCTCAGGCCCACATAACTGATACCTGCCGAATTGATGAAACTGAGCCTACGTTCAAAATGATTGGCCGTGCAGAATACTTAATGGTATTATTTAATTAATGCTTTTAGGACATATATTTTTATAAAAACTCACAACAGTTATTTACAAGCTAAAACCCATTACAACCATACTTTTTAGTTAAACCCCCCCACCCCCATAAACTAACATTATGCCCGAATAGCTATTCACTTCTCGTCAAACCCCTAAATCCGAGACTAACTAAACTGACACAACATTAATCGCATAAGCATTACACAAACTAATGAAACACTTACACTATACCTAAAAAGTACTAAAAACAATTCATCACACCTCTACTACACCCAACTAACCAAACATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTATATTA


NOVOPlasty: The Organelle Assembler
Version 4.3.5
Author: Nicolas Dierckxsens, (c) 2015-2024

Input parameters from the configuration file: Verify if everything is correct

Project:

Project name = 21103_1
Type = mito
Genome range = 15500-18000
K-mer = 50
Max memory =
Extended log = 0
Save assembled reads = no
Seed Input = /mnt/raid/users/livia/Caretta23/novoplasty_pipeline/seed/seed_COI.fa
Extend seed directly = no
Reference sequence = /mnt/raid/users/livia/Caretta23/novoplasty_pipeline/ref_mito/ZAK51.fasta
Variance detection = yes
Chloroplast sequence =

Dataset 1:

Read Length = 151
Insert size = 300
Platform = illumina
Single/Paired = PE
Combined reads =
Forward reads = /mnt/raid/users/livia/Caretta23/Trimmomatic_output/Trimmomatic_output_new/21103-1_S63_f_paired.fastq.gz
Reverse reads = /mnt/raid/users/livia/Caretta23/Trimmomatic_output/Trimmomatic_output_new/21103-1_S63_r_paired.fastq.gz
Store Hash =

Heteroplasmy:

MAF =
HP exclude list =
PCR-free =

Optional:

Insert size auto = yes
Use Quality Scores =
Reduce ambigious N's =
Output path = /mnt/raid/users/livia/Caretta23/novoplasty_pipeline/output_files/output_failed_samples/

Subsampled fraction: 95.79 %
Forward reads without pair: 782896
Reverse reads without pair: 340107

Retrieve Seed...

Initial read retrieved successfully: ACCCTACCTGTGTTTTTAACCCGTTGATTCTTTTCCACCAACCATAAAGACATCGGCACCCTATACCTAATCTTCGGGGCATGAGCAGGAATAGTAGGCACAGCACTCAGCCTATTAATCCGCGCAGAACTAAGC

Start Assembly...

-----------------Assembly 1 finished successfully: The genome has been circularized-----------------

Contig 1 : 15553 bp
(Check manually if the two contigs overlap to merge them together!)
Contig 2 : 1072 bp

Total contigs : 2
Largest contig : 15553 bp
Smallest contig : 1072 bp
Average insert size : 276 bp

-----------------------------------------Input data metrics-----------------------------------------

Total reads : 20305262
Aligned reads : 17592
Assembled reads : 12438
Organelle genome % : 0.09 %
Average organelle coverage : 160


ndierckx commented 4 months ago

In this case, the AT repeat caused the assembly to stop; Illumina quality and depth usually drop a lot there. If the assembly is uncertain, NOVOPlasty uses the paired-read information to jump over the problematic region and continues from there in both directions. Sometimes it merges the contigs automatically, and sometimes it asks you to do it manually. I guess the second contig also started with the AT repeat?
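To see where such a repeat sits in a contig, a small script can scan the sequence for simple tandem repeats. This is only a rough sketch and not part of NOVOPlasty: the function name `find_tandem_repeats` and its thresholds (`min_unit`, `max_unit`, `min_copies`) are made up for illustration, and it only finds perfect repeats, not copies with mismatches.

```python
import re

def find_tandem_repeats(seq, min_unit=2, max_unit=6, min_copies=5):
    """Return (start, end, unit, copies) for perfect tandem repeats in seq.

    Hypothetical helper: the unit-length range and copy threshold are
    arbitrary illustration values, not NOVOPlasty settings.
    """
    # Lazy {m,n}? makes the regex try the shortest repeat unit first.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((m.start(), m.end(), unit, len(m.group(0)) // len(unit)))
    return hits

# Toy input: 8 copies of AT flanked by unrelated bases.
print(find_tandem_repeats("GGC" + "AT" * 8 + "CCG"))
```

Coordinates are 0-based with an exclusive end, so they can be used directly as Python slice bounds on the contig string.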

LiviaT93 commented 4 months ago

Well, not really. The first contig starts with the AT repeats and stops more or less at the end of the Val tRNA (which is around 1000 bp after the end of the AT repeats); the second contig starts where the first finishes and ends in the AT repeats. What I was curious about is why the interruption is around the Val tRNA. This occurs in several of my samples.
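For the manual check that the log asks for ("Check manually if the two contigs overlap"), one option is to test whether the end of one contig exactly matches the start of the other. This is a minimal sketch under stated assumptions: the helper names and the `min_len` threshold are hypothetical, both contigs are assumed to be on the same strand, and only exact matches count. Inside a tandem repeat an overlap can be ambiguous, so a reported overlap there should still be checked against the reads.

```python
def suffix_prefix_overlap(a, b, min_len=20):
    """Length of the longest suffix of a equal to a prefix of b.

    Returns 0 if no exact overlap of at least min_len bases exists.
    """
    min_len = max(min_len, 1)  # a[-0:] would be the whole string
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def merge_if_overlapping(a, b, min_len=20):
    """Join a and b on their exact end overlap, or return None."""
    k = suffix_prefix_overlap(a, b, min_len)
    return a + b[k:] if k else None

# Toy contigs sharing the 7-base end "TTGACCA".
contig1 = "ACGTACGTTTGACCA"
contig2 = "TTGACCAGGGTTT"
print(suffix_prefix_overlap(contig1, contig2, min_len=5))
print(merge_if_overlapping(contig1, contig2, min_len=5))
```

If the function returns 0 for both orders (contig 1 then contig 2, and contig 2 then contig 1), the contigs do not overlap exactly and a gap or a low-quality region probably separates them.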


ndierckx commented 4 months ago

I can only tell why it terminated by looking at the extended log file; you need to run it with that option set to 1. If the question is a biological one, I can't help you there; I am not a specialist.
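When several samples need to be rerun, the option can be switched on with a script instead of editing each configuration file by hand. The sketch below assumes the configuration file uses the same `Key = value` lines that the log above echoes (for example `Extended log = 0`). The function name `set_config_value` is hypothetical and not part of NOVOPlasty.

```python
import re

def set_config_value(config_text, key, value):
    """Return config_text with the first 'key = ...' line set to value.

    Assumes one 'Key = value' pair per line, as echoed in the run log.
    """
    pattern = re.compile(
        r"^([ \t]*%s[ \t]*=).*$" % re.escape(key), re.MULTILINE
    )
    new_text, n = pattern.subn(
        lambda m: "%s %s" % (m.group(1), value), config_text, count=1
    )
    if n == 0:
        raise KeyError(key)  # key not present in the configuration text
    return new_text

# Toy excerpt using keys and values taken from the log above.
config = "Project name = 21103_1\nExtended log = 0\nSave assembled reads = no\n"
print(set_config_value(config, "Extended log", 1))
```

The helper works on the text only, so reading and writing the actual file (and choosing its path) is left to the caller.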