ndierckx / NOVOPlasty

NOVOPlasty - The organelle assembler and heteroplasmy caller

the contig is the same as the seed #155

Open yeseniavegasanchez opened 3 years ago

yeseniavegasanchez commented 3 years ago

Hello, I'm trying to assemble the mitogenome of a damselfly. I have ddRAD data (2x138 bp paired-end reads), and I'm using a COI fragment (629 bp) from the same sample as the seed. When I set "Extend seed directly" to no, the log says that the seed is invalid. When I set it to yes, the run works, but the result is a single contig that is basically the seed sequence. Is there something wrong with my config file, or what do you think the problem is? I've attached the extended log file: log_extended_mitoNR1.txt

Thank you in advance. Best,

Project:

Project name = mitoNR1
Type = mito
Genome range = 12000-18000
K-mer = 21
Max memory =
Extended log = 1
Save assembled reads = yes
Seed Input = Seed.fasta
Extend seed directly = yes
Reference sequence = Vestalis_melania.fasta
Variance detection =
Chloroplast sequence =

Dataset 1:

Read Length = 138
Insert size = 200
Platform = illumina
Single/Paired = PE
Combined reads =
Forward reads = NR1.1.fq
Reverse reads = NR1.2.fq

Heteroplasmy:

Heteroplasmy =
HP exclude list =
PCR-free =

Optional:

Insert size auto = yes
Use Quality Scores =

ndierckx commented 3 years ago

If the contig is the same length as the seed, it means nothing was assembled. That option should only be used if you want to extend existing assemblies. You have RAD-seq data, so will the complete genome be covered with this method? I have no experience with it.
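To make the coverage question concrete, here is a rough sketch (not part of NOVOPlasty) that estimates how much of a mitogenome could be covered by reads anchored at restriction sites. It assumes, optimistically, that every cut site is sequenced in both directions for one full read length, and it treats the genome as linear. The function name `covered_bases` and the cut-site positions are hypothetical. Real ddRAD data is usually sparser still, because only fragments flanked by both enzymes and within the size-selection window are sequenced.

```python
def covered_bases(genome_length, cut_sites, read_length):
    """Count bases lying within read_length of any cut site.

    Simplifying assumptions (not taken from the thread): linear genome,
    and every cut site is read in both directions for a full read length.
    """
    # One interval per cut site, clamped to the genome boundaries.
    intervals = sorted(
        (max(0, site - read_length), min(genome_length, site + read_length))
        for site in cut_sites
    )
    total = 0
    current_start = current_end = None
    for start, end in intervals:
        if current_end is None or start > current_end:
            # Close the previous merged interval and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping interval: extend the current merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Hypothetical cut-site positions on a 16 kb mitogenome, 138 bp reads.
genome_length = 16000
covered = covered_bases(genome_length, [1000, 5000, 5100], 138)
print(f"{covered} of {genome_length} bases could be covered")
```

With only a handful of cut sites, most of the genome has no reads at all, so an assembler that extends a seed read by read would stall at the first gap.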

roomfortwo commented 3 years ago

Hey @yeseniavegasanchez, I was having the same issue, and the problem is either the k-mer length or the seed (you might need a more discriminating one). Using COI as your seed might be problematic if the species you're studying has a large number of NUMT-associated sequences, or if the sequences flanking COI are repetitive (that was my problem). I hope this helps,
Cheers,
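One quick way to tell a seed problem from a k-mer problem is to check how many of the seed's k-mers actually occur in the reads at different k values. The sketch below is an independent sanity check, not NOVOPlasty's internal seed validation. The helper names (`seed_kmer_support`, `read_fastq_sequences`) are hypothetical, and the FASTQ reader assumes plain four-line records.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def seed_kmer_support(seed, reads, k):
    """Fraction of seed k-mers found in at least one read, on either strand."""
    seed_kmers = kmers(seed.upper(), k)
    if not seed_kmers:
        return 0.0
    read_kmers = set()
    for read in reads:
        read = read.upper()
        read_kmers |= kmers(read, k)
        read_kmers |= kmers(reverse_complement(read), k)
    return len(seed_kmers & read_kmers) / len(seed_kmers)


def read_fastq_sequences(path):
    """Yield the sequence line of each record (assumes 4-line FASTQ records)."""
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                yield line.strip()


# Tiny in-memory example with made-up sequences.
seed = "ACGTACGTTAGC"
reads = ["ACGTACGT"]
print(seed_kmer_support(seed, reads, 4))
```

On real data, something like `seed_kmer_support(seed_sequence, read_fastq_sequences("NR1.1.fq"), 21)` could be compared across a few k values. Support near zero at every k suggests the reads do not cover the seed region at all, while support that rises as k drops points toward the k-mer setting.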