ndierckx / NOVOPlasty

NOVOPlasty - The organelle assembler and heteroplasmy caller

Circularizing contig, expected to be larger than actual genome size #47

Closed Ajit81 closed 4 years ago

Ajit81 commented 6 years ago

Hi Nicolas, I got one contig of more than 20 kb after running NOVOPlasty. When I annotated it against the reference genome, I found that some CDSs are duplicated in the reverse direction. I extracted the regions containing the duplicated CDSs, reverse-complemented them, and aligned them. In the end, I obtained an mtDNA of around 15 kb. When I annotated it again with MITOS, all required genes were present.

I would like to know whether I can use this as a circularized genome. When I ran NOVOPlasty again using the extracted 15 kb mtDNA, it did not circularize.

Kindly give your suggestion.

Thanking you

Ajit
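
One way to sanity-check whether a contig like this closes into a circle is to look for an overlap between its two ends: a linear sequence assembled past its own starting point repeats its first bases at the end. The sketch below is a minimal, illustrative check and not NOVOPlasty's own circularization logic. The function name `terminal_overlap` and the `min_len` threshold are hypothetical, and only exact matches are detected.

```python
def terminal_overlap(seq: str, min_len: int = 20) -> int:
    """Return the length of the longest suffix of seq that equals a prefix of seq.

    Only proper overlaps (shorter than the sequence itself) of at least
    min_len bases are considered. Returns 0 when none is found.
    """
    seq = seq.upper()
    # Try the longest candidate overlap first and shrink until one matches.
    for length in range(len(seq) - 1, min_len - 1, -1):
        if seq.endswith(seq[:length]):
            return length
    return 0


def trim_overlap(seq: str, min_len: int = 20) -> str:
    """Drop the repeated end so the circular sequence is represented once."""
    overlap = terminal_overlap(seq, min_len)
    return seq[: len(seq) - overlap] if overlap else seq
```

A result of 0 on a manually trimmed sequence does not by itself show that the genome is incomplete. It only means that no exact end-to-end overlap remains in that sequence.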

ndierckx commented 6 years ago

Hi,

Sorry for the late response, but I am on holiday with a limited internet connection. You could run it again with the extended log option set to 1 and send me the log and the extended log. With the information so far, it is hard to say whether you have the complete sequence. Which species is it, and how close is the reference genome? Also, I don't think NOVOPlasty would incorrectly add the reverse complement.

Greets,

Nicolas
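
To check independently whether a contig really contains a stretch together with its reverse complement, one can search for k-mers whose reverse complement occurs elsewhere in the same sequence. The following is a rough sketch using exact k-mer matching. The function names and the default `k = 31` are illustrative assumptions, not anything taken from NOVOPlasty, and mismatches or indels between the two copies would break the matches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T, either case)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def inverted_repeat_positions(seq: str, k: int = 31) -> list[tuple[int, int]]:
    """Find k-mers whose reverse complement also occurs in seq.

    Returns (i, j) pairs with i < j, where the k-mer starting at i is the
    reverse complement of the k-mer starting at j.
    """
    seq = seq.upper()
    # Index every k-mer by its start positions.
    index: dict[str, list[int]] = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], []).append(i)

    hits = []
    for i in range(len(seq) - k + 1):
        rc = reverse_complement(seq[i:i + k])
        for j in index.get(rc, []):
            if i < j:  # report each pair only once
                hits.append((i, j))
    return hits
```

A long run of consecutive hits points to an inverted duplication. Whether that duplication is real biology or an assembly artifact still has to be judged from the read support, for example via the extended log.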

Ajit81 commented 6 years ago

Hi Nicolas, I used the COI gene sequence of the same species, Hyalella azteca. The extracted mtDNA size is 20018 bp. I have attached the log and extended log files for your reference.

Thanking you

Ajit

ndierckx commented 6 years ago

You should attach the files, not paste their content into the text box.

Greets

Ajit81 commented 6 years ago

Hi Nicolas, I actually replied by email and attached the log files there, but their content was copied into the GitHub text box automatically. I am sorry for the inconvenience. Please find the log and extended log files attached here.

Thanking you

With best regards

Ajit

log.txt log_extended_Hyalella_mtdna.txt

ndierckx commented 6 years ago

Hi,

How close is the reference you used? And could you run it again with version 2.6.7 and use a genome range of up to 30000?
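
For repeated reruns with different settings, the config file can be edited with a small script instead of by hand. The sketch below assumes that settings are stored as `Key = value` lines and that the keys are named `Genome Range` and `Extended log`. Those names are an assumption and should be checked against the config file shipped with the version in use. The helper `set_config_value` is hypothetical and not part of NOVOPlasty. Only the first matching line is changed, because a config template may repeat the key names in an explanatory section further down.

```python
import re


def set_config_value(config_text: str, key: str, value: str) -> str:
    """Replace the value of the first 'key = value' line in config_text."""
    pattern = re.compile(
        rf"^({re.escape(key)}[ \t]*=)[^\n]*$",
        re.MULTILINE,
    )
    # count=1 leaves later lines with the same key (e.g. descriptions) untouched.
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)} {value}", config_text, count=1
    )
    if count == 0:
        raise KeyError(f"{key!r} not found in config")
    return new_text


# Hypothetical usage; the lower bound 12000 is a placeholder, not a recommendation.
example = "Genome Range          = 12000-22000\nExtended log          = 0\n"
example = set_config_value(example, "Genome Range", "12000-30000")
example = set_config_value(example, "Extended log", "1")
```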

Ajit81 commented 6 years ago

Hi Nicolas, I used the COI gene of the same species (Hyalella azteca) as seed input and Parhyale hawaiiensis as the reference (COI gene similarity is 77%). I used a genome range of up to 30000 and then up to 40000.

I got one contig of 22730 bp. When I annotated it, I found the same CDSs in the reverse direction, as mentioned earlier. I am attaching the annotation file along with the log and extended log files for your reference.

I used version 2.6.7 this time.

Thanking you

Ajit

Contig01_10092522 Annotations.txt log.txt log_extended_Hyalella_mtdna.txt

ndierckx commented 6 years ago

Hi,

I made some improvements, so could you try again with version 2.6.9? :) And set the genome range to up to 40000.

ndierckx commented 6 years ago

Oh, sorry, there was a bug in 2.6.8 and 2.6.9 that caused the descriptions in the config file to be read again. It should be fixed in 2.7.0.