mitoNGS / MToolBox

A bioinformatics pipeline to analyze mtDNA from NGS data
http://sourceforge.net/projects/mtoolbox/?source=navbar
GNU General Public License v3.0

Getting Error in "ASSEMBLING MT GENOMES WITH ASSEMBLEMTGENOME..." step #108

Closed by Nirmal2310 2 years ago

Nirmal2310 commented 2 years ago

Hi, I opened an issue in December but haven't received any response, so I am opening another one. Please help me out if possible.

I am using MToolBox to assemble the human mitochondrial genome with simulated reads as input. I am getting an error in the assembly part of the pipeline, and I am attaching the configuration file and the log file of the run here. Could you please suggest what is wrong with my pipeline and how I can resolve it?

My command: MToolBox.sh -i sim_hiseq_exponential.conf -a "-t 16" -m "-t 16"

Config file: sim_zeroinflatedlognormal.txt

There are 250K reads in total per fastq file.

Log file: mtoolbox.log

Please tell me if you need anything else from my side. Thank you very much.

clody23 commented 2 years ago

Hi @Nirmal2310, apologies for the huge delay in replying. From your log file it seems there is no error; rather, the simulated data do not produce any VCF file, and consequently no annotation or haplogroup prediction either.

Did you use rCRS to generate the simulated dataset? If so, it could be that, because your simulated data are based on rCRS and the reference sequence you chose is also rCRS, the variant calling does not find any variant alleles. This would explain the warning in the log file ("Heteroplasmy data file ('VCF_dict_tmp') not found. HF will not be reported in the output.") and the fact that you did not get any VCF file.
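To illustrate why reads simulated from the reference yield no variant alleles, here is a toy sketch. It is not MToolBox code: `call_snvs`, the short `reference` string, and the gapless read layout are all assumptions made for the example, and a real caller also handles indels, base quality, and heteroplasmy thresholds.

```python
def call_snvs(reference, aligned_reads):
    """Toy mismatch caller (hypothetical helper, not part of MToolBox).

    aligned_reads is an iterable of (start, read_sequence) pairs, where
    start is the 0-based offset of a gapless alignment on the reference.
    Returns sorted (position, ref_base, alt_base) tuples, 1-based.
    """
    variants = set()
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            ref_base = reference[start + offset]
            if base != ref_base:
                variants.add((start + offset + 1, ref_base, base))
    return sorted(variants)


# Toy reference sequence standing in for the real one (assumption).
reference = "GATCACAGGTCTATCACCCT"

# Reads drawn from the reference itself, as in a simulation based on it.
reads_from_reference = [(i, reference[i:i + 8]) for i in range(0, 12, 4)]

# Reads drawn from a sample carrying one substitution at position 10.
sample = reference[:9] + "A" + reference[10:]
reads_from_sample = [(i, sample[i:i + 8]) for i in range(0, 12, 4)]

print(call_snvs(reference, reads_from_reference))  # no mismatches to report
print(call_snvs(reference, reads_from_sample))
```

With reads taken from the reference, every base matches, so the list of calls is empty. That is the situation in which no VCF would be expected.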

We will improve the error handling in this variant calling step in the new version of the pipeline: https://github.com/mitoNGS/MToolBox_snakemake/issues/15.

Please let us know whether this addresses your issue. Best wishes, Claudia

Nirmal2310 commented 2 years ago

Hi, thank you for the response. Indeed, I simulated the data using rCRS, and thank you for clearing up my doubt. One more question, if I may: is it possible to generate the assembled mitochondrial genome from the simulated data using MToolBox when no variant alleles are found, as in this case? Thank you in advance.

clody23 commented 2 years ago

Hi @Nirmal2310, no, at the moment it is not possible (and, in my opinion, not necessary) to generate an assembled chromosome if no variant alleles are identified. MToolBox performs a reference-guided assembly, so if no variants are found, the assembled genome will be identical to the reference sequence itself. You don't need to generate a new sequence, because the reference sequence you used already represents the genome of your target organism.
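As a rough illustration of that reasoning, the sketch below builds a consensus by applying called substitutions to a reference. It is a simplified assumption about how a reference-guided consensus behaves, not MToolBox's actual implementation; `build_consensus` and the toy sequences are hypothetical, and indels are ignored.

```python
def build_consensus(reference, snvs):
    """Apply substitutions to a reference (hypothetical helper).

    snvs is an iterable of (position, ref_base, alt_base) tuples with
    1-based positions. Indels are out of scope for this toy example.
    """
    consensus = list(reference)
    for position, ref_base, alt_base in snvs:
        if consensus[position - 1] != ref_base:
            raise ValueError(f"reference base mismatch at position {position}")
        consensus[position - 1] = alt_base
    return "".join(consensus)


reference = "GATCACAGGT"  # toy sequence, not the real reference

# No variants called: the consensus is the reference, unchanged.
print(build_consensus(reference, []) == reference)

# One substitution called: only that position differs.
print(build_consensus(reference, [(3, "T", "C")]))
```

With an empty variant list the function returns the input sequence as-is, which is why a separate assembled sequence adds nothing in this case.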

Does this clarify your point?

Nirmal2310 commented 2 years ago

OK, I got the point. Thank you so much for explaining this to me. With that, I am closing this issue. Once again, thank you so much.