The annotation.csv file exists and contains annotations ((...)/OUT_SampleXXX/SampleXXX.T2c1c2.annotation.csv), but MToolBox does not see it.
The MToolBox_SampleXXX.log file reports: "No annotation.csv found. Exit"
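For reference, this is a minimal sketch of how the presence of the file can be checked independently of MToolBox. The helper name list_annotation_files and the default directory OUT_SampleXXX are placeholders for illustration (the real path is masked), and this says nothing about how MToolBox itself searches for the file.

```shell
#!/bin/bash
# Hypothetical helper (not part of MToolBox): list every file whose name ends
# in "annotation.csv" under the given directory, one path per line.
list_annotation_files() {
    local out_dir="$1"
    # Errors from a missing directory are silenced so the output is simply empty.
    find "$out_dir" -type f -name '*annotation.csv' 2>/dev/null | sort
}

# Placeholder default: replace OUT_SampleXXX with the real output directory.
out_dir="${1:-OUT_SampleXXX}"
found="$(list_annotation_files "$out_dir")"

if [ -n "$found" ]; then
    echo "annotation files found:"
    echo "$found"
else
    echo "no *annotation.csv under $out_dir"
fi
```

Running it from the working directory used for the MToolBox run shows whether the file sits where the final step would be expected to look, or only in a subdirectory.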
Here is the MToolBox_SampleXXX.log file content.
Thank you for your help.
Note: sample names have been replaced by SampleXXX, and paths are masked.
(...)
setting up MToolBox environment variables...
...done
setting up MToolBox variables in config file ...
...done
Check python version... (2.7 required)
OK.
Checking files to be used in MToolBox execution...
Checking mapExome parameters...
OK.
Checking assembleMTgenome parameters...
OK.
Checking mt-classifier parameters...
OK.
Input type is fastq.
output files will be placed in (...)
EXECUTING READ MAPPING WITH MAPEXOME...
mapExome for sample SampleXXX, files found: SampleXXX.fastq
Mapping onto mtDNA...
(...)
Extracting FASTQ from SAM...
Mapping onto complete human genome...single reads
Reading Results...
Filtering reads...
Outfile saved on (...)
Done.
SAM files post-processing...
SORTING OUT.sam FILES WITH PICARDTOOLS...
Success.
REALIGNING KNOWN INDELS WITH GATK INDELREALIGNER...
Done. There were no warn messages.
Success.
Skipping Mark Duplicates...
ASSEMBLING MT GENOMES WITH ASSEMBLEMTGENOME...
WARNING: values of tail < 5 are deprecated and will be replaced with 5
[mpileup] 1 samples in 1 input files
Set max per-file depth to 8000
##### GENERATING VCF OUTPUT...
Reference sequence used for VCF: RCRS
##### PREDICTING HAPLOGROUPS AND ANNOTATING/PRIORITIZING VARIANTS...
Haplogroup predictions based on RSRS Phylotree build 16
Your best results file is mt_classification_best_results.csv
Loading contig sequences from file SampleXXX-contigs.fasta
Loaded 1 contig sequences
Aligning Contigs to mtDNA reference genome...
Sequence haplogroup assignment
Classification according to tree: (...)/MToolBox-master_v1.0/MToolBox/data/phylotree_r16.pickle
genome_state is incomplete
OrderedDict([('T2c1c2', (57, 63, 63))])
====================
I'm looking for T2c1c2
------------------------------
Contig alignment to MHCS and rCRS
Aligning contigs to MHCS SeqDiff object
Writing results for sequence SampleXXX
Parsing pathogenicity table...
Parsing variability data...
Parsing info about haplogroup-defining sites...
Parsing info about haplogroup assignments...
(...)/SampleXXX_merged_diff.csv
Parsing variant data for sample SampleXXX...
Best haplogroup predictions for sample SampleXXX : ['T2c1c2']
Functional annotation for haplogroup T2c1c2
Success.
No annotation.csv found. Exit
*************************
My configuration file is:
#!/bin/bash
mtdb_fasta=chrM.fa
hg19_fasta=hg19RCRS.fa
mtdb=chrM
humandb=hg19RCRS
######################SET PATH TO INPUT/OUTPUT and PRE/POST PROCESSING PARAMETERS############################
##
##OPTIONAL. Specify the FULL PATH of the input directory. Default is the current working directory
##
input_path=somewhere
##
##OPTIONAL. Specify the FULL PATH of the output directory. Default is the current working directory
##
output_name=(...)/SampleXXX
##
##OPTIONAL. Specify the FULL PATH to the list of files to be analyzed if input_path was not defined. Default is use all the files with the specified file format extension
##in the current working directory and skip this option
##
list=(...)/SampleXXX/list.lst
##
##MANDATORY. Specify the input file format extension. [fasta | bam | sam | fastq | fastq.gz]
input_type=fastq
##MANDATORY. Specify the mitochondrial reference to be used for the mapping step with mapExome. [RCRS | RSRS; DEFAULT is RSRS]
ref=RCRS
##OPTIONAL. Specify if duplicate removal by MarkDuplicates should be set on. [false | true; DEFAULT is false]
UseMarkDuplicates=false
##
##OPTIONAL. Specify if realignment around ins/dels should be set on. [false | true; DEFAULT is false]
UseIndelRealigner=true
##
##OPTIONAL: specify if to exctract only mitochondrial reads from bam file provided as input. [false | true; DEFAULT is false]
MitoExtraction=false
MToolBox v1.0