Closed peflanag closed 3 years ago
Dear Peter,
As a brief aside, I would strongly recommend that you keep using the default H37Rv reference genome for all M. tuberculosis complex samples. All genomes within the M. tuberculosis complex are so similar that NGS reads from M. bovis samples will cover >99% of the H37Rv reference genome. Switching to an M. bovis genome is therefore not necessary to improve resolution. Staying with H37Rv also lets you keep the predefined annotation information on repetitive regions, resistance-associated genes, and specific mutations.
If you do want to switch to another reference genome, you also need to create the _genes.txt file. I agree that the relevant part of the manual could be read as saying that this file is optional: "In order for MTBseq to provide gene annotations, a respective annotation file with the extension _genes.txt needs to be placed in the same directory. For file formatting, follow the example of the existing annotation files, e.g. in the M._tuberculosis_H37Rv_2015-11-13_genes.txt file." However, if you do not need the gene annotations, it is sufficient to create an empty text file. In that case, MTBseq will of course not calculate amino acid exchanges either.
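For illustration, the empty placeholder file could be created with a few lines of Python. This is only a minimal sketch: the helper name `ensure_genes_file` is hypothetical, and it assumes the annotation file is named after the value passed to `--ref` with `_genes.txt` appended and sits in the same directory as the reference FASTA.

```python
from pathlib import Path


def ensure_genes_file(ref_dir, ref_name):
    """Create an empty <ref_name>_genes.txt in ref_dir if none exists.

    Assumption: the file name is the --ref value plus "_genes.txt".
    """
    genes_path = Path(ref_dir) / f"{ref_name}_genes.txt"
    # touch() creates a zero-byte file; with exist_ok=True an existing
    # annotation file is left in place and its contents are not altered.
    genes_path.touch(exist_ok=True)
    return genes_path
```

It would be called as `ensure_genes_file("path/to/ref", "Mbovis_AF2122_97")`, where `path/to/ref` stands in for wherever the reference FASTA was placed.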
Just to explain: the additional files that were created are used for the reference mapping process.
best wishes, Thomas
Hi Thomas,
Cheers for the comment and explanation on this!
Hi,
I placed an M. bovis FASTA reference file in the ref path and ran MTBseq with the following command:
MTBseq --step TBfull --ref Mbovis_AF2122_97 --distance 5 --threads 1
The program ran until it failed with this error: