Open davidmaimoun opened 1 year ago
Hello @davidmaimoun,
Sorry for the delay, and thank you for your interest in chewBBACA. Based on the name of the file you've described, cgMLST.tsv, I assume that you performed allele calling with the AlleleCall module and that you determined the core-genome based on the allele calling results with the ExtractCgMLST module. The cgMLST.tsv file contains the allelic profiles of your samples (each row is a strain, and each column is a locus/gene that is present in at least a fraction --t of the strains, where --t is the loci presence threshold you passed to the ExtractCgMLST module, or the default of [0.95, 0.99, 1] if you did not pass any value). The allelic profiles tell you which alleles were found in your strains. You can find more information about the output files created by the AlleleCall and the ExtractCgMLST modules here and here. The cgMLST.tsv file has the same file structure as the results_alleles.tsv file created by the AlleleCall module, with the difference that it only includes the results for the loci in the core-genome.
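To illustrate the loci presence threshold described above, here is a minimal Python sketch. It is not the ExtractCgMLST implementation: the function names are hypothetical, and it assumes that any cell that is not a positive integer (after stripping an INF- prefix) counts as missing data.

```python
def is_present(call: str) -> bool:
    """Return True if the cell holds an allele identifier.

    Assumption: labels such as LNF or NIPH, and the value 0, mean
    that no valid allele was called for that locus in that strain.
    """
    value = call.strip()
    if value.startswith("INF-"):  # newly inferred allele, still a valid call
        value = value[len("INF-"):]
    return value.isdigit() and int(value) != 0


def core_loci(profiles: dict[str, dict[str, str]], threshold: float) -> list[str]:
    """Return the loci called in at least `threshold` (a fraction) of the strains.

    `profiles` maps strain name -> {locus name -> allele call}.
    """
    strains = list(profiles)
    loci = list(next(iter(profiles.values())))
    needed = threshold * len(strains)
    return [
        locus
        for locus in loci
        if sum(is_present(profiles[strain][locus]) for strain in strains) >= needed
    ]


# Toy allelic profiles (made-up values, three strains and three loci).
toy_profiles = {
    "strain1": {"locusA": "1", "locusB": "LNF", "locusC": "INF-3"},
    "strain2": {"locusA": "2", "locusB": "4", "locusC": "3"},
    "strain3": {"locusA": "1", "locusB": "NIPH", "locusC": "0"},
}
print(core_loci(toy_profiles, 1.0))
```

With a threshold of 1, only loci called in every strain are kept; lowering the threshold tolerates some missing data per locus.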
You can upload the files with the allelic profiles to GrapeTree or to PHYLOViZ to visualise a Minimum Spanning Tree (MST) and perform various dataset operations that allow you to explore and analyse the results (more information about uploading chewBBACA results to PHYLOViZ here). The values displayed in the MST branches correspond to the distance between the strains (the number of allelic differences based on all compared loci). The allelic distances are computed from the allelic profiles: a distance matrix is built with the number of allelic differences for each pair of strains.
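As a rough illustration of how such a distance matrix can be derived from allelic profiles, here is a minimal Python sketch. It is not the code used by GrapeTree or PHYLOViZ; the function names are hypothetical, and it assumes that missing data is encoded as "0" and that a locus is skipped when either strain lacks a call (the tools may handle missing data differently).

```python
from itertools import combinations


def allelic_distance(a: list[str], b: list[str], missing: str = "0") -> int:
    """Count the loci where both strains have a call and the alleles differ."""
    return sum(
        1
        for x, y in zip(a, b)
        if x != missing and y != missing and x != y
    )


def distance_matrix(profiles: dict[str, list[str]]) -> dict[str, dict[str, int]]:
    """Build a symmetric matrix of pairwise allelic distances."""
    names = list(profiles)
    dist = {n: {m: 0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = allelic_distance(profiles[n], profiles[m])
        dist[n][m] = dist[m][n] = d
    return dist


# Toy profiles (made-up values): one list of allele identifiers per strain,
# with the loci in the same order for every strain.
toy = {
    "A": ["1", "2", "3", "4"],
    "B": ["1", "5", "3", "0"],
    "C": ["2", "5", "3", "4"],
}
matrix = distance_matrix(toy)
print(matrix["A"]["C"])  # A and C differ at the first two loci
```

A branch labelled 2 between strains A and C in the MST would therefore mean that their profiles differ at two of the compared loci.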
I hope that I could help with my explanation. Feel free to let me know if there is anything else you would like to know.
Kind regards,
Rafael
Hi @rfm-targa
I'm writing here, as the title of this issue covers my question.
I would like to include chewBBACA in my analysis pipeline as a complement to another tool, cgMLSTFinder (from CGE).
With cgMLSTFinder, I used to get the complex type of the bacterial strain, and unfortunately, I cannot find in the chewBBACA documentation a way to retrieve the complex type from a chewBBACA analysis. I ran chewBBACA.py PrepExternalSchema to adapt the Enterobase scheme, then I ran chewBBACA.py AlleleCall. The output from the last module does not display the complex type.
What am I missing? Regards, Alexandre
@jacarrico @aplf
Hello @alexandreflageul,
Thank you for your interest in chewBBACA. chewBBACA does not assign CTs to bacterial strains. The main output of the AlleleCall module is the file containing the allelic profiles, results_alleles.tsv. The allelic profiles contained in this file can serve as the basis for subsequent analysis. You can import that file and sample metadata to PHYLOViZ to visualize an MST and explore the results through several dataset operations. If you want to cluster your samples to identify meaningful clustering levels and define CTs, I'd recommend ReporTree or HierCC. It might also be worth looking up the more recent concept of LIN codes.
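To give an idea of what clustering samples at a given level means, here is a minimal Python sketch of single-linkage clustering at a fixed allelic distance threshold. It is not a reimplementation of ReporTree or HierCC, which follow their own procedures; the function name, the input format and the thresholds are assumptions chosen for illustration only.

```python
def single_linkage_clusters(
    names: list[str],
    distances: dict[tuple[str, str], int],
    threshold: int,
) -> list[list[str]]:
    """Group samples connected by a chain of pairs at distance <= threshold.

    `distances` maps a pair of sample names to their allelic distance.
    """
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Follow parent links to the cluster representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    groups: dict[str, list[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Toy pairwise allelic distances (made-up values).
samples = ["s1", "s2", "s3", "s4"]
toy_distances = {
    ("s1", "s2"): 2,
    ("s2", "s3"): 4,
    ("s1", "s3"): 6,
    ("s1", "s4"): 50,
    ("s2", "s4"): 48,
    ("s3", "s4"): 51,
}
print(single_linkage_clusters(samples, toy_distances, threshold=5))
```

Running the same function at several thresholds shows how clusters merge as the threshold grows, which is the kind of information used to pick meaningful clustering levels.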
Let us know if there's anything else.
Kind regards,
Rafael
Hello! I'm new to the field and I need to use chewBBACA. At the end of the analysis, I get a file, cgMLST.tsv, in a visualization folder. Do the values in this file represent the allelic distance of each species from the schema alleles? When I run it with GrapeTree, I get branches with values. Can you explain to me what these values are?
Thank you