Open ramiroricardo opened 3 years ago
No problem! For which part of the pipeline would you like to adjust the threshold? PIRATE only explicitly separates genes into core and accessory at certain steps, such as when it generates the core alignment and when it plots the summary figures and tables. These steps could be tweaked to use a threshold that is more meaningful for your analysis, and they could most likely be run after your analysis has finished so that it does not have to be repeated.
Hi Sion,
Thanks for your reply. I think it would be great to have such a threshold when the core alignment is generated. Ideally, though, the same threshold would then also be applied to the summary plots and tables to keep everything consistent.
I will label this as an enhancement for the next release.
In the meantime, changing the outputs to support this is relatively simple. The gene alignments can be generated using the scripts in the PIRATE/scripts directory inside your PIRATE output directory:
alignment:
create_pangenome_alignment.pl --dosage 1.25 -t 99 -i PIRATE.gene_families.ordered.tsv -f ./feature_sequences/ -o core_alignment.fasta -g core_alignment.gff
The plots are a little more complicated. Open the following script in a text editor, search and replace 95 with 99, and then run it using:
Rscript plot_summary.R ./
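If you would rather not edit the script by hand, the search and replace described above could be sketched in Python along these lines. This is only an illustration: the function name `retarget_threshold` and the sample text are made up, and nothing here is taken from the real contents of plot_summary.R. It assumes the threshold appears as a standalone 95 (or 0.95), so it skips longer numbers such as 1950, and it returns the changed line numbers so you can review them before running the plots.

```python
import re

# Matches "95" only when it is not part of a longer run of digits,
# so "95" and "0.95" are changed but "1950" or "195" are left alone.
_STANDALONE_95 = re.compile(r"(?<!\d)95(?!\d)")


def retarget_threshold(script_text, new_value="99"):
    """Return (new_text, changed_line_numbers) for the given script text.

    Hypothetical helper: it performs a plain textual substitution and
    knows nothing about what each 95 means inside the script.
    """
    new_lines = []
    changed = []
    for number, line in enumerate(script_text.splitlines(keepends=True), start=1):
        updated = _STANDALONE_95.sub(new_value, line)
        if updated != line:
            changed.append(number)
        new_lines.append(updated)
    return "".join(new_lines), changed


# Made-up sample text, NOT the real script contents.
sample = "core_cutoff <- 95\nyear <- 1950\n"
new_text, changed_lines = retarget_threshold(sample)
print(changed_lines)
```

To apply it, read plot_summary.R into a string, pass it through the function, write the result to a copy of the script, check the reported lines, and run the copy with Rscript as shown above.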
Hope that helps, Sion
Thanks a lot, will test this soon!
I look forward to the next release.
With Roary, I would set -cd 100
to generate core_gene_alignment.aln for a core-genome phylogeny.
https://sanger-pathogens.github.io/Roary/
-cd FLOAT percentage of isolates a gene must be in to be core [99]
Hi Sion,
Is there a way to control the percentage of genomes that must carry a gene for it to be considered core? As I understand it, the threshold is set at 95%, but thresholds such as 99% are also common in the literature.
Thanks
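To make the question concrete, here is a minimal sketch of how the choice of threshold changes which gene families count as core. It is not PIRATE's implementation: the function name, the gene family names, and the counts are all invented, and it assumes a family is core when it is present in at least the given percentage of genomes (whether a tool uses "at least" or "more than" is an assumption here).

```python
def split_core_accessory(presence_counts, n_genomes, threshold_pct=95.0):
    """Split gene families into (core, accessory) lists.

    presence_counts maps a gene family name to the number of genomes
    that carry it. A family is treated as core when it is present in
    at least threshold_pct percent of the n_genomes genomes (assumed rule).
    """
    core, accessory = [], []
    for family, count in presence_counts.items():
        pct = 100.0 * count / n_genomes
        if pct >= threshold_pct:
            core.append(family)
        else:
            accessory.append(family)
    return core, accessory


# Invented example: 200 genomes and four gene families.
counts = {"g1": 200, "g2": 198, "g3": 191, "g4": 120}

core_95, _ = split_core_accessory(counts, 200, threshold_pct=95.0)
core_99, _ = split_core_accessory(counts, 200, threshold_pct=99.0)
print(core_95)  # g3 (95.5% of genomes) is core at 95%
print(core_99)  # g3 drops out of the core set at 99%
```

In this made-up data set, moving from 95% to 99% removes one family from the core set, which is why the same threshold would need to be used for both the alignment and the summary plots.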