Open rpetit3 opened 3 years ago
Dear @rpetit3. Here are additional suggestions:
Dear @rpetit3,
I am wondering about using Scoary (https://github.com/AdmiralenOla/Scoary) with Bactopia, and I look forward to seeing it as a Bactopia tool.
Thanks, Sandeep
Dear @rpetit3:
Can Bactopia run MAFFT with varying options? https://mafft.cbrc.jp/alignment/software/manual/manual.html
Can Bactopia run PIRATE using exactly the same options as PIRATE itself? For example:
using -z instead of --keep_all_files
using --pan-opt "--cd-low 98 --hsp-len 0 --flat 2" instead of --cd_low 98 --hsp_len 0 --mcl_inflation 2
https://github.com/SionBayliss/PIRATE#usage
--pan-opt additional arguments to pass to pangenome_contruction
-z retain intermediate files [0 = none, 1 = retain pangenome
files (default - re-run using --pan-off), 2 = all]
https://bactopia.github.io/bactopia-tools/pirate/
--keep_all_files Retain all intermediate files
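The option pairs listed above can be written down as a small lookup. The following is a minimal Python sketch, not part of Bactopia or PIRATE: the names `PAN_OPT_NAMES` and `pirate_args` are hypothetical, and the pairings (including `--keep_all_files` to `-z 2`) are assumptions taken only from the flags quoted in this comment. Input and output arguments are omitted.

```python
import shlex

# Bactopia-style option name -> PIRATE pangenome construction flag.
# ASSUMPTION: pairings are copied from the comment above, not from either tool's source.
PAN_OPT_NAMES = {
    "cd_low": "--cd-low",
    "hsp_len": "--hsp-len",
    "mcl_inflation": "--flat",
}


def pirate_args(options, keep_all_files=False):
    """Translate Bactopia-style options into PIRATE-style arguments (hypothetical helper)."""
    pan_opt = []
    for name, value in options.items():
        if name not in PAN_OPT_NAMES:
            raise KeyError(f"no known PIRATE equivalent for {name!r}")
        pan_opt += [PAN_OPT_NAMES[name], str(value)]

    args = []
    if keep_all_files:
        # ASSUMPTION: "retain all intermediate files" corresponds to "-z 2" (2 = all).
        args += ["-z", "2"]
    if pan_opt:
        # PIRATE receives the pangenome options as one quoted string.
        args += ["--pan-opt", " ".join(pan_opt)]
    return args


if __name__ == "__main__":
    cmd = ["PIRATE"] + pirate_args(
        {"cd_low": 98, "hsp_len": 0, "mcl_inflation": 2}, keep_all_files=True
    )
    print(shlex.join(cmd))
```

Keeping the mapping in a single dictionary makes it easy to see which Bactopia-style names have no PIRATE counterpart yet, since those raise an error instead of being dropped silently.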
By the way, this link (https://doi.org/10.1128/mSystems.00190-20) redirected to this page (https://journals.asm.org/journal/msystems).
Hi @haruosuz
Apologies for the delay in responding, I'm going to look into this. I'll also get the link fixed, thank you for pointing it out!
Robert
Hi @haruosuz, I've corrected the syntax to match PIRATE's (thank you for pointing that out!). It will be in the next version of Bactopia (v1.7.1).
@kusandeep - I'm going to work on adding Scoary, thank you for suggesting it!
Dear @rpetit3:
Can Bactopia generate a core genome phylogeny using FastTree as well as IQ-TREE? Could a tree (.nwk) be generated by FastTree from core_alignment.fasta.gz as well as from binary_presence_absence.fasta.gz?
Supplying the IQ-TREE core genome phylogeny to Scoary with --newicktree bactopia-tools/pirate/core-genome/iqtree/core-genome.treefile
produced the following error:
ete3.parser.newick.NewickError: Unexpected newick format '100/100:0.0616018898'
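The label in the error, `100/100`, looks like the combined support value that IQ-TREE writes on internal nodes when two support measures are requested together (for example SH-aLRT and ultrafast bootstrap), which would explain why the parser rejects it. Assuming that is the cause, one possible workaround is to reduce each combined label to a single number before passing the tree to `--newicktree`. The sketch below is illustrative only; `simplify_support` is a hypothetical helper, not part of Scoary, ete3, or Bactopia.

```python
import re

# Matches an internal-node label of the form "a/b" that directly follows ")".
# ASSUMPTION: combined support labels are two plain numbers separated by "/".
_COMBINED_SUPPORT = re.compile(r"\)(\d+(?:\.\d+)?)/(\d+(?:\.\d+)?)(?=[:,;)])")


def simplify_support(newick, keep=2):
    """Replace each 'a/b' node label with its first (keep=1) or second (keep=2) value."""
    if keep not in (1, 2):
        raise ValueError("keep must be 1 or 2")
    return _COMBINED_SUPPORT.sub(lambda m: ")" + m.group(keep), newick)


if __name__ == "__main__":
    tree = "((A:0.1,B:0.2)100/100:0.0616018898,C:0.3);"
    print(simplify_support(tree))
```

Which value to keep depends on the order in which the two measures were written to the tree file; `keep=2` retains the second one. Leaf names and branch lengths are left untouched because the pattern only matches labels that follow a closing parenthesis.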
I can totally add support for FastTree in the next version.
Dear @rpetit3:
Can Bactopia use pan- and core-genome analysis tools such as GET_HOMOLOGUES/GET_PHYLOMARKERS (https://pubmed.ncbi.nlm.nih.gov/29765358/) in addition to Roary/PIRATE?
I vote yes! At the moment I'm working on https://github.com/bactopia/bactopia/tree/dsl2, which will be the basis of v2, and once that's available it'll make adding these tools and suggestions much easier.
Dear @rpetit3:
I wonder if Bactopia can modify Prokka annotations (gene and product names) and/or PIRATE annotations (in bactopia-tools/pirate/core-genome/pirate/PIRATE.gene_families.tsv)? For example, could annotations from databases (e.g. MEGARes, VFDB, eggNOG) be appended to (or substituted for) the Prokka/PIRATE annotations in Bactopia?
I think that's a great idea, and something we can plan for v2 (maybe not in the initial v2 release, but something to add later).
I noticed that the documentation "pirate - Bactopia" (https://bactopia.github.io/bactopia-tools/pirate/) has changed a lot. Is there an archive of the documentation for the Bactopia version 1.X.X?
I think Material for MkDocs added versioned docs. I'll see what I can do about this.
Suggestions provided by @haruosuz
References:
https://github.com/SionBayliss/PIRATE#input-format Input format PIRATE accepts GFF3 annotation files containing matching nucleotide sequence at the end of the file.
https://sanger-pathogens.github.io/Roary/ Input files Roary takes GFF3 files as input. They must contain the nucleotide sequence at the end of the file.
http://www.iqtree.org/doc/Frequently-Asked-Questions#how-does-iq-tree-treat-gapmissingambiguous-characters How does IQ-TREE treat gap/missing/ambiguous characters? Gaps (-) and missing characters (? or N for DNA alignments) are treated in the same way as unknown characters, which represent no information.
https://evolution.genetics.washington.edu/phylip/doc/consense.html Consense -- Consensus tree program
https://github.com/harry-thorpe/piggy Piggy is a tool for analysing the intergenic component of bacterial genomes. It is designed to be used in conjunction with Roary (https://github.com/sanger-pathogens/Roary).
The output folder produced by Roary is required as an input to Piggy (specified by --roary_dir).
https://github.com/AdmiralenOla/Scoary Scoary is designed to take the gene_presence_absence.csv file from Roary
LS-BSR input You can also use as input the pan-genome as called from Jason Sahl's program LS-BSR (Large-Scale Blast Score Ratio).