gtonkinhill / panaroo

An updated pipeline for pangenome investigation
MIT License
271 stars, 33 forks

different species pangenome #130

Closed by smb20200615 3 years ago

smb20200615 commented 3 years ago

Hello,

My understanding is that Panaroo must be run at the species level. I am interested in running fastGEAR, which can detect interspecies recombination (https://mostowylab.com/news/fastgear) and takes as input the gene alignments produced by tools such as Panaroo and Roary. Can Panaroo be run on a dataset containing different species? If so, how should the parameters be changed?

Many thanks,

gtonkinhill commented 3 years ago

Hi,

Yes, it is possible to run Panaroo with different species; however, its accuracy may degrade as the species become more divergent.

Panaroo relies heavily on synteny, so if two species have genes arranged in very different ways, or share very few genes, Panaroo may struggle to cluster them accurately.

When running Panaroo on very diverse datasets, it is often sensible to run it in its sensitive mode to avoid over-filtering rare genes. This can be achieved by setting --clean-mode sensitive. You may also want to reduce the initial clustering identity to something like 90% using --threshold 0.9.
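
For readers who want to script this, here is a minimal sketch of how the two options above might be assembled into a command line. The helper name `panaroo_sensitive_cmd`, the GFF file names, and the output directory are hypothetical placeholders, not part of Panaroo; only `-i`, `-o`, `--clean-mode sensitive`, and `--threshold` are Panaroo options. The function only builds the argument list and does not run anything.

```python
import shlex


def panaroo_sensitive_cmd(gff_files, out_dir, threshold=0.9):
    """Build a Panaroo argument list for a divergent, multi-species dataset.

    gff_files and out_dir are placeholders supplied by the caller.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold is a fraction, e.g. 0.9 for 90% identity")
    return [
        "panaroo",
        "-i", *gff_files,              # annotated genomes in GFF format
        "-o", out_dir,                 # output directory
        "--clean-mode", "sensitive",   # avoid over-filtering rare genes
        "--threshold", str(threshold), # initial clustering identity
    ]


# Hypothetical input files, for illustration only.
cmd = panaroo_sensitive_cmd(["species_a.gff", "species_b.gff"], "panaroo_out")
print(shlex.join(cmd))
# To execute it (requires Panaroo on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when file names contain spaces.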

smb20200615 commented 3 years ago

Thank you so much for clarifying. On another note, I am planning to switch all of my Roary pipelines to Panaroo. Do you have any recommendations for parameters when running Panaroo with Piggy? The recommended parameters I saw for Piggy were roary -e -n -i 90 -s -z. I am not sure how to translate these for Panaroo.

gtonkinhill commented 3 years ago

Hi,

I don't have a lot of experience running Piggy with Panaroo, but I am hoping to work with Harry Thorpe (the author of Piggy) on this soon.

Looking at the set of parameters you have given, I think the main option to include when running Panaroo would be

--merge_paralogs
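
As a rough illustration of how the Roary flags in the question might be translated, here is a small sketch. Only the `-s` to `--merge_paralogs` row reflects the reply above; the other rows (`-e`, `-n`, `-i`, `-z`) are assumptions that should be checked against `panaroo --help`, and whether the resulting output works with Piggy is untested here. The names `ROARY_TO_PANAROO` and `translate_roary_flags` are hypothetical.

```python
# Rough Roary -> Panaroo flag mapping (a sketch, not an official table).
ROARY_TO_PANAROO = {
    "-s": ["--merge_paralogs"],    # from the reply above: do not split paralogs
    "-e": ["-a", "core"],          # assumption: request a core gene alignment
    "-n": ["--aligner", "mafft"],  # assumption: align with MAFFT
    "-z": [],                      # assumption: no direct equivalent, dropped
}


def translate_roary_flags(roary_args):
    """Translate a list of Roary flags into candidate Panaroo flags."""
    panaroo_args = []
    args = iter(roary_args)
    for arg in args:
        if arg == "-i":
            # Roary takes a percentage; Panaroo's --threshold takes a fraction.
            # Treating the two as equivalent is an assumption.
            percent = float(next(args))
            panaroo_args += ["--threshold", str(percent / 100)]
        elif arg in ROARY_TO_PANAROO:
            panaroo_args += ROARY_TO_PANAROO[arg]
        else:
            raise ValueError(f"no mapping known for Roary flag {arg!r}")
    return panaroo_args


flags = translate_roary_flags(["-e", "-n", "-i", "90", "-s", "-z"])
print(" ".join(flags))
```

The translated flags would still need to be combined with Panaroo's input, output, and `--clean-mode` options to form a complete command.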