SionBayliss / PIRATE

A toolbox for pangenome analysis and threshold evaluation.
GNU General Public License v3.0

Extract shared genes of a set of genomes compared with the others #37

Closed: igarrom closed this 4 years ago

igarrom commented 4 years ago

I have run PIRATE and the execution was fast and went fine. My question now is whether I can use any of the included scripts to extract the genes shared by a set of genomes that are absent from the rest of the genomes. Is that possible? Thank you!

SionBayliss commented 4 years ago

There are no scripts in PIRATE that explicitly compute differences or intersections between gene sets. You could take one of two simple approaches. First, identify the genes present in each of the two subsets of genomes you are interested in using PIRATE/tools/subsample_outputs, then use 'comm' in Bash to find the intersection or difference. Alternatively, if you wish to account for population structure in your samples, run PIRATE/tools/convert_format/PIRATE_to_roary on your PIRATE output and pass the resulting file to Scoary (https://github.com/AdmiralenOla/Scoary). This will give you an association between each gene and a subset of isolates, corrected for population structure.
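
For the first approach, the comparison step can also be done outside the shell. Below is a minimal Python sketch of the same three-way split that 'comm' produces. It is not part of PIRATE: the function name and the gene family IDs are made up for illustration, and it assumes you have already reduced each subset's output to a plain list of gene family identifiers.

```python
def compare_gene_sets(genes_a, genes_b):
    """Split two collections of gene family IDs into three groups,
    mirroring the three columns printed by `comm`."""
    a, b = set(genes_a), set(genes_b)
    return {
        "only_a": sorted(a - b),   # present in subset A, absent from subset B
        "only_b": sorted(b - a),   # present in subset B, absent from subset A
        "shared": sorted(a & b),   # present in both subsets
    }


# Hypothetical gene family IDs; in practice these would be read from
# the per-subset gene lists you prepared beforehand.
target_genomes = ["g0001", "g0002", "g0003", "g0007"]
other_genomes = ["g0002", "g0005", "g0007"]

result = compare_gene_sets(target_genomes, other_genomes)
for group, genes in result.items():
    print(group, ",".join(genes), sep="\t")
```

Here "only_a" holds the genes found in the genomes of interest but not in the remaining genomes, which is the set asked about in the question. Unlike 'comm', this does not require the inputs to be sorted first.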

igarrom commented 4 years ago

Thank you very much! I will try both approaches!