Simple summary of genomes with positive selection

mchaisso commented 2 years ago

I'm setting this to run in a genome-wide scan with 7 species and 13,000 genes (13k separate runs). Fairly naive question: how do I get a simple report of the genomes where positive selection is detected?

hoelzer commented 2 years ago

Hi @mchaisso, sorry, this is not what PoSeiDon was designed for. The pipeline takes a multiple sequence alignment (MSA) of homologous gene sequences, checks it for positive selection, and visualizes the result. But it sounds to me like you want to run the pipeline 13k times, once per gene, on MSAs built from these 7 species?

I would say that once you have managed that, you could write a script that extracts the sites under significant positive selection from the 13k result directories and then projects them onto some visualization of your genomes. That would also need some coordinate calculations, to map a site under positive selection in a given gene back to the correct position in the genome.
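The coordinate calculation could look roughly like the sketch below. It is not part of PoSeiDon. The function names are hypothetical, and it assumes a codon alignment in which gaps come in whole codons, 1-based codon sites, and CDS exon coordinates taken from your own annotation as 1-based inclusive (start, end) pairs.

```python
def aligned_to_ungapped_codon(aligned_seq, codon_column):
    """Convert a 1-based codon column of the alignment into the 1-based
    codon site of one species' ungapped sequence.

    Returns None if that species has a gap at the column.
    Assumes gaps ('-') always cover whole codons.
    """
    start = 3 * (codon_column - 1)
    codon = aligned_seq[start:start + 3]
    if len(codon) < 3 or "-" in codon:
        return None
    gaps_before = aligned_seq[:start].count("-")
    return (start - gaps_before) // 3 + 1


def codon_site_to_genome(site, exons, strand):
    """Map a 1-based codon site of a CDS to the genomic positions of its
    three bases.

    exons  -- list of (start, end) CDS intervals, 1-based inclusive
    strand -- '+' or '-'
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # Walk the exons in transcription order.
    ordered = sorted(exons, reverse=(strand == "-"))
    positions = []
    for offset in range(3 * (site - 1), 3 * site):  # 0-based CDS offsets
        remaining = offset
        for start, end in ordered:
            length = end - start + 1
            if remaining < length:
                if strand == "+":
                    positions.append(start + remaining)
                else:
                    positions.append(end - remaining)
                break
            remaining -= length
        else:
            raise ValueError("site lies beyond the end of the CDS")
    return positions


# Toy gene with two CDS exons (made-up coordinates).
exons = [(101, 106), (201, 209)]
plus_site3 = codon_site_to_genome(3, exons, "+")
minus_site1 = codon_site_to_genome(1, exons, "-")
ungapped = aligned_to_ungapped_codon("ATG---AAATTT", 3)
```

A codon can span an exon boundary, which is why the function returns three separate positions rather than a single interval.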

I will close this, because that's not really the purpose of the pipeline. But please feel free to write if there are further questions.

mchaisso commented 2 years ago

Thanks. I'm new to software for positive selection and was looking for a canned analysis. By genome-wide I mean all genes in a genome, rather than positions, so I'm making 13k multi-FASTA files of orthologous genes. If there is an easier way to do this, I'd be happy to give it a try. I will try to get this pipeline running anyway, since the visualizations are nice.

hoelzer commented 2 years ago

Yes, I understand. So what you want to achieve is, given reference genomes with annotated genes, to figure out which of these genes are under higher (positive) selective pressure. Basically, this could be a first step before the "per-base" analyses that can be performed with PoSeiDon. We have just started to look into that topic, but right now I unfortunately cannot recommend other tools that would be good for the "genome-level" task.

rnajena / poseidon

Simple summary of genomes with positive selection #32