alexcritschristoph / soil_popgen

Reproducible scripts and notebooks for 2019 paper on population genetics in metagenomes
GNU General Public License v3.0

matrix with frequency of major/minor allele #4

Closed palomo11 closed 4 years ago

palomo11 commented 5 years ago

Hi,

Is there any way to get the frequency of the minor (or major) allele for the core SNVs of a specific species across several samples, based on the per-sample profiling? For example, the output for "MAG1" would look something like this:

        Sample1    Sample2    Sample3
SNV1    0.0        0.2        0.6
SNV2    0.1        1.0        0.0
...
SNVn    0.25       0.3        0.75

Analyses such as hierarchical clustering or PCA could then be run on this matrix to detect patterns of (sub)species similarity across the samples.

Thanks in advance!

alexcritschristoph commented 4 years ago

Apologies for missing this question. This would be nice, and it is possible with some Python or R wrangling of the individual per-sample files. However, I don't have a script that does it automatically. We will try to include something like that when we finish the functional program based on this repository this year.
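
For anyone who needs this in the meantime, here is a minimal sketch of that kind of wrangling in plain Python. It is not a script from this repository. It assumes each sample has a tab-separated SNV table with the columns `genome`, `scaffold`, `position` and `minor_freq`; those column names, the function names and the in-memory demo tables are all hypothetical, so adjust them to whatever your per-sample files actually contain.

```python
import csv
import io


def read_minor_freqs(handle, genome):
    """Read one sample's SNV table and return {(scaffold, position): minor_freq}.

    Assumed (hypothetical) columns: genome, scaffold, position, minor_freq.
    """
    freqs = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["genome"] != genome:
            continue  # keep only SNVs from the genome of interest
        key = (row["scaffold"], int(row["position"]))
        freqs[key] = float(row["minor_freq"])
    return freqs


def build_matrix(per_sample, missing=None):
    """Combine per-sample dicts into an SNV x sample matrix.

    per_sample: {sample_name: {(scaffold, position): minor_freq}}
    Returns (snvs, samples, rows), where rows[i][j] is the frequency of
    snvs[i] in samples[j], or `missing` if that SNV was not reported there.
    """
    samples = sorted(per_sample)
    snvs = sorted(set().union(*(f.keys() for f in per_sample.values())))
    rows = [[per_sample[s].get(snv, missing) for s in samples] for snv in snvs]
    return snvs, samples, rows


# Tiny in-memory stand-ins for two per-sample files (made-up values).
sample1_tsv = (
    "genome\tscaffold\tposition\tminor_freq\n"
    "MAG1\tscaf1\t10\t0.0\n"
    "MAG1\tscaf1\t250\t0.1\n"
    "MAG2\tscaf9\t5\t0.4\n"
)
sample2_tsv = (
    "genome\tscaffold\tposition\tminor_freq\n"
    "MAG1\tscaf1\t10\t0.2\n"
)

per_sample = {
    "Sample1": read_minor_freqs(io.StringIO(sample1_tsv), "MAG1"),
    "Sample2": read_minor_freqs(io.StringIO(sample2_tsv), "MAG1"),
}
snvs, samples, rows = build_matrix(per_sample)
```

With real data you would pass `open(path, newline="")` instead of `io.StringIO(...)`. Note that an SNV missing from a sample is left as `None` rather than `0.0`: an absent row might mean the site is fixed for the reference allele, or simply that coverage was too low, and which one applies depends on how the per-sample tables were produced. If you only want SNVs reported in every sample, drop any row containing `None` before clustering or PCA.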