alachins / raisd

RAiSD: software to detect positive selection based on multiple signatures of a selective sweep and SNP vectors
Same input vcf different maf #48

Open guidopuccetti opened 6 months ago

guidopuccetti commented 6 months ago

Hi!

I am running a RAiSD analysis on the same VCF with two different MAF filters:

600M Pop10_before_geo_cluster_005.vcf
1.3G Pop10_before_geo_cluster_001.vcf

These are the flags I use:

-f  Overwrites existing run files under the same run ID.
-M  Indicates the missing-data handling strategy (0: discards SNP (default)).
-y  Provides the ploidy (integer value), in my case haploid.
-w  Provides the window size (integer value). The default value is 50 (empirically determined).
-c  Provides the slack for the SFS edges to be used for the calculation of mu_SFS. The default value is 1 (singletons and S-1 snp class, where S is the sample size).
-s  Generates a separate report file per set.
-I  Provides the path to the input file, which can be either in ms or in vcf format.

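#!/bin/sh
# Run RAiSD on every VCF in the current directory, then merge the per-set reports.
for i in *vcf*
do
    run="${i%.recode.vcf}"   # strips a ".recode.vcf" suffix if present
    /data/guido/RAiSD/raisd-master/bin/release/RAiSD -n "${run}_run" -f -y 1 -M 0 -w 50 -c 1 -s -I "$i"
    for n in $(seq 1 1 14)
    do
        echo "$n"
        # Drop the "//" lines and tag each row with the set number and the dataset name.
        grep -v "//" "RAiSD_Report.${run}_run.$n" | awk -v dt="$run" -v nu="$n" '{print $0"\t"nu"\t"dt}' >> RAiSD_Report.ALL_runs_filtered.txt
    done
done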
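
To put a number on the gap between the two runs before plotting, the merged file can be summarized per dataset. The following is only a sketch: it assumes that μ is the 7th whitespace-separated column of each report row (please check this against your own report files and pass a different column index if needed), and it relies on the dataset name being the last column, as appended by the awk call in the loop. The function name summarize_mu is made up for this example.

```shell
#!/bin/sh
# Print, for each dataset tag, the number of rows and the largest mu value.
# Assumption: mu sits in column 7 of the report rows; override with a 2nd argument.
# The dataset tag is taken from the last column of the merged file.
summarize_mu() {
    report=$1
    mu_col=${2:-7}
    awk -v c="$mu_col" '
        NF >= c {
            d = $NF
            # Track the maximum of the assumed mu column per dataset.
            if (!(d in n) || $c + 0 > max[d]) max[d] = $c + 0
            n[d]++
        }
        END {
            for (d in n) printf "%s\t%d\t%g\n", d, n[d], max[d]
        }
    ' "$report" | sort
}
```

It would be called as summarize_mu RAiSD_Report.ALL_runs_filtered.txt, and prints one line per dataset with the row count and the maximum of the chosen column.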

These are the results. When I plot the μ statistic for the two runs, the values differ by 8 orders of magnitude (see the y-axis). I believe this is due to low-frequency alleles, but I can't wrap my head around why.

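Pic_Pop10_before_geo_cluster_001.png: https://github.com/alachins/raisd/assets/57522203/820d7ef0-ebc3-42e7-b0f4-ac6a5b5ba08b
Pic_Pop10_before_geo_cluster_005.png: https://github.com/alachins/raisd/assets/57522203/e119f9c9-47cc-44ec-b01f-bd7e1d0991cd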

Can you please comment on these results and on the applicability of RAiSD with the two MAF filters?

Thanks! Have a nice day G.

alachins commented 6 months ago

Hi. Can you use the -P option to plot the μ statistic factors? It seems that the maf filtering is filtering out more than half your data so all factors must be affected. Best regards, Nikos A.
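
A quick way to check how much data the stricter filter removes is to count the variant records in both files. This is a minimal sketch, not part of RAiSD: the function names count_snps and fraction_removed are made up here, it assumes uncompressed VCF files, and it counts every non-header line (so the numbers are taken before RAiSD applies its own missing-data handling).

```shell
#!/bin/sh
# Count variant records in an uncompressed VCF (every line not starting with '#').
count_snps() {
    grep -vc '^#' "$1"
}

# Print the percentage of records lost when going from the lenient
# filter (1st file) to the strict filter (2nd file).
fraction_removed() {
    lenient=$(count_snps "$1")
    strict=$(count_snps "$2")
    awk -v a="$lenient" -v b="$strict" \
        'BEGIN { if (a > 0) printf "%.1f\n", 100 * (a - b) / a }'
}
```

For the two files in this thread the call would be fraction_removed Pop10_before_geo_cluster_001.vcf Pop10_before_geo_cluster_005.vcf.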

-- Nikolaos Alachiotis