Open guidopuccetti opened 9 months ago
Hi. Can you use the -P option to plot the μ statistic factors? It seems that the MAF filtering is removing more than half of your data, so all factors must be affected. Best regards, Nikos A.
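To put a number on how much the MAF filter removes, the variant records in the two files can be counted directly. This is a minimal sketch, assuming uncompressed VCF files in which every header line starts with `#`; `count_variants` and `percent_removed` are hypothetical helper names, not part of RAiSD.

```shell
# Count variant records (non-header lines) in an uncompressed VCF file.
count_variants() {
    # grep exits non-zero when the count is 0, so ignore its exit status
    grep -vc '^#' "$1" || true
}

# Percentage of records present in the first file but missing from the
# second (more strictly filtered) file. Prints NA if the first file is empty.
percent_removed() {
    before=$(count_variants "$1")
    after=$(count_variants "$2")
    awk -v b="$before" -v a="$after" 'BEGIN {
        if (b == 0) { print "NA"; exit }
        printf "%.1f\n", 100 * (b - a) / b
    }'
}
```

For the files in this thread the call would be `percent_removed Pop10_before_geo_cluster_001.vcf Pop10_before_geo_cluster_005.vcf`.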
On Tue, Feb 20, 2024 at 8:24 AM Guido Puccetti @.***> wrote:
Hi!
I am running RAiSD on the same VCF with two different MAF filters:
600M Pop10_before_geo_cluster_005.vcf
1.3G Pop10_before_geo_cluster_001.vcf
These are the flags I use:
-f  Overwrites existing run files under the same run ID.
-M  Indicates the missing-data handling strategy (0: discards SNP (default)).
-y  Provides the ploidy (integer value); in my case, haploid.
-w  Provides the window size (integer value). The default value is 50 (empirically determined).
-c  Provides the slack for the SFS edges to be used for the calculation of mu_SFS. The default value is 1 (singletons and S-1 SNP class, where S is the sample size).
-s  Generates a separate report file per set.
-I  Provides the path to the input file, which can be either in ms or in VCF format.
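Since -c 1 makes mu_SFS depend on the singleton and S-1 classes, it can help to check whether those classes can pass a given MAF filter at all. The sketch below assumes the filter keeps sites with MAF greater than or equal to the threshold, that there is no missing data, and that the thresholds are given as integer percentages (for example 5 and 1, which is what the `_005` and `_001` file suffixes suggest but does not confirm). Both function names are hypothetical.

```shell
# Smallest minor-allele count that passes a MAF filter.
# $1: MAF threshold in percent (integer, e.g. 5 for 0.05)
# $2: number of sampled chromosomes S (equal to the number of samples for haploids)
min_minor_allele_count() {
    awk -v p="$1" -v s="$2" 'BEGIN {
        k = int((p * s + 99) / 100)   # integer ceiling of p*s/100
        if (k < 1) k = 1
        print k
    }'
}

# Print "yes" if the SFS edge classes used with -c 1 (derived-allele count
# 1 or S-1, i.e. minor-allele count 1) can pass the filter, otherwise "no".
edges_survive() {
    if [ "$(min_minor_allele_count "$1" "$2")" -le 1 ]; then
        echo yes
    else
        echo no
    fi
}
```

Under these assumptions, `edges_survive 5 21` prints `no`: with 21 haploid samples a singleton has a frequency of about 0.048, which is below 0.05, so both edge classes are filtered out before RAiSD sees the data.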
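#!/bin/sh
# Loop over the VCF files in the current directory (glob assumed).
for i in *.vcf
do
    # Run RAiSD once per VCF, using the file name as the run ID.
    /data/guido/RAiSD/raisd-master/bin/release/RAiSD -n "${i%.recode.vcf}_run" -f -y 1 -M 0 -w 50 -c 1 -s -I "$i"
    # Merge the 14 per-set reports, appending the set number and the dataset name.
    for n in $(seq 1 1 14)
    do
        echo "$n"
        grep -v "//" "RAiSD_Report.${i%.recode.vcf}_run.$n" | awk -v dt="${i%.recode.vcf}" -v nu="${n}" '{print $0"\t"nu"\t"dt}' >> RAiSD_Report.ALL_runs_filtered.txt
    done
done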
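The combined file written by the loop can be summarised per dataset before plotting, to check numerically how far the μ ranges of the two runs differ. This is a rough sketch, not part of RAiSD: it assumes that μ is the last field of each original report line (so the third field from the end once the set number and dataset name are appended) and that dataset names contain no whitespace. `summarize_mu` is a hypothetical name.

```shell
# For each dataset label (last field), print the label, the number of
# windows, and the minimum and maximum mu value, tab-separated.
# Assumes mu is the third field from the end of every line.
summarize_mu() {
    awk '
        NF >= 3 {
            d = $NF
            mu = $(NF - 2) + 0          # force numeric interpretation
            if (!(d in n) || mu < min[d]) min[d] = mu
            if (!(d in n) || mu > max[d]) max[d] = mu
            n[d]++
        }
        END {
            for (d in n) printf "%s\t%d\t%g\t%g\n", d, n[d], min[d], max[d]
        }
    ' "$1" | sort
}
```

Calling `summarize_mu RAiSD_Report.ALL_runs_filtered.txt` prints one line per dataset, so a gap of several orders of magnitude shows up directly in the last column.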
These are the results. When I plot the μ statistic for the two datasets, the values differ by 8 orders of magnitude (see the y-axis). I believe this is due to low-frequency alleles, but I can't work out why.
Pic_Pop10_before_geo_cluster_001.png: https://github.com/alachins/raisd/assets/57522203/820d7ef0-ebc3-42e7-b0f4-ac6a5b5ba08b
Pic_Pop10_before_geo_cluster_005.png: https://github.com/alachins/raisd/assets/57522203/e119f9c9-47cc-44ec-b01f-bd7e1d0991cd
Can you please comment on these results and on the applicability of RAiSD with the two MAF filters?
Thanks! Have a nice day G.
-- Nikolaos Alachiotis