ANGSD / angsd

Program for analysing NGS data.

Number of numbers output by realSFS #619

Open Erikacj opened 6 months ago

Erikacj commented 6 months ago

I am running -doSaf and realSFS to obtain the SFS to estimate Ne through time. For 14 samples I run:

~/angsd/angsd \
  -bam bam.txt \
  -nThreads 8 \
  -anc ~/ref.fasta \
  -setMinDepthInd 5 \
  -doSaf 1 \
  -minmapQ 30 \
  -minQ 20 \
  -minInd 7 \
  -doBcf 1 \
  -domajorminor 1 \
  -gl 2 \
  -dopost 1 \
  -domaf 2 \
  -out ./angsd/SFS

~/angsd/misc/realSFS SFS.saf.idx -maxIter 100 -P 4 > SFS.sfs

The output from the above gives me 29 numbers for 14 samples:

2553445.800865 103809.213969 43688.859571 28674.860202 21990.861158 17054.914529 14721.903145 12352.277563 11707.233481 10945.865687 11070.809709 10355.972034 10011.312219 9803.119318 9387.831696 5617.122029 4253.325799 2615.508075 2468.261782 1593.288319 1148.791966 1317.130954 1039.779926 1126.637310 1217.816215 1393.024173 1699.225106 2439.155779 18558.097422

I thought this should give me 28 numbers, not 29. Can someone explain why I am getting this number?

ANGSD commented 6 months ago

The SFS is indexed by the number of derived alleles. If you have n diploid samples, then you have 2n chromosomes and therefore 2n+1 categories (0 to 2n derived alleles); for your 14 samples that is 2 × 14 + 1 = 29. The first entry is the frequency of having zero derived alleles. The second entry is the frequency of having one derived allele. The last entry, which is entry 2n+1, is the frequency of having 2n derived alleles. The first and last categories are the invariable categories.
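To make the counting concrete, here is a minimal Python sketch that checks the length of the SFS posted above against the 2n+1 rule and separates the invariable end categories from the rest. The helper names (`expected_sfs_length`, `parse_sfs`) are made up for this illustration and are not part of ANGSD or realSFS; the numbers are copied from the original post.

```python
def expected_sfs_length(n_diploid):
    """Number of SFS categories for n diploid samples.

    Derived-allele counts run from 0 to 2n inclusive, hence 2n + 1 entries.
    """
    return 2 * n_diploid + 1


def parse_sfs(line):
    """Split one whitespace-separated line of realSFS output into floats."""
    return [float(x) for x in line.split()]


# The single line of output from the post (14 diploid samples).
sfs_line = (
    "2553445.800865 103809.213969 43688.859571 28674.860202 21990.861158 "
    "17054.914529 14721.903145 12352.277563 11707.233481 10945.865687 "
    "11070.809709 10355.972034 10011.312219 9803.119318 9387.831696 "
    "5617.122029 4253.325799 2615.508075 2468.261782 1593.288319 "
    "1148.791966 1317.130954 1039.779926 1126.637310 1217.816215 "
    "1393.024173 1699.225106 2439.155779 18558.097422"
)

n_samples = 14
sfs = parse_sfs(sfs_line)

# 2 * 14 + 1 = 29 entries, matching what realSFS printed.
assert len(sfs) == expected_sfs_length(n_samples)

# Entries 0 and 2n are the invariable categories; the entries in between
# correspond to 1 .. 2n-1 derived alleles.
invariable = (sfs[0], sfs[-1])
variable = sfs[1:-1]

print(len(sfs), len(variable))
```

Note that the variable part has 2n-1 = 27 entries, so neither the full spectrum nor its variable part has 28 entries.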