ANGSD / angsd

Program for analysing NGS data.
228 stars 50 forks source link

Problem with -doThetas #558

Open MehrabChowdhury1 opened 1 year ago

MehrabChowdhury1 commented 1 year ago

Hi!

I'd like to calculate nucleotide diversity for different population. I install ANGSD by follow the following the instructions:

git clone https://github.com/samtools/htslib.git git clone https://github.com/ANGSD/angsd.git cd htslib;make;cd ../angsd ;make HTSSRC=../htslib;cd ..

For Nucleotide diversity I ran the following codes – My codes - for_ POP in LWK TSI do

     echo $POP
     $ANGSD/a_ngsd -P 4 -b $POP.bamlist -ref $REF -anc $ANC -out Results/$POP -GL 1 -doSaf 1 -minMapQ 20 -minQ 20 - 
     minInd 1 -setMinDepth 20 -setMaxDepth 200 -doCounts 1 -r tig00010453/
    -uniqueOnly 1 -remove_bads 1 -only_proper_pairs 1 -trim 0 -C 50 -baq 1 \
     -doThetas 1 -pest Results/$POP.sfs &> /dev/null

      Done

Then I got the following comments -

-> angsd version: 0.941-6-g67b6b -> angsd version: 0.941-6-g67b6b3b (htslib: 1.16-41-g826ceea) build(Jan -> /home/mbxmc7/Software/angsd/angsd -P 4 -b LWK.bamlist -ref /home/mbxmout Results/LWK -GL 1 -doSaf 1 -minMapQ 20 -minQ 20 -minInd 1 -setMinDepth 20 -ss 1 -trim 0 -C 50 -baq 1 -doThetas 1 -pest Results/LWK.sfs -> /home/mbxmc7/Software/angsd/angsd -P 4 -b TSI.bamlist -ref /home/mbxmout Results/TSI -GL 1 -doSaf 1 -minMapQ 20 -minQ 20 -minInd 1 -setMinDepth 20 -ss 1 -trim 0 -C 50 -baq 1 -doThetas 1 -pest Results/TSI.sfs -> Inputtype is BAM/CRAM -> Inputtype is BAM/CRAM [multiReader] 2 samples in 2 input files -> Reading fasta: /home/mbxmc7/reference/C981_assembly_v2.fasta [multiReader] 2 samples in 2 input files -> Reading fasta: /home/mbxmc7/reference/C981_assembly_v2.fasta -> Reading fasta: /home/mbxmc7/reference/C981_assembly_v2.fasta -> Reading fasta: /home/mbxmc7/reference/C981_assembly_v2.fasta 41 argument - ``

doThetas is unknown will exit 41 argument -doThetas is unknown will exit

Then I try to find out where is the problem. And when I ran -

$ ANGSD/angsd -doSaf -> angsd version: 0.935 (htslib: 1.12) build(Mar 28 2021 01:00:43) -> angsd -doSaf -> Analysis helpbox/synopsis information: -> Sat Feb 18 20:15:52 2023

abcSaf.cpp: -doSaf 0 1: perform multisample GL estimation 2: use an inbreeding version 3: calculate genotype probabilities (use -doPost 3 instead) 4: Assume genotype posteriors as input (still beta) -underFlowProtect 0 -anc (null) (ancestral fasta) -noTrans 0 (remove transitions) -pest (null) (prior SFS) -isHap 0 (is haploid beta!) -doPost 0 (doPost 3,used for accesing saf based variables) NB: If -pest is supplied in addition to -doSaf then the output will then be posterior probability of the sample allele frequency for each site.

It seems that “-doThetas” option is not available in my ANGSD package. According to the tutorial it should be like that-

$ANGSD/angsd -doSaf ... -doSaf 0 1: perform multisample GL estimation 2: use an inbreeding version 3: calculate genotype probabilities 4: Assume genotype posteriors as input (still beta) -doThetas 0 (calculate thetas) -underFlowProtect 0 -fold 0 (deprecated) -anc (null) (ancestral fasta) -noTrans 0 (remove transitions) -pest (null) (prior SFS) -isHap 0 (is haploid beta!) NB: If -pest is supplied in addition to -doSaf then the output will then be posterior probability of the sample allelefrequency for each site

To solve this problem, I remove the package and then again install ANGSD. But every time I get the same results. Could you please help me to solve this problem. or please tell me how can I install -doThetas in my ANGSD?

TeresaPegan commented 1 year ago

There is a new way to calculate thetas. See here: #336 Are you using the most up to date version of the program? The ANGSD documentation has a lot of problems, unfortunately, including sections that are out of date and typos that mean things are explained wrong. I would recommend searching this issues page whenever trying to code something in ANGSD, I have learned a lot about how to actually use the program by doing so. (I am a user, not a developer). Good luck!

MehrabChowdhury1 commented 1 year ago

Hi Terasa

Thanks a lot for the information. I am struggling with this issue for a long time. Issue 336 is very helpful. Hope now I can get the proper codes and calculate nucleotide diversity for my data. Thanks again

Mehrab

MehrabChowdhury1 commented 1 year ago

Hi Terasa

Now I am little bit confuse.. which codes I should used for Nucleotide diversity? Could you please help me ? And I am using angsd version: 0.935.