Closed ptranvan closed 3 years ago
Could be that a lot of reads are getting filtered - DP
is the filtered read depth used for calling. You can check the fraction of filtered reads per variant site with the FRF
annotation (add --annotations FRF
to your command). You can also check the number of unfiltered reads assigned to a haplotype with the ADP
annotation, or get per-allele depths with AD
.
@dancooke thanks for your fast answer.
1) What are the criteria you used to filter reads ?
2) Can I combined the annotations in one command ? eg. --annotations FRF ADP AD
3) Do I have to add this parameter from the start and after the population calling ? ex:
octopus --threads 1 -R ref.fasta -I sample1.sorted.bam -o sample1.bcf --annotations FRF ADP AD
octopus --threads 1 -R ref.fasta -I sample2.sorted.bam -o sample2.bcf --annotations FRF ADP AD
octopus -R ref.fasta -I sample1.sorted.bam sample2.sorted.bam --disable-denovo-variant-discovery -c sample1.bcf sample2.bcf -o octopus.joint.bcf --annotations FRF ADP AD
What are the criteria you used to filter reads ?
By default (for v0.7.4) reads are filtered if:
You can see the full list of read pre-processing options here.
Can I combined the annotations in one command ? eg. --annotations FRF ADP AD
Yes.
Do I have to add this parameter from the start and after the population calling ?
You will need it in the final command. You can also put it in the single-sample commands but it isn't required and will make the intermediate files bigger.
Oh now I understand
By default (for v0.7.4) reads are filtered if:
A duplicate (marked or as determined using Octopus' built-in algorithm).
In fact I an working with single RADseq, so by definition it's full of duplicates and I didn't used MarkDuplicates after BWA because it's useless. I guess I can stay with the octopus default parameters then or does it affect something for the variant calling ?
I'm not too familiar with RADseq - generally duplicates introduce false positives due to correlated errors. For amplicon sequencing you allow duplicates otherwise you loose most of your coverage, but then you need more aggressive filtering to remove the amplification artefacts. If you'd like to retain duplicates then add the --allow-octopus-duplicates
option.
Hi @dancooke
So I have tried the --allow-octopus-duplicates
option but I don't understand why I have a genotype called for some individual even if the DP=0 ?
GT:GQ:DP:MQ:PS:PQ:AD:ADP:FRF:FT
1|1:24:0:0:1219012:75:.,0:0:0:AD,DP,ADP
Hi, thanks for your software.
I have multiples samples and I did a population calling. Example of commands for 2 samples:
octopus -R ref.fasta -I sample1.sorted.bam sample2.sorted.bam --disable-denovo-variant-discovery -c sample1.bcf sample2.bcf -o octopus.joint.bcf
The issue is that all the DP for each samples are very low (between 0 to 2) which is not the case with freebayes (don't pay attention to the genotype and number of samples -> this is not the same order and these are subsample to illustrate)
Example Otopus result:
Example Freebayes result
scf3 82264 . G A 1492.88 . AB=0.486111;ABP=3.37221;AC=34;AF=0.22973;AN=148;AO=111;CIGAR=1X;DP=526;DPB=526;DPRA=1.07105;EPP=244.044;EPPR=904.171;GTI=1;LEN=1;MEANALT=1;MQM=60;MQMR=60;NS=74;NUMALT=1;ODDS=0.356749;PAIRED=0;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=3983;QR=15177;RO=415;RPL=0;RPP=244.044;RPPR=904.171;RPR=111;RUN=1;SAF=0;SAP=244.044;SAR=111;SRF=0;SRP=904.171;SRR=415;TYPE=snp GT:DP:AD:RO:QR:AO:QA:GL 0/0:17:17,0:17:601:0:0:0,-5.11751,-54.416 0/1:10:6,4:6:222:4:148:-10.6717,0,-17.3278 0/0:15:15,0:15:545:0:0:0,-4.51545,-49.3856
Am I doing something wrong ?
Thanks fro your help.