ANGSD / angsd

Program for analysing NGS data.

AD Not Being Written on .BCF. #408

Open g-pacheco opened 3 years ago

g-pacheco commented 3 years ago

Hello,

I am sorry -- as usual, I am not quite sure if this is really an issue. However, I have been trying to make ANGSD--v935 write the AD field to the .bcf file, but I cannot get it to do so.

Here is my command (I have tried different variations):

angsd -nThreads 2 -ref GCA_007896545.1_ASM789654v1_genomic.Edited.fasta -bam ES-Article--AllSamples_NoKbraNoKgra_SITES_TEST.BAMlist -remove_bads 1 -uniqueOnly 1 -baq 1 -C 50 -minMapQ 30 -minQ 20 -minInd $((3*95/100)) -GL 1 -doPost 1 -doMajorMinor 1 -doMaf 1 -doBcf 1 --ignore-RG 0 -doGeno 1 -doCounts 1 -doGlf 2 -MinMaf 0.04 -SNP_pval 1e-6 -doPlink 2 -geno_minDepth 3 -setMaxDepth $((3*600)) -dumpCounts 2 -postCutoff 0.95 -out ES-Article--AllSamples_NoKbraNoKgra_WithWGSs_SNPs_TEST
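As a side note on the two `$((...))` expressions in the command: the shell evaluates them with integer arithmetic, so it may help to see the values ANGSD actually receives. This is only a small sketch. The sample count of 3 is read off the expressions themselves and is an assumption about the BAM list, and the variable names are made up for illustration.

```python
# Sample count inferred from the "3*" factor in the command (an assumption).
n_samples = 3

# Shell $(( )) arithmetic is integer-only; for positive operands the result
# matches Python's floor division.
min_ind = n_samples * 95 // 100   # corresponds to -minInd $((3*95/100))
max_depth = n_samples * 600       # corresponds to -setMaxDepth $((3*600))

print(f"-minInd {min_ind}")
print(f"-setMaxDepth {max_depth}")
```

Under that assumption, `-minInd` ends up as 2 rather than 2.85, because the fractional part is discarded.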

The .bcf is written, but only with the following fields: GT:DP:GL:PL:GP. From what I can see here, AD should have been written, since I have set -doCounts to a non-zero value.

Please let me know if I am missing something, and I apologise in advance in case this is a false alarm.

Best regards, George.

Jungal10 commented 2 years ago

Just following up: did you find any solution for this issue?

g-pacheco commented 2 years ago

No, I am afraid I never found a solution for this issue. I hope you will be luckier.

ANGSD commented 2 years ago

Hi, sorry for my lack of response on all these issues. I can certainly add AD, but I was wondering whether there is any difference between AD and DP. Or, phrased differently: what would you expect DP and AD to represent? In the code, DP is the per-sample read depth after filtering.

g-pacheco commented 2 years ago

Hello @ANGSD,

No worries about the delay. I think I found a way around it back then. However, I think it would be a worthwhile addition, because these numbers are different (if I understand correctly).

DP: Read depth.
AD: Allelic depths for the major and minor alleles, in the order listed.

For instance [created with ANGSD--v0.930]:

GT:DP:AD:GP:GL 0/1:21:10,11:0.000000,1.000000,0.000000:-9.315670,0.000000,-13.883410

It would be helpful to have these numbers [21 / 10, 11] independently. What do you think?
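To make the difference concrete, here is a minimal sketch that splits the record above into its fields. The function name `parse_sample` is made up for illustration and is not part of ANGSD or any VCF library; it just pairs the FORMAT keys with the sample values.

```python
def parse_sample(format_field: str, sample_field: str) -> dict:
    """Pair each colon-separated FORMAT key with the matching sample value."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    return dict(zip(keys, values))


# FORMAT string and sample column copied from the v0.930 example above.
record = parse_sample(
    "GT:DP:AD:GP:GL",
    "0/1:21:10,11:0.000000,1.000000,0.000000:-9.315670,0.000000,-13.883410",
)

depth = int(record["DP"])                                    # total read depth
allelic_depths = [int(x) for x in record["AD"].split(",")]   # per-allele depths

print(depth, allelic_depths)
```

In this particular record the two allelic depths add up to DP, but AD also shows how the reads are split between the two alleles, which DP alone does not.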

Thanks again, George.

ma-diroma commented 2 years ago

Hi,

I have a similar issue. The header of my BCF output file contains many lines describing FORMAT fields. I am interested in retrieving DPR (number of high-quality bases observed for each allele), but it is not reported in the BCF records (only GT, DP, GL, PL and GP are reported). How can I also add DPR?
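One way to see the mismatch described here is to compare the FORMAT IDs declared in the header with those actually used in the records. The sketch below works on VCF text lines (for example after converting the BCF to text). The helper names and the example lines are made up for illustration and are not real ANGSD output.

```python
import re


def declared_format_ids(header_lines):
    """Collect the IDs of all ##FORMAT header lines."""
    pattern = re.compile(r"^##FORMAT=<ID=([^,>]+)")
    return {m.group(1) for line in header_lines if (m := pattern.match(line))}


def used_format_ids(record_lines):
    """Collect the keys found in the FORMAT column (9th column) of the records."""
    used = set()
    for line in record_lines:
        columns = line.rstrip("\n").split("\t")
        if len(columns) > 8:
            used.update(columns[8].split(":"))
    return used


# Hypothetical header: only the IDs matter for this check.
header = [f"##FORMAT=<ID={fid},Number=.,Type=String,Description=\"placeholder\">"
          for fid in ("GT", "DP", "AD", "DPR", "GL", "PL", "GP")]

# Hypothetical record using the FORMAT string reported in this thread.
records = ["\t".join(["chr1", "100", ".", "A", "G", ".", ".", ".",
                      "GT:DP:GL:PL:GP", "0/1:21:.:.:."])]

missing = declared_format_ids(header) - used_format_ids(records)
print(sorted(missing))
```

Any ID printed here is declared in the header but never written for a sample, which matches what is reported above for AD and DPR.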

Thank you. Best wishes, Maria Angela