Open bopohdr opened 5 years ago
Doesn't GLnexus already support this? You can do a DP filtering by specifying the genotyper_config::required_dp
(https://github.com/dnanexus-rnd/GLnexus/blob/master/include/types.h#L702)
Doesn't GLnexus already support this? You can do a DP filtering by specifying the
genotyper_config::required_dp
(https://github.com/dnanexus-rnd/GLnexus/blob/master/include/types.h#L702)
I want to use online version deposited at DNAnexus and to my knowledge there is no such option to change the DP available
My apologies for missing this and the earlier question. Yes, currently the DNAnexus applet lets you choose from among the preset configurations, but lacks the capability to take in a custom YML configuration file as glnexus_cli
can from the command-line. This is easy to add and I'll do so shortly.
@bopohdr I have added the necessary dxapplet input in v1.1.5 which you can find in the GLnexus_Getting_Started project.
Here's an example yml file with a version of the DeepVariant configuration plus genotyper_config.required_dp = 1. This will cause GLnexus to no-call cells with zero supporting reads even if the input gVCF says 0/0.
Upload that yml to your project and then you can run it like so
dx run GLnexus -i config_yml=DeepVariant_DP1.yml -i bed_ranges_to_genotype=hg19_chr21.bed -i output_name=dv_platinum6_chr21 -i gvcf=dv_platinum6_chr21_gvcf/NA12877.chr21.gvcf.gz -i gvcf=dv_platinum6_chr21_gvcf/NA12878.chr21.gvcf.gz -i gvcf=dv_platinum6_chr21_gvcf/NA12891.chr21.gvcf.gz -i gvcf=dv_platinum6_chr21_gvcf/NA12890.chr21.gvcf.gz -i gvcf=dv_platinum6_chr21_gvcf/NA12889.chr21.gvcf.gz -i gvcf=dv_platinum6_chr21_gvcf/NA12892.chr21.gvcf.gz -y --watch
GLnexus echoes the active configuration YAML into the job log so you can double-check the required_dp setting.
HTH & I'm sorry again for missing the inquiry before -- too many gmail accounts these days :(
@bopohdr I have added the necessary dxapplet input in v1.1.5 which you can find in the GLnexus_Getting_Started project.
Here's an example yml file with a version of the DeepVariant configuration plus genotyper_config.required_dp = 1. This will cause GLnexus to no-call cells with zero supporting reads even if the input gVCF says 0/0.
Upload that yml to your project and then you can run it like so
dx run GLnexus -i config_yml=DeepVariant_DP1.yml -i bed_ranges_to_genotype=hg19_chr21.bed -i output_name=dv_platinum6_chr21 -i gvcf=dv_platinum6_chr21_gvcf/NA12877.chr21.gvcf.gz -i gvcf=dv_platinum6_chr21_gvcf/NA12878.chr21.gvcf.gz -i gvcf=dv_platinum6_chr21_gvcf/NA12891.chr21.gvcf.gz -i gvcf=dv_platinum6_chr21_gvcf/NA12890.chr21.gvcf.gz -i gvcf=dv_platinum6_chr21_gvcf/NA12889.chr21.gvcf.gz -i gvcf=dv_platinum6_chr21_gvcf/NA12892.chr21.gvcf.gz -y --watch
GLnexus echoes the active configuration YAML into the job log so you can double-check the required_dp setting.
HTH & I'm sorry again for missing the inquiry before -- too many gmail accounts these days :(
Thank You very much ! On applet at DNAnexus.com it worked good. in the .yml file You provided I changed only DP settings. However, I think GLnexus still counts those with DP below threshold for AF calculation:
chr1 |
1708855 | chr1_1708855_TTCCTCCTCC_T | TTCCTCCTCC | T | 41 | . | AF=0.25;AQ=41 | GT:DP:AD:GQ:PL:RNC | ./.:3:3,0:9:.:.. | 0/0:14:3,11:8:0,30,6:.. | 0/0:73:73,0:50:.:.. | 1/1:18:0,18:40:41,48,0:.. |
---|---|---|---|---|---|---|---|---|---|---|---|---|
chr1 | 1709369 | chr1_1709369_C_T | C | T | 26 | . | AF=0.125;AQ=26 | GT:DP:AD:GQ:PL:RNC | ./.:.:.:.:.:MM | ./.:.:.:.:.:MM | 0/1:14:6,8:26:26,0,36:.. | ./.:0:0,0:1:.:.. `` |
e.g. the second variant is called only in one sample as a heterozygous genotype, although AF is 0.125.
Is this intended usage or it could be fixed somehow?
@bopohdr Yea the AF INFO field is just copy number divided by 2N.
Sorry if this isn't what you want...it has come up in discussions before and I'm open to feedback on whether the denominator should subtract non-calls. My argument is that both of these are flawed estimators with undesirable corner cases, so I prefer the simpler one. If one is really concerned with the best possible estimator then it should use the likelihood function on the het/hom genotype counts and a prior distribution on the allele frequency.
@mlin Thank You. I understand that there are flaws in both approaches of frequency calculations. As I understand it is not possible to set option for calculating frequency without non-calls in ".yml" file, which is provided for online app ? About feedback: maybe it also would be useful to have output option similar to mutation frequency in gnomad (1543 out of 245490 (that were covered at the position)).
I am trying to use GLnexus on DNAnexus platform for gvcfs produced by DeepVariant. For positions that was not covered (or by some reads only), the genotype is 0/0 - which assumes it is homozygous and is used for AF calculation. Example: `
CHROM | POS | ID | REF | ALT | QUAL | FILTER | INFO | FORMAT | ZL556| ZL553
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- chr1 | 1020217 | chr1_1020217_G_T | G | T | 0 | . | AF=0.25 | GT:DP:AD:GQ:PL:RNC | 0/0:0:0,0:1:.:.. | 0/1:3:1,2:3:0,0,26:.. chr1 | 1041134 | chr1_1041134_C_A | C | A | 33 | . | AF=0.5;AQ=33 | GT:DP:AD:GQ:PL:RNC | 0/1:4:0,4:28:33,0,29:.. | 0/1:8:1,7:16:17,0,21:.. ` First case has AF=0.25 while sample ZL556 do not have any reads at this positions.
For sentieon it was at least ./. or similar.
Is it possible to add/change/improve something in DNAnexus version to take low DP or 0 DP in account, when calculating AF and assigning genotypes ?