Closed snashraf closed 8 years ago
In order to use the fields from 1kg, you first need to annotate with them so thaty are in your VCF. Something like:
[[annotation]]
file="ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz"
fields=["EAS_AF", "AMR_AF", "AFR_AF", "EUR_AF", "SAS_AF"]
names= ["1kgEAS_AF", "1kgAMR_AF", "1kgAFR_AF", "1kgEUR_AF", "1kgSAS_AF"]
ops=["self", "self", "self", "self", "self"]
[[postannotation]]
fields= ["1kgEAS_AF", "1kgAMR_AF", "1kgAFR_AF", "1kgEUR_AF", "1kgSAS_AF"]
name= "1G_MA"
op="max"
type="Float
[[annotation]] file="ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz" fields=["EAS_AF", "AMR_AF", "AFR_AF", "EUR_AF", "SAS_AF"] names= ["1kgEAS_AF", "1kgAMR_AF", "1kgAFR_AF", "1kgEUR_AF", "1kgSAS_AF"] ops=[ "self", "self", "self", "self", "self"]
[[postannotation]] fields =["1kgEAS_AF", "1kgAMR_AF", "1kgAFR_AF", "1kgEUR_AF", "1kgSAS_AF"] name= "1G_MA" op="max" type="Float"
and I am getting this error . :( :(
vcfanno.go:156: Info Error: 1kgSAS_AF not found in row >> this error may occur many times. reporting once here... panic: interface conversion: interface is []float32, not float32
goroutine 3265 [running]: panic(0x7cf260, 0xc829325d00)
can you post the command-line that you're using, and the full conf file
[[annotation]] file="/gpfs/data_jrnas1/ref_data/Hsapiens/GRCh37/variation/ExAC.r0.3.1.sites.vep.vcf.gz" fields=["AF"] names =["ExAC_AF"] ops=["self"]
[[annotation]] file="whole_genome_SNVs.tsv.gz" columns=[6] names= ["CADD"] ops=["mean"]
[[annotation]] file="/gpfs/data_jrnas1/ref_data/Hsapiens/1000g/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz" fields=["EAS_AF", "AMR_AF", "AFR_AF", "EUR_AF", "SAS_AF"] names= ["1kgEAS_AF", "1kgAMR_AF", "1kgAFR_AF", "1kgEUR_AF", "1kgSAS_AF"] ops=[ "self", "self", "self", "self", "self"]
[[postannotation]] fields =["1kgEAS_AF", "1kgAMR_AF", "1kgAFR_AF", "1kgEUR_AF", "1kgSAS_AF"] name= "1G_MA" op="max" type="Float"
Here is commond line: vcfanno -p 4 conf.toml 01.vcf.gz
I am able to annotate with others filed from other sources and only getting problem with this max operation.
Thanks
can you show the full error message that you are seeing? I am able to run this without problems.
can you also do:
zgrep -m1 SAS_AF /gpfs/data_jrnas1/ref_data/Hsapiens/1000g/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz
and let me know the results. Maybe your genotypes VCF doesn't have SAS_AF?
No, I have SAS_AF in my genotype VCF as well. [nsyed@hpclogin1 work]$ zgrep -m1 "SAS_AF" /gpfs/data_jrnas1/ref_data/Hsapiens/1000g/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz
vcfanno.go:156: Info Error: 1kgAF not found in row >> this error may occur many times. reporting once here... panic: interface conversion: interface is []float32, not float32
goroutine 3267 [running]: panic(0x7cf260, 0xc825772800) /usr/local/src/go-git/src/runtime/panic.go:500 +0x18c github.com/brentp/vcfanno/api.asfloat32(0x7a3ac0, 0xc82bb7e000, 0x4f0000c82012f690) /usr/local/src/gocode/src/github.com/brentp/vcfanno/api/reducers.go:75 +0xac github.com/brentp/vcfanno/api.max(0xc8244ebfc0, 0x1, 0x2, 0x3, 0xc8201451a0) /usr/local/src/gocode/src/github.com/brentp/vcfanno/api/reducers.go:32 +0x55 github.com/brentp/vcfanno/api.(_Annotator).PostAnnotate(0xc8201271c0, 0xc8236084ce, 0x1, 0x3ba9, 0x3baa, 0xa0f240, 0xc82360a460, 0x0, 0x0, 0x0, ...) /usr/local/src/gocode/src/github.com/brentp/vcfanno/api/api.go:482 +0x1584 github.com/brentp/vcfanno/api.(_Annotator).AnnotateEnds(0xc8201271c0, 0xa0f8a0, 0xc8203fbf80, 0x0, 0x0, 0x0, 0x0) /usr/local/src/gocode/src/github.com/brentp/vcfanno/api/api.go:583 +0xf33 main.main.func1(0xa0f8a0, 0xc8203fbf80) /usr/local/src/gocode/src/github.com/brentp/vcfanno/vcfanno.go:151 +0x61 github.com/brentp/irelate.PIRelate.func1.1(0xc821110000, 0x190, 0x190, 0xc82221a2d0, 0xc825c76120) /usr/local/src/gocode/src/github.com/brentp/irelate/parallel.go:201 +0x4c created by github.com/brentp/irelate.PIRelate.func1 /usr/local/src/gocode/src/github.com/brentp/irelate/parallel.go:205 +0x7b
so this is a new error, yes? where is "1kgAF" coming from? I don't see that in the conf above.
Here you go,
vcfanno.go:156: Info Error: 1kgSAS_AF not found in row >> this error may occur many times. reporting once here... panic: interface conversion: interface is []float32, not float32
goroutine 3319 [running]: panic(0x7cf260, 0xc8257928c0) /usr/local/src/go-git/src/runtime/panic.go:500 +0x18c github.com/brentp/vcfanno/api.asfloat32(0x7a3ac0, 0xc8217016e0, 0x4c0000c8201436b0) /usr/local/src/gocode/src/github.com/brentp/vcfanno/api/reducers.go:75 +0xac github.com/brentp/vcfanno/api.max(0xc82502b000, 0x5, 0x8, 0x3, 0xc82014f340) /usr/local/src/gocode/src/github.com/brentp/vcfanno/api/reducers.go:32 +0x55 github.com/brentp/vcfanno/api.(_Annotator).PostAnnotate(0xc8201392c0, 0xc82315d9ec, 0x1, 0x2a0ae0, 0x2a0ae1, 0xa0f240, 0xc823494180, 0x0, 0x0, 0x0, ...) /usr/local/src/gocode/src/github.com/brentp/vcfanno/api/api.go:482 +0x1584 github.com/brentp/vcfanno/api.(_Annotator).AnnotateEnds(0xc8201392c0, 0xa0f8a0, 0xc8230f5890, 0x0, 0x0, 0x0, 0x0) /usr/local/src/gocode/src/github.com/brentp/vcfanno/api/api.go:583 +0xf33 main.main.func1(0xa0f8a0, 0xc8230f5890) /usr/local/src/gocode/src/github.com/brentp/vcfanno/vcfanno.go:151 +0x61 github.com/brentp/irelate.PIRelate.func1.1(0xc824378000, 0x190, 0x190, 0xc821190750, 0xc826c93c80) /usr/local/src/gocode/src/github.com/brentp/irelate/parallel.go:201 +0x4c created by github.com/brentp/irelate.PIRelate.func1 /usr/local/src/gocode/src/github.com/brentp/irelate/parallel.go:205 +0x7b
I am getting same error for max operations on other values as well.
This is because there are sites with multiple alternate alleles so then it looks like:
SAS_AF=0.3497,0.6472
you can use vt decompose
to get around this for now. I'm not sure how best to handle that inside of vcfanno.
You can also write your own lua function that checks if it got an array ( a lua table) or just a value.
actually, I'm working on a fix for this so that for the case of "max", we can collapse cases like this...
I have made a change so that for max() and min() it will do the expected thing even if there are multiple values for one of the overlapping sites. If you're on linux, you can try the new binary here: http://home.chpc.utah.edu/~u6000771/vcfanno_0.0.11_beta4
you may have to chmod +x it. Thanks for reporting.
Let me try now and will let you know. Thanks
Its working now . Many Thanks !!
Hi Brent !!
I am performing the max operation as below.
[[annotation]] file="/gpfs/data_jrnas1/public_data/genomeAsia100k/100kAsian.annot.cont_withmaf.vcf.gz" fields=["AF","AF_OCE","AF_NEA", "AF_SEA", "AF_SAS", "AF_AFR", "AF_AMR", "AF_WER"] names=["1000k_asia_AF","1000k_asia_AF_OCE", "1000k_asia_AF_NEA","1000k_asia_AF_SEA","1000k_asia_AF_SAS","1000k_asia_AF_AFR","1000k_asia_AF_AMR","1000k_asia_AF_WER"] ops=["self","self", "self", "self", "self", "self","self", "self"]
[[postannotation]] fields= ["1000k_asia_AF","1000k_asia_AF_OCE", "1000k_asia_AF_NEA","1000k_asia_AF_SEA","1000k_asia_AF_SAS","1000k_asia_AF_AFR","1000k_asia_AF_AMR","1000k_asia_AF_WER"] name= "1000k_asia_MAX" op="max" type="String"
I am getting output file as :
I want "Number=A" not the "Number=. " in the VCF header?
thanks for your help.
Regards, Najeeb
I think for string it defaults to ".". Why not make it a Float? you can do this by using name="1000k_asia_MAX_float"
.
[[postannotation]] fields=["af_exac_all", "af_exac_afr", "af_exac_amr", "af_exac_eas", "af_exac_nfe", "af_exac_oth", "af_exac_sas"] op="max" name="max_aaf_all" type="Float"
I am trying to do similar operation with 1K Frequency like file="ALL.wgs.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz" [[postannotation]] fields=["EAS_AF", "AMR_AF", "AFR_AF", "EUR_AF", "SAS_AF"] name= "1G_MA" op="max" type="Float
But some how I am getting any results . :vcfanno.go:156: Info Error: SAS_AF not found in header >> this error may occur many times. reporting once here...
Seems like for records SAS_AF is not available and hence not reporting any data. Could you please help on this.