brentp / vcfanno

annotate a VCF with other VCFs/BEDs/tabixed files
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0973-5
MIT License

Error with postannotation #105

Closed: PedroBarbosa closed this issue 5 years ago

PedroBarbosa commented 5 years ago

Dear @brentp,

I want to join several fields into one, using the same procedure as shown in #63. However, I get the following error:

vcfanno version 0.2.8 [built with go1.8]

see: https://github.com/brentp/vcfanno
=============================================
panic: Near line 10 (last key parsed 'postannotation'): Expected a top-level item to end with a new line, comment or EOF, but got '|' instead.

goroutine 1 [running]:
main.main()
        /home/brentp/go/src/github.com/brentp/vcfanno/vcfanno.go:85 +0x19db

I know that some variants will have missing values from the annotation stage. Is that the reason? Also, is it possible to set fields to 0 when they are missing from a record?

lua.config:

function join(sep, ... )
        return table.concat({...}, tostring(sep))
end

conf file:

[[annotation]]
file="/home/pedro.barbosa/spliceai/vcfanno/whole_genome_filtered_spliceai_scores_withChr.vcf.gz"
fields=["SYMBOL","DS_AG","DS_AL","DS_DG","DS_DL","DP_AG","DP_AL","DP_DG","DP_DL"]
names=["gene_name","DS_AG","DS_AL","DS_DG","DS_DL","DP_AG","DP_AL","DP_DG","DP_DL"]
ops=["self","self","self","self","self","self","self","self","self"]

[[postannotation]]
name="SpliceAI"
fields=["gene_name","DS_AG"]
op="lua:join("|",gene_name, DS_AG)"
type="String"
Description="Format: SpliceAIv1.2.1 variant annotation. These include delta scores (DS) and delta positions (DP) for acceptor gain (AG), acceptor loss (AL), donor gain (DG), and donor loss (DL). Format: ALLELE|SYMBOL|DS_AG|DS_AL|DS_DG|DS_DL|DP_AG|DP_AL|DP_DG|DP_DL"

query sample:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
chr1    201332458       TNNT2:c.536C>T  G       A       .       .       ReMM=0.9890;ANNOVAR_DATE=2017-07-17;dpsi_max_tissue=0.4080;dpsi_zscore=1.031;ALLELE_END

command invoked:

 vcfanno -p 2 -lua custom.lua anno.conf test.vcf.gz

Another curiosity: is it possible to match alleles in a non-VCF tabixed file? Thanks in advance, Pedro

brentp commented 5 years ago

just change this op="lua:join("|",gene_name, DS_AG)" to op="lua:join('|',gene_name, DS_AG)"
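
For context, the panic comes from the TOML parser rather than from Lua: a double-quoted TOML string ends at the first unescaped double quote, so the `"` in front of `|` closes the value of op early and the parser then trips over the `|`. A minimal sketch of the corrected block, reusing the names from the config above and assuming the user-defined join(sep, ...) from the Lua file is the function being called:

```toml
[[postannotation]]
name="SpliceAI"
fields=["gene_name","DS_AG"]
# single quotes inside the double-quoted TOML string;
# Lua accepts either quote style for string literals
op="lua:join('|', gene_name, DS_AG)"
type="String"
```

As far as TOML is concerned, escaping the inner quotes (`\"|\"`) or wrapping the whole value in a single-quoted literal string should work as well, but neither variant was tried in this thread.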

> Another curiosity: is it possible to match alleles in a non-VCF tabixed file?

Yes, it just needs a header with "ref" and "alt" or "reference" and "alternate". The tabix index must also be created correctly so that vcfanno can find the chrom and position fields.

PedroBarbosa commented 5 years ago

Thank you for the quick reply.

> just change this op="lua:join("|",gene_name, DS_AG)" to op="lua:join('|',gene_name, DS_AG)"

It seems this operation only concatenates the first two fields. I would like to join all the fields passed through the ... argument. Is table.concat() unable to do this?

> Yes, it just needs a header with "ref" and "alt" or "reference" and "alternate". The tabix index must also be created correctly so that vcfanno can find the chrom and position fields.

Well, unfortunately, I'm not able to run with my test file. I added the header and tabixed the file (tabix -b2 -s1), but vcfanno throws the following error:

=============================================
vcfanno version 0.2.8 [built with go1.8]

see: https://github.com/brentp/vcfanno
=============================================
vcfanno.go:115: found 1 sources from 1 files
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x14 pc=0x552f0e]

goroutine 5 [running]:
github.com/brentp/bix.New(0xc4200168c0, 0x41, 0xc42004af68, 0x1, 0x1, 0x0, 0x0, 0x0)
        /home/brentp/go/src/github.com/brentp/bix/bix.go:101 +0x3ae
github.com/brentp/vcfanno/api.(*Annotator).Setup.func1(0xc42011b150, 0x1, 0x1, 0xc42011b140, 0x0, 0xc4200168c0, 0x41)
        /home/brentp/go/src/github.com/brentp/vcfanno/api/api.go:735 +0x1f4
created by github.com/brentp/vcfanno/api.(*Annotator).Setup
        /home/brentp/go/src/github.com/brentp/vcfanno/api/api.go:745 +0x155

The file:

#chr    pos     ref     alt     gene    score
chr1    69091   A       C       ENSG00000186092 0.029
chr1    69091   A       G       ENSG00000186092 0.053
chr1    69091   A       T       ENSG00000186092 0.002
chr1    69092   T       A       ENSG00000186092 0.075

The config:

[[annotation]]
file="/mnt/nfs/lobo/IMM-NFS/ensembl_vep/custom_data/traP/traP_v2.txt.gz"
fields=["score"]
names=["traP"]
ops=["self"]

Thanks in advance, Pedro

brentp commented 5 years ago

fields is only used for VCFs. For tab-delimited files, use e.g. columns=[5].

join takes a lua table, then a separator, so you probably want: join({gene_name, AS}, '|') but you may have written your own join function that I don't know about. Yes, you can also use table.concat.
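
If the goal is simply to glue several annotated values together, calling Lua's table.concat directly in the op sidesteps the question of which join is in scope. A minimal sketch with a shortened field list, assuming each field reaches Lua as a single string or number rather than a table of values:

```toml
[[postannotation]]
name="SpliceAI"
fields=["gene_name","DS_AG","DS_AL"]
# table.concat(list, sep) is from the Lua standard library:
# the table of values comes first, the separator second
op="lua:table.concat({gene_name, DS_AG, DS_AL}, '|')"
type="String"
```

The remaining DS_* and DP_* fields would be added to both fields and the table in the same way.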

PedroBarbosa commented 5 years ago

:( I'm sure it's something naive that is missing.

> fields is only used for VCFs. For tab-delimited files, use e.g. columns=[5].

[[annotation]]
file="/mnt/nfs/lobo/IMM-NFS/ensembl_vep/custom_data/traP/traP_v2.txt.gz"
columns=[6]
names=["traP"]
ops=["self"]

Error:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x14 pc=0x552f0e]

goroutine 5 [running]:
github.com/brentp/bix.New(0xc4200187d0, 0x41, 0xc420050f68, 0x1, 0x1, 0x0, 0x0, 0x0)
        /home/brentp/go/src/github.com/brentp/bix/bix.go:101 +0x3ae
github.com/brentp/vcfanno/api.(*Annotator).Setup.func1(0xc42011edd0, 0x1, 0x1, 0xc42011edc0, 0x0, 0xc4200187d0, 0x41)
        /home/brentp/go/src/github.com/brentp/vcfanno/api/api.go:735 +0x1f4
created by github.com/brentp/vcfanno/api.(*Annotator).Setup
        /home/brentp/go/src/github.com/brentp/vcfanno/api/api.go

> join takes a lua table, then a separator, so you probably want: join({gene_name, AS}, '|') but you may have written your own join function that I don't know about. Yes, you can also use table.concat.

I tried to write my own without success. Right now I'm using the join operation as you suggested:

[[annotation]]
file="/home/pedro.barbosa/spliceai/vcfanno/whole_genome_filtered_spliceai_scores_withChr.vcf.gz"
fields=["SYMBOL","DS_AG","DS_AL","DS_DG","DS_DL","DP_AG","DP_AL","DP_DG","DP_DL"]
names=["gene_name","DS_AG","DS_AL","DS_DG","DS_DL","DP_AG","DP_AL","DP_DG","DP_DL"]
ops=["self","self","self","self","self","self","self","self","self"]

[[postannotation]]
name="SpliceAI"
fields=["gene_name","DS_AG","DS_AL","DS_DG","DS_DL","DP_AG","DP_AL","DP_DG","DP_DL"]
op="lua:join({alt[1],gene_name, DS_AS, DS_AL, DS_DG, DS_DL, DP_AG, DP_AL, DP_DG, DP_DL}, '|')
type="String"

I'm still running into issues. I don't know where the newline character is being added, since the fields come from the [[annotation]] step.

panic: Near line 10 (last key parsed 'postannotation.op'): Strings cannot contain new lines.

goroutine 1 [running]:
main.main()
        /home/brentp/go/src/github.com/brentp/vcfanno/vcfanno.go:85 +0x19db

brentp commented 5 years ago

You need to close the double quote (") at the end of your op= line.
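
Because the op value was never closed, the TOML parser read on to the end of the line and reported a newline inside the string; no annotation data was involved. A hedged sketch of the block with the quote closed follows. Two details go beyond what this thread confirms and are flagged in the comments.

```toml
[[postannotation]]
name="SpliceAI"
fields=["gene_name","DS_AG","DS_AL","DS_DG","DS_DL","DP_AG","DP_AL","DP_DG","DP_DL"]
# the value now ends with a closing double quote
# assumption: DS_AS in the post above was a typo for DS_AG (DS_AS is not listed in fields)
# alt[1] is kept as posted; whether alt is available inside a postannotation
# was not confirmed in this thread
op="lua:join({alt[1], gene_name, DS_AG, DS_AL, DS_DG, DS_DL, DP_AG, DP_AL, DP_DG, DP_DL}, '|')"
type="String"
```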