brentp / vcfanno

annotate a VCF with other VCFs/BEDs/tabixed files
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0973-5
MIT License

Extract specific field from one column in BED #77

Closed maggie-fu closed 7 years ago

maggie-fu commented 7 years ago

Hi @brentp,

I am wondering if it is possible to extract one field from a column of a BED file when that column contains multiple fields. For example, I have a BED file that is formatted like this:

chr1 812997 812998 ID=gssvG1;Name=gssvG1;variant_type=CNV;variant_sub_type=Gain;outer_start=812998;inner_start=837847;inner_end=1477469;outer_end=1708649;inner_rank=10;num_variants=3;variants=nssv24621,nssv1423341,nssv1440790;num_studies=2;Studies=Perry2008,Park2010;num_platforms=2;Platforms=AgilentCustom_015685+015686+244K,Agilent24M;number_of_algorithms=1;algorithms=ADM2;num_samples=3;samples=NA18968,NA18969,NA19221;Frequency=5.45%;PopulationSummary=African

Everything after "ID" belongs to the fourth column. Is there any way to annotate my query with only the "Frequency" field, for example?

brentp commented 7 years ago

This is not possible in vcfanno. You should instead convert those fields to separate columns beforehand; then you can use the existing machinery in vcfanno.
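
One way to do the conversion suggested above is a small preprocessing script. The following is only a sketch, not part of vcfanno: the function name `extract_field` is hypothetical, and it assumes that the fourth column is a `;`-separated list of `key=value` pairs, as in the example line, and that dropping the trailing `%` from the value is wanted so the column is numeric.

```python
def extract_field(bed_line, key="Frequency", missing="."):
    """Return a tab-separated BED line whose 4th column holds only `key`.

    Assumption: the 4th column is a ';'-separated list of key=value pairs.
    """
    # Split on whitespace at most three times, so everything after the
    # third column is kept together as the attribute string.
    chrom, start, end, attrs = bed_line.rstrip("\n").split(None, 3)

    # Build a lookup of the key=value pairs, skipping items without '='.
    fields = dict(item.split("=", 1) for item in attrs.split(";") if "=" in item)

    # Assumption: a trailing '%' is not wanted in the output column.
    value = fields.get(key, missing).strip().rstrip("%")
    return "\t".join([chrom, start, end, value])
```

Applied to every line of the BED file, this gives a four-column file in which the fourth column is the chosen field (or `.` when the key is absent). That file would then need to be sorted, bgzipped, and tabix-indexed before it is referenced from the vcfanno configuration by column number.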