Open ccbruels opened 1 week ago
The problem is somewhat confusing as stated: you say you want to filter for variants in intergenic regions, but the example you gave seems unrelated. Instead, the variant appears to lie in two overlapping genes (here FAM138A and OR4F5), and the problem is that matching by gene name does not work for these records. So I am unsure what it is you want.
Perhaps I picked a bad example. That variant was tagged as intergenic by ANNOVAR, but I did not look at it in a genome browser.
Looking at another, clearly intergenic variant, here is the ANNOVAR VCF output:
chr1 3439841 . A C 31.76 PASS P;ANNOVAR_DATE=2020-06-08;Func.refGene=intergenic;Gene.refGene=PRDM16\x3bARHGEF16;GeneDetail.refGene=dist\x3d1220\x3bdist\x3d14824;ExonicFunc.refGene=.;AAChange.refGene=.;Xref.refGene=.;avsnp151=rs2483250;gnomad41_genome_AF=0.8166;gnomad41_genome_AF_raw=0.8160;CLNSIG=.
My question is: how would I filter for this variant if I am looking for variants flagged as intergenic, but specifically those that might affect ARHGEF16? I have a very large list of genes, and it would be difficult to correctly list every possible gene-pair combination in order to find the intergenic variants near each one.
Hi,
I see how to filter by a gene list for most SNVs/indels in issue #1964 ("Filter a gene list").
However, I want to look at intergenic variants as well. For these, ANNOVAR puts more than one gene name in the Gene.refGene field, e.g. Gene.refGene=FAM138A\x3bOR4F5
If my genes.txt file contains only FAM138A, these intergenic variants are not included.
I'm using bcftools v1.21. My command is in the format bcftools view -i 'Gene.refGene=@genes.txt ' file.vcf
Including wildcards in the command or in the genes.txt file didn't work.
Do you have any suggestions?
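One possible workaround outside bcftools, offered as a minimal sketch and not as an official bcftools feature: a small Python filter that keeps records with Func.refGene=intergenic when any of the names in Gene.refGene is in the gene list. It assumes an uncompressed, tab-separated VCF, one gene name per line in genes.txt, and that ANNOVAR writes the separator between gene names as the literal text \x3b, as in the records quoted above. The function names (load_genes, keep_record, filter_vcf, main) are made up for this illustration.

```python
import sys

# ANNOVAR encodes ';' inside INFO values as the literal four characters \x3b
SEP = "\\x3b"


def load_genes(path):
    """Read one gene name per line, skipping blank lines."""
    with open(path) as fh:
        return {line.strip() for line in fh if line.strip()}


def parse_info(info):
    """Split a VCF INFO string into a dict; flags without '=' map to ''."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def keep_record(line, genes, func="intergenic"):
    """True if the record has Func.refGene == func and names a listed gene."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:  # not a complete VCF data line
        return False
    info = parse_info(cols[7])
    if info.get("Func.refGene") != func:
        return False
    names = info.get("Gene.refGene", "").split(SEP)
    return any(name in genes for name in names)


def filter_vcf(lines, genes):
    """Yield header lines unchanged plus the data lines that pass keep_record."""
    for line in lines:
        if line.startswith("#") or keep_record(line, genes):
            yield line


def main(genes_path, vcf_path):
    """Write the filtered VCF to standard output."""
    genes = load_genes(genes_path)
    with open(vcf_path) as vcf:
        sys.stdout.writelines(filter_vcf(vcf, genes))
```

Calling main("genes.txt", "file.vcf") prints the header plus the matching intergenic records. With ARHGEF16 in genes.txt, the chr1:3439841 record above would be kept, because PRDM16\x3bARHGEF16 is split into two separate names before the lookup.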