lh3 / bgt

Flexible genotype query among 30,000+ samples whole-genome
MIT License

getting private variants from an individual #9

Closed by pontikos 8 years ago

pontikos commented 8 years ago

I'm trying to get variants seen in only one sample X in the cohort.

I'm trying:

bgt view -s,X -f 'AC1>0&&AC<=2' -G cohort.bgt

It's not working. How can I achieve this?

lh3 commented 8 years ago

It does not work because you are querying overlapping samples. Here is a not-so-clean workaround: add the sample name as a new field. For example:

cp cohort.bgt.spl cohort.bgt.spl.bak
awk -v FS="\t" -v OFS="\t" '{print $0,"name:Z:"$1}' cohort.bgt.spl.bak > cohort.bgt.spl
bgt view -s,X -s 'name!="X"' -f 'AC1>0&&AC2==0' -G cohort.bgt

Here, sample group 1 consists of only "X"; sample group 2 consists of everything except "X". Your new cohort.bgt.spl should look like:

sample1  ...  name:Z:sample1
sample2  ...  name:Z:sample2

Then you can use -s 'name!="sample1"', etc., to select samples.
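
To check the awk step before overwriting the real sample file, it can be tried on a toy table first. This is only a sketch: the file names toy.spl and toy.named.spl and the gender field are made up for illustration, and it assumes the .spl file is tab-delimited with the sample name in the first column, as the awk command above implies.

```shell
# Toy sample table (hypothetical contents): tab-delimited, sample name in column 1.
printf 'sample1\tgender:Z:M\nsample2\tgender:Z:F\n' > toy.spl

# Same awk transform as in the workaround: append a name:Z:<sample> field to each row.
awk -v FS="\t" -v OFS="\t" '{print $0,"name:Z:"$1}' toy.spl > toy.named.spl

# Inspect the result; every row should now end with its own name:Z: field.
cat toy.named.spl
```

If the output looks right, the same command can be applied to cohort.bgt.spl, keeping the .bak copy in case the original is needed again.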