labsquare / cutevariant

A standalone and free application to explore genetics variations from VCF file
https://cutevariant.labsquare.org/
GNU General Public License v3.0

VQL for set operation #56

Closed dridk closed 4 years ago

dridk commented 5 years ago

It would be great to have a grammar for creating sets. For instance, to select de novo mutations (present only in the child):

CREATE denovo = child - (father & mother)

It would run the following query:

 SELECT chr,pos FROM variant EXCEPT (SELECT chr,pos FROM father INTERSECT SELECT chr,pos FROM mother)
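A minimal sketch of how this set expression could be run on SQLite through Python's `sqlite3` module. The tables `child`, `father` and `mother`, the helper names and the toy rows are assumptions for illustration, not cutevariant's actual schema. SQLite generally does not accept a parenthesised compound SELECT as the right-hand side of EXCEPT, so the sketch wraps the intersection in a subquery instead.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with three hypothetical per-sample tables."""
    conn = sqlite3.connect(":memory:")
    rows = {
        "child": [("chr1", 100), ("chr1", 200), ("chr2", 300)],
        "father": [("chr1", 100), ("chr2", 300)],
        "mother": [("chr1", 100), ("chr1", 200)],
    }
    for table, variants in rows.items():
        conn.execute(f"CREATE TABLE {table} (chr TEXT, pos INTEGER)")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", variants)
    return conn


def denovo_variants(conn):
    """Evaluate child - (father & mother) as a compound SELECT."""
    query = """
        SELECT chr, pos FROM child
        EXCEPT
        SELECT chr, pos FROM (
            SELECT chr, pos FROM father
            INTERSECT
            SELECT chr, pos FROM mother
        )
        ORDER BY chr, pos
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(denovo_variants(build_demo_db()))
```

With the toy data, only the variant shared by both parents is removed from the child's set.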

Aluriak commented 5 years ago

Happy to see that you finally got my idea!

To handle operator chaining, you will have to define operator precedence and proper AST generation/compilation. That can be tedious, and maybe unwanted.

I have recipes for that; take a look at this.
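To make the precedence and AST point concrete, here is a small recursive-descent sketch using only the standard library. It is not the recipe linked above. The precedence (`&` binding tighter than `|` and `-`, all left-associative), the `|` union operator, the tuple-based AST and the `chr, pos` columns in `to_sql` are assumptions chosen for the example.

```python
import re

# One token per match: a set name, an operator, or a parenthesis.
TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|[&|()\-])")
SQL_OPS = {"&": "INTERSECT", "|": "UNION", "-": "EXCEPT"}


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Return a nested tuple AST such as ('-', 'child', ('&', 'a', 'b'))."""
    tokens = tokenize(text)
    index = 0

    def peek():
        return tokens[index] if index < len(tokens) else None

    def advance():
        nonlocal index
        token = peek()
        index += 1
        return token

    def operand():
        token = advance()
        if token == "(":
            node = expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or not (token[0].isalpha() or token[0] == "_"):
            raise SyntaxError(f"expected a set name, got {token!r}")
        return token

    def intersection():  # higher precedence level: '&'
        node = operand()
        while peek() == "&":
            advance()
            node = ("&", node, operand())
        return node

    def expression():  # lower precedence level: '|' and '-'
        node = intersection()
        while peek() in ("|", "-"):
            op = advance()
            node = (op, node, intersection())
        return node

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


def to_sql(node):
    """Compile the AST to a compound SELECT, using subqueries for grouping."""
    if isinstance(node, str):
        return f"SELECT chr, pos FROM {node}"
    op, left, right = node
    return (f"SELECT chr, pos FROM ({to_sql(left)}) "
            f"{SQL_OPS[op]} SELECT chr, pos FROM ({to_sql(right)})")
```

Each precedence level gets its own function, so adding an operator means adding one loop rather than reworking the whole parser.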

dridk commented 5 years ago

Some examples of VQL:

Display variants

SELECT chr,pos FROM all WHERE genotype("sacha").is_hetero == True AND phenotype("sacha").blue_eye =True

Create selections

 CREATE setA = SELECT FROM all WHERE gene=CFTR ; 
 CREATE setB = all - (setA & setB)

Delete selections

 DELETE setA 

Context parameters

SET truc=3
SET genome=hg19 

Examples of scripts:

De novo mutation selection:

CREATE mother = SELECT FROM all WHERE genotype("mother").isHomo OR genotype("mother").isHetero;  
CREATE father = SELECT FROM all WHERE genotype("father").isHomo OR genotype("father").isHetero;
CREATE child = SELECT FROM all WHERE genotype("child").isHetero; 
CREATE denovo = child - (mother & father) ; 

Recessive mutations

CREATE mother = SELECT FROM all WHERE genotype("mother").isHetero;
CREATE father = SELECT FROM all WHERE genotype("father").isHetero;
CREATE child = SELECT FROM all WHERE genotype("child").isHomo;
CREATE recessive = child & (mother & father);

Find mutations in a large family

CREATE malade = SELECT FROM all WHERE phenotype("*").blue_eye = True AND genotype("*").isHetero

dridk commented 5 years ago

Grammar proposal:

CREATE setA FROM variant WHERE pos = 3

Model:
    Selection|Creation;

Selection:
    select=SelectClause from=FromClause where=WhereClause?
;
Creation:
    'CREATE' id=ID from=FromClause where=WhereClause?
;
SelectClause:
    'SELECT' columns*=Field[',']
;

FromClause:
    'FROM' table=ID
;
WhereClause:
    'WHERE' expression=Expr
;

Aluriak commented 5 years ago

Why not just use the current VQL grammar and allow either SELECT or CREATE as the first token?

dridk commented 5 years ago

Yes, I can refactor the grammar that way.
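A rough sketch of what the shared form could look like, written as a standard-library regex rather than the real VQL grammar. The statement head is either `SELECT <columns>` or `CREATE <name>`, and both reuse the same FROM and WHERE parts. The function name `parse_statement`, the returned dictionary layout and the requirement of at least one column are assumptions for illustration; the WHERE expression is kept as a raw string.

```python
import re

# The head differs (SELECT columns | CREATE name); FROM and WHERE are shared.
STATEMENT_RE = re.compile(
    r"""^\s*
    (?:SELECT\s+(?P<columns>.+?)|CREATE\s+(?P<name>[A-Za-z_]\w*))
    \s+FROM\s+(?P<source>[A-Za-z_]\w*)
    (?:\s+WHERE\s+(?P<where>.+?))?
    \s*;?\s*$""",
    re.IGNORECASE | re.VERBOSE | re.DOTALL,
)


def parse_statement(text):
    """Parse one SELECT or CREATE statement into a plain dictionary."""
    match = STATEMENT_RE.match(text)
    if match is None:
        raise SyntaxError(f"not a valid statement: {text!r}")
    if match.group("name") is not None:
        head = {"kind": "create", "name": match.group("name")}
    else:
        columns = [c.strip() for c in match.group("columns").split(",")]
        head = {"kind": "select", "columns": columns}
    # Both statement kinds share the same source and filter fields.
    return {**head, "source": match.group("source"), "where": match.group("where")}


if __name__ == "__main__":
    print(parse_statement("CREATE setA FROM variant WHERE pos = 3"))
    print(parse_statement("SELECT chr,pos FROM variant"))
```

Only the first token and what follows it change between the two statement kinds, which is the part a factored grammar would isolate in its own rule.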