Add prioritization in SQL format

antonylebechec commented 2 months ago

To improve prioritization, prioritization profiles could include SQL string, instead of column filter definition. This will allow calculations of scores, flag... using multiple annotations and conditions. Also, adding PZClass in order to classify variant depending on prioritization criteria.

[x] SQL syntax
[x] PZClass

antonylebechec commented 2 months ago

New prioritization profiles format (old format compatible), including SQL syntax, INFO/PZClass and extra sections (e.g. "_description"):


{
    "default": {
        "_description": "Default prioritization profile",
        "_version": "1.0.0",
        "DP": [
            {
                "type": "gte",
                "value": "50",
                "fields": ["DP"],
                "score": 5,
                "flag": "PASS",
                "comment": [
                    "DP higher than 50"
                ]
            },
            {
                "type": "lt",
                "value": "50",
                "fields": ["DP"],
                "score": 0,
                "flag": "FILTERED",
                "comment": [
                    "DP lower than 50"
                ]
            }
        ],
        "CLNSIG": [
            {
                "type": "equals",
                "value": "pathogenic",
                "fields": ["CLNSIG"],
                "score": 15,
                "flag": "PASS",
                "comment": [
                    "Described on CLINVAR database as pathogenic"
                ]
            },
            {
                "type": "equals",
                "value": "non-pathogenic",
                "fields": ["CLNSIG"],
                "score": -100,
                "flag": "FILTERED",
                "comment": [
                    "Described on CLINVAR database as non-pathogenic"
                ]
            }
        ],
        "Class": [
            {
                "sql": " DP >= 100 OR regexp_matches(CLNSIG, 'Pathogenic') ",
                "fields": ["DP", "CLNSIG"],
                "score": 100,
                "flag": "PASS",
                "class": "PM1,PM2",
                "comment": [
                    "Described on CLINVAR database as pathogenic, classified as PM1 and PM2"
                ]
            },
            {
                "sql": ["DP >= 200", "OR regexp_matches(CLNSIG, 'Pathogenic')"],
                "fields": ["DP", "CLNSIG"],
                "score": 200,
                "flag": "PASS",
                "class": ["PM1", "PM2"],
                "comment": [
                    "Described on CLINVAR database as non-pathogenic, classified as PM1 and PM3"
                ]
            }
        ]
    }
}

antonylebechec commented 2 months ago

done.

bioinfo-chru-strasbourg / howard

Add prioritization in SQL format #265