epam / NGB

New Genome Browser (NGB) - a Web - based NGS data viewer with unique Structural Variations (SVs) visualization capabilities, high performance, scalability, and cloud data support
MIT License

Ability to highlight "variants of interest" #391

Open NShaforostov opened 3 years ago

NShaforostov commented 3 years ago

Background

VCF files may include a large number of variations, so it would be helpful to highlight variants of interest based on INFO field values. NGB shall provide an option to configure a set of filters (highlight profiles) and to highlight variants that match a filter in the Variation table and on the VCF track.

Approach

API:

Add the ability to read the conditions that define "variants of interest" from a JSON file (e.g. interest_profiles.json). This file will be placed on the server side. Its content should be re-read on every data update (to allow hot-swapping the conditions if needed). The file should contain a list of profiles. Each profile shall include its own set of conditions by which "variants of interest" are defined.

The format of that file should be the following:

{
  "<profile_name1>" : {
    "is_default" : "<is_default_value>",
    "conditions" : [
      {
        "condition" : "<condition_set1>",
        "highlight_color" : "<highlight_color1>"
      },
      {
        "condition" : "<condition_set2>",
        "highlight_color" : "<highlight_color2>"
      },
      ...
    ]
  },
  "<profile_name2>" : {
    "conditions" : [ ... ]
  },
  ...
}

Where:

Each <condition_set> should have a structure:

(<id1> <comparison_operator1> <value1>) <logic_operator1> (<id2> <comparison_operator2> <value2>) ...

Where:

There should be the ability to make up more complex sets from several comparisons by rules of Boolean algebra, using additional brackets, e.g.: ((<comparison1>) or (<comparison2>)) and ((<comparison3>) or (<comparison4>)).
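The condition syntax above can be evaluated with a small recursive-descent parser. The following is only a minimal sketch, not NGB's implementation. It assumes that `and` binds tighter than `or`, that field ids match INFO keys case-insensitively, that values are compared numerically when both sides parse as numbers, that `in` means membership in a JSON-style list, and that a missing field never matches. The names `compile_condition`, `tokenize` and `_compare` are hypothetical.

```python
import json
import operator
import re

OPS = {"==": operator.eq, "!=": operator.ne, ">": operator.gt,
       "<": operator.lt, ">=": operator.ge, "<=": operator.le}
TOKEN_RE = re.compile(r"\s*(\(|\)|'[^']*'|==|!=|>=|<=|>|<|\b(?:and|or|in)\b)")


def tokenize(text):
    """Split a condition string into brackets, quoted operands and operators."""
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def _num(value):
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def _compare(op, actual, expected):
    if actual is None:  # assumption: a missing INFO field never matches
        return False
    if op == "in":      # assumption: membership in a list such as '[386, 435]'
        return any(_compare("==", actual, item) for item in json.loads(expected))
    a, b = _num(actual), _num(expected)
    if a is not None and b is not None:
        return OPS[op](a, b)
    return OPS[op](str(actual), str(expected))


def compile_condition(text):
    """Return a predicate that takes a dict of INFO values."""
    tokens, pos = tokenize(text), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {tok!r}")
        pos += 1
        return tok

    def either(left, right):
        return lambda info: left(info) or right(info)

    def both(left, right):
        return lambda info: left(info) and right(info)

    def parse_or():
        node = parse_and()
        while peek() == "or":
            take()
            node = either(node, parse_and())
        return node

    def parse_and():
        node = parse_factor()
        while peek() == "and":
            take()
            node = both(node, parse_factor())
        return node

    def parse_factor():
        if peek() == "(":
            take("(")
            node = parse_or()
            take(")")
            return node
        field, op, value = take(), take(), take()
        quoted = field.startswith("'") and value.startswith("'")
        if not quoted or not (op in OPS or op == "in"):
            raise ValueError(f"malformed comparison near {field!r}")
        key, expected = field[1:-1].lower(), value[1:-1]
        return lambda info: _compare(op, info.get(key), expected)

    predicate = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected trailing token {peek()!r}")
    # Lowercase the INFO keys so that 'ac' matches AC, 'Dp' matches DP, etc.
    return lambda info: predicate({k.lower(): v for k, v in info.items()})
```

A compiled condition can then be applied to the INFO values of each variant, e.g. `compile_condition("('ac' == '2') and ('mq' >= '80')")({"AC": 2, "MQ": 83.5})`.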

Example of the profile:

{
  "two_alleles_high_mapping_quality" : {
    "is_default" : true,
    "conditions" : [
      {
        "highlight_color" : "ffbdbd",
        "condition" : "('ac' == '2') and ('mq' >= '80')"
      }
    ]
  }
}

The example above describes a default profile: variants of interest are highlighted in color #FFBDBD if they match both conditions, i.e. the allele count equals 2 and the mapping quality is greater than or equal to 80.
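Reading and validating such a file could look roughly like the sketch below. It is an illustration only: the function names `parse_profiles` and `read_profiles` are hypothetical, and the checks rest on assumptions not stated in the issue (colors are six hex digits without a leading `#`, `is_default` may be a JSON boolean or the string "true", at most one profile may be the default).

```python
import json
import re

HEX_COLOR = re.compile(r"[0-9a-fA-F]{6}")


def parse_profiles(text):
    """Validate the profiles JSON; return (profiles, default_profile_name)."""
    profiles, default_name = {}, None
    for name, body in json.loads(text).items():
        rules = []
        for entry in body.get("conditions", []):
            color, condition = entry["highlight_color"], entry["condition"]
            if not HEX_COLOR.fullmatch(color):
                raise ValueError(f"{name}: bad highlight_color {color!r}")
            if not condition.strip():
                raise ValueError(f"{name}: empty condition")
            rules.append({"condition": condition, "color": "#" + color.upper()})
        profiles[name] = rules
        # Assumption: "is_default" may be a JSON boolean or the string "true".
        if str(body.get("is_default", False)).lower() == "true":
            if default_name is not None:
                raise ValueError("more than one default profile")
            default_name = name
    return profiles, default_name


def read_profiles(path):
    """Re-read the file on every call (no caching), so edits are picked up."""
    with open(path, encoding="utf-8") as handle:
        return parse_profiles(handle.read())
```

Calling `read_profiles` on each data update, rather than caching the result, is one simple way to get the hot-swap behaviour described above.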


GUI:

Users shall have the ability to enable/disable highlighting of the "variants of interest". For that, a corresponding block shall be added to the VCF tab of the main settings, e.g.: image

When the corresponding checkbox is enabled, the dropdown list with all profiles shall appear, e.g.: image

In this dropdown list:

When the user selects the profile and saves changes:

rodichenko commented 3 years ago

@NShaforostov @mzueva

API: Add the ability to read the conditions that define "variants of interest" from the JSON-file (e.g. interest_conditions.json). Such file will be placed on the server side

When the functionality is enabled:

  • at the VARIANTS panel, the row background of variants that match the "interest" conditions should be highlighted in the configured color, e.g.:

How will these conditions be calculated: on the server side (by providing the calculated result as a property of the variant, e.g. highlighted=true) or on the client?

TBD for client-side calculation: how often should we fetch those conditions: on every VCF data request / once per session / once per dataset, etc.?

mzueva commented 3 years ago

@rodichenko I think the highlight value shall be calculated on the client. I'd suggest using the existing endpoint "/defaultTrackSettings" to get these settings.

rodichenko commented 3 years ago

@mzueva sure, it's definitely the only option after the latest edit of the issue 😃

NShaforostov commented 3 years ago

Possible enhancements


1. Short profiles info in the Settings

As only admins can configure profiles, it may be useful to show information about a given profile to "general" users. For that, add an "Info" icon near the profile selected in the Settings, e.g.: image When the user clicks that icon, a short description of the profile is shown in a tooltip, e.g.: image

Here, a list of all conditions configured for the profile and the corresponding highlight colors should be displayed.
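As a rough illustration of the tooltip content, the text could be built from the profile's entries as below. The function name `describe_profile` and the one-line-per-condition layout are assumptions, not part of the proposal.

```python
def describe_profile(name, conditions):
    """Build tooltip text: the profile name, then one line per condition."""
    lines = [f"Profile: {name}"]
    for entry in conditions:
        # Colors are stored without '#', e.g. "ffbdbd"; show them as "#FFBDBD".
        color = "#" + entry["highlight_color"].upper()
        lines.append(f"{color}: {entry['condition']}")
    return "\n".join(lines)
```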


2. Highlighting on big scales

Consider how to highlight variants of interest when a big scale is set for the browser. With the implementation described above (in the issue), variants of interest may not be very noticeable at a big scale. Perhaps another approach is required for highlighting in such a case.

NShaforostov commented 3 years ago

Example of the testing profile:

{
    "testing_profile" : {
        "is_default" : true,
        "conditions" : [
            {
                "highlight_color" : "0083FF",
                "condition" : "('ReadPosRankSum' > '1') and ('ReadPosRankSum' < '3') and ('Dp' <= '200')"
            },
            {
                "highlight_color" : "FF00B2",
                "condition" : "('ReadPosRankSum' < '-1') and ('ReadPosRankSum' > '-3') and ('Dp' <= '200')"
            },
            {
                "highlight_color" : "D8FF00",
                "condition" : "(('BaseQRankSum' > '1') or ('BaseQRankSum' < '-1'))  and ('Mq' > '83') and ('Dp' > '260')"
            },
            {
                "highlight_color" : "149150",
                "condition" : "('Dp' in '[386, 435]') and ('Sor' == '1.461')"
            },
            {
                "highlight_color" : "CE632D",
                "condition" : "('Dp' in '[386, 435]') and (('Sor' == '1.461') or ('Sor' == '1.227'))"
            }
        ]
    }
}
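Note that the last two conditions of this profile can match the same variant (a listed Dp value with Sor equal to 1.461), so the client needs some rule for choosing a color. The sketch below assumes, without confirmation from the issue, that the first matching condition in file order wins. The predicates are hand-written stand-ins for the two conditions, `in` is read as list membership (also an assumption), and `pick_highlight_color` is a hypothetical name.

```python
def pick_highlight_color(info, rules):
    """Return the color of the first rule whose predicate matches, else None."""
    for predicate, color in rules:
        if predicate(info):
            return color
    return None


# Stand-ins for the last two conditions of the testing profile.
rules = [
    (lambda i: i.get("DP") in (386, 435) and i.get("SOR") == 1.461, "149150"),
    (lambda i: i.get("DP") in (386, 435) and i.get("SOR") in (1.461, 1.227), "CE632D"),
]
```

Under this assumption a variant with DP=386 and SOR=1.461 gets color 149150, while one with DP=435 and SOR=1.227 falls through to CE632D.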

Should be tested on the "dm6_data" dataset. Check the following position: