varfish-org / varfish-server

VarFish: comprehensive DNA variant analysis for diagnostics and research
MIT License

Clinvar-based filtration leads to duplicate lines #565

Closed holtgrewe closed 2 years ago

holtgrewe commented 2 years ago

Describe the bug When using the "Clinvar pathogenic" preset, the same variant can appear on multiple lines of the result table.

To Reproduce Steps to reproduce the behavior:

  1. Filter the NA12878 trio with "Clinvar pathogenic" filter.
  2. See the result: image

Expected behavior Only one line per variant should be shown.

Screenshots See above.

Additional context

holtgrewe commented 2 years ago

Root Cause Analysis We are using a LEFT JOIN in the SQL. If there are multiple ClinVar records for a variant (there is one per disease), the join generates one output row per record.
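The effect described above can be reproduced in isolation. The following is a minimal sketch using Python's built-in `sqlite3` module, not VarFish's actual query: the table names `small_variant` and `clinvar`, their columns, and all rows are made up for illustration.

```python
import sqlite3

# Toy schema and made-up rows; the table and column names are illustrative
# assumptions, not VarFish's actual schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE small_variant (chrom TEXT, pos INTEGER, ref TEXT, alt TEXT);
    CREATE TABLE clinvar (
        chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, disease TEXT
    );
    INSERT INTO small_variant VALUES ('1', 100, 'A', 'G'), ('1', 200, 'C', 'T');
    -- Three ClinVar records (one per disease) for the first variant only.
    INSERT INTO clinvar VALUES
        ('1', 100, 'A', 'G', 'disease A'),
        ('1', 100, 'A', 'G', 'disease B'),
        ('1', 100, 'A', 'G', 'disease C');
    """
)

# A plain LEFT JOIN emits one output row per matching ClinVar record.
duplicated_rows = conn.execute(
    """
    SELECT v.chrom, v.pos, v.ref, v.alt, c.disease
    FROM small_variant AS v
    LEFT JOIN clinvar AS c
      ON c.chrom = v.chrom AND c.pos = v.pos
     AND c.ref = v.ref AND c.alt = v.alt
    ORDER BY v.pos, c.disease
    """
).fetchall()

# Count how many result rows each variant produced.
rows_per_variant = {}
for chrom, pos, ref, alt, _disease in duplicated_rows:
    key = (chrom, pos, ref, alt)
    rows_per_variant[key] = rows_per_variant.get(key, 0) + 1

for key, count in rows_per_variant.items():
    print(key, count)
```

In this toy data the variant with three ClinVar records is returned three times, while the variant without any ClinVar record is returned once with a NULL `disease`.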

your-highness commented 2 years ago

Dear @holtgrewe ,

I think our problem is linked to this issue but not dependent on the "Clinvar pathogenic" preset:

In our Athenea 1.2.1 installation, when analysing NA-12878 for the reported CYP2C19:c.681G>A variant with all filters relaxed and "Gene Allowlist" set to "CYP2C19", duplicated lines are observed, e.g. for chr10:96,535,124.A>G 57 entries were displayed. I commented on one of the entries and all duplicates were then shown as "commented", which indicates that the same entry is shown multiple times.

We checked whether the imported TSV contains multiple entries for the variant, but this was not the case.

Config: grafik

Results (Snippet): grafik

Best,

holtgrewe commented 2 years ago

@your-highness We always join with ClinVar, so this also occurs outside of the ClinVar filter. With the filter the duplicates are obvious, whereas without it they are not so easily spotted.

holtgrewe commented 2 years ago

Further Root Cause Analysis Originally, we had only one ClinVar record per variant in our internal tables. This changed with the last data release, and we did not properly catch the change in VarFish Server.

holtgrewe commented 2 years ago

Resolution Proposal Adjust ExtendQueryPartsClinvarJoin to use a lateral join and merge all ClinVar records of a variant into a single joined row.
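The idea of merging all ClinVar records before joining can be sketched as follows. This is not the actual `ExtendQueryPartsClinvarJoin` change: SQLite has no `LATERAL`, so the runnable part swaps in a pre-aggregated subquery (`GROUP BY` on the variant key), which likewise yields at most one joined row per variant. The table names, columns, and rows are made up, and the PostgreSQL string is only an assumption of what a lateral variant might look like; it is not executed here.

```python
import sqlite3

# Toy schema and made-up rows; names are illustrative assumptions only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE small_variant (chrom TEXT, pos INTEGER, ref TEXT, alt TEXT);
    CREATE TABLE clinvar (
        chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, disease TEXT
    );
    INSERT INTO small_variant VALUES ('1', 100, 'A', 'G'), ('1', 200, 'C', 'T');
    INSERT INTO clinvar VALUES
        ('1', 100, 'A', 'G', 'disease A'),
        ('1', 100, 'A', 'G', 'disease B'),
        ('1', 100, 'A', 'G', 'disease C');
    """
)

# Merge the ClinVar records per variant first, then LEFT JOIN the merged
# result, so each variant appears exactly once in the output.
merged_rows = conn.execute(
    """
    SELECT v.chrom, v.pos, v.ref, v.alt, c.n_records, c.diseases
    FROM small_variant AS v
    LEFT JOIN (
        SELECT chrom, pos, ref, alt,
               COUNT(*) AS n_records,
               GROUP_CONCAT(disease, '; ') AS diseases
        FROM clinvar
        GROUP BY chrom, pos, ref, alt
    ) AS c
      ON c.chrom = v.chrom AND c.pos = v.pos
     AND c.ref = v.ref AND c.alt = v.alt
    ORDER BY v.pos
    """
).fetchall()

for row in merged_rows:
    print(row)

# Hypothetical PostgreSQL form with a lateral join (not run in this sketch).
POSTGRES_LATERAL_SKETCH = """
SELECT v.*, c.n_records, c.diseases
FROM small_variant AS v
LEFT JOIN LATERAL (
    SELECT COUNT(*) AS n_records, array_agg(cv.disease) AS diseases
    FROM clinvar AS cv
    WHERE cv.chrom = v.chrom AND cv.pos = v.pos
      AND cv.ref = v.ref AND cv.alt = v.alt
) AS c ON TRUE
"""
```

With the toy data, each variant is returned exactly once: the first carries a record count of 3 and the concatenated disease names, the second has NULL in both merged columns. Which fields to merge, and how, depends on what the result table needs to display.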

Affected Components VarFish Server

Affected Modules/Files variants.queries

Required Architectural Changes None

Required Database Changes None

Backport Possible? Yes

Resolution Sketch

holtgrewe commented 2 years ago