Closed holtgrewe closed 2 years ago
Root Cause Analysis
We are using a LEFT JOIN in the SQL. If there are multiple ClinVar records for a variant (there is one per disease), the join generates multiple output records for that variant.
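The effect can be reproduced in isolation. The following is a minimal sketch using Python's built-in sqlite3 module, not the VarFish Server query itself; the table names, column names, and rows are hypothetical placeholders chosen only to show how a LEFT JOIN multiplies rows.

```python
import sqlite3


def left_join_rows():
    """Return the rows of a plain LEFT JOIN between variants and ClinVar records."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and data: one variant, three ClinVar records
    # (one per disease) that all point at the same variant.
    conn.executescript(
        """
        CREATE TABLE variant (id INTEGER PRIMARY KEY, description TEXT);
        CREATE TABLE clinvar (variant_id INTEGER, disease TEXT);
        INSERT INTO variant VALUES (1, 'variant-1');
        INSERT INTO clinvar VALUES
            (1, 'disease A'),
            (1, 'disease B'),
            (1, 'disease C');
        """
    )
    rows = conn.execute(
        """
        SELECT v.id, v.description, c.disease
        FROM variant AS v
        LEFT JOIN clinvar AS c ON c.variant_id = v.id
        ORDER BY c.disease
        """
    ).fetchall()
    conn.close()
    return rows


rows = left_join_rows()
# The single variant appears once per matching ClinVar record.
print(len(rows), "rows for", len({row[0] for row in rows}), "variant")
```

Because every output row carries the same variant id, an annotation attached to that id (such as a comment) would show up on all of the duplicated lines.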
Dear @holtgrewe ,
I think our problem is linked to this issue but not dependent on the "Clinvar pathogenic" preset:
In our Athenea 1.2.1 installation, when analysing NA-12878 for the reported CYP2C19:c.681G>A variant with all filters relaxed and "Gene Allowlist" set to "CYP2C19", duplicated lines are observed. For example, 57 entries were displayed for chr10:96,535,124.A>G. I commented on one of the entries and all duplicates were then shown as "commented", which indicates that the same entry is shown multiple times.
We checked whether the imported TSV contains multiple entries for this variant, but this was not the case.
Config:
Results (Snippet):
Best,
@your-highness We always join with ClinVar, so this also occurs outside of the ClinVar filter. With the filter the problem is obvious, whereas without it the duplicated variants are not so easily spotted.
Further Root Cause Analysis
Originally, we had only one ClinVar record per variant in our internal tables. This changed in the last data release, and we did not properly catch this change in VarFish Server.
Resolution Proposal
Adjust ExtendQueryPartsClinvarJoin to use a lateral join and merge all records.
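The idea of merging all ClinVar records into one row per variant can be sketched as follows. This is an illustration under stated assumptions, not the actual change to ExtendQueryPartsClinvarJoin: it runs on SQLite for easy reproduction and therefore joins against a pre-aggregated subquery instead of a lateral join, and the schema, column names, and data are hypothetical.

```python
import sqlite3


def merged_rows():
    """Return one row per variant, with its ClinVar records merged."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and data: variant 1 has three ClinVar records,
    # variant 2 has none.
    conn.executescript(
        """
        CREATE TABLE variant (id INTEGER PRIMARY KEY, description TEXT);
        CREATE TABLE clinvar (variant_id INTEGER, disease TEXT);
        INSERT INTO variant VALUES (1, 'variant-1'), (2, 'variant-2');
        INSERT INTO clinvar VALUES
            (1, 'disease A'),
            (1, 'disease B'),
            (1, 'disease C');
        """
    )
    rows = conn.execute(
        """
        SELECT v.id, v.description, c.n_records, c.diseases
        FROM variant AS v
        LEFT JOIN (
            -- Collapse all ClinVar records of a variant into a single row
            -- before joining, so the join cannot multiply variants.
            SELECT variant_id,
                   COUNT(*) AS n_records,
                   GROUP_CONCAT(disease, '; ') AS diseases
            FROM clinvar
            GROUP BY variant_id
        ) AS c ON c.variant_id = v.id
        ORDER BY v.id
        """
    ).fetchall()
    conn.close()
    return rows


for row in merged_rows():
    print(row)
```

Each variant now yields exactly one output row, and variants without any ClinVar record are still kept, with NULL in the merged columns. On PostgreSQL the same aggregation could be written as the subquery of a LEFT JOIN LATERAL, which is the form the proposal names.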
Affected Components VarFish Server
Affected Modules/Files
variants.queries
Required Architectural Changes None
Required Database Changes None
Backport Possible? Yes
Resolution Sketch
Adjust ExtendQueryPartsClinvarJoin with the proposed change.
Adjust ExtendQueryPartsClinvarJoinAndFilter to still work correctly.
Describe the bug
When using the "Clinvar pathogenic" preset, duplicate lines can appear for the same variant.
To Reproduce Steps to reproduce the behavior:
Expected behavior Only one line per variant should be shown.
Screenshots See above.
Additional context