varfish-org / varfish-server

VarFish: comprehensive DNA variant analysis for diagnostics and research
MIT License

PEDIA integration into VarFish #399

Open ahujameg opened 2 years ago

ahujameg commented 2 years ago

Is your feature request related to a problem? Please describe. This is a request to integrate PEDIA into VarFish. PEDIA is an approach for prioritizing exome data using facial image analysis. Including phenotypic scores such as PEDIA in VarFish would let users prioritize variants based on the face scores.

Describe the solution you'd like We can add a PEDIA score option (like MutationTaster) to the pathogenicity scoring method drop-down in the Prioritization tab of the Filter Variants page. When the user selects the PEDIA score option and clicks the 'Filter & Display' button, they are redirected to an external PEDIA page. The filtered VCF is sent to this external PEDIA server through the REST API. After logging in, the user can submit the facial images and HPO terms for the individuals and click the 'Get PEDIA score' button. The user is then redirected back to VarFish along with the calculated PEDIA scores, and the results table shows an additional column with the PEDIA score.
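To make the last step concrete, here is a minimal sketch of how returned scores could be attached to the result rows as an extra column. It assumes the scores come back keyed by gene symbol and that each result row is a dict with a `gene` field; `add_pedia_column` and both data shapes are hypothetical and not part of VarFish.

```python
from __future__ import annotations


def add_pedia_column(rows: list[dict], pedia_scores: dict[str, float]) -> list[dict]:
    """Return copies of the rows with a 'pedia_score' column, best score first.

    ASSUMPTION: rows carry a 'gene' key and scores are keyed by gene symbol.
    """
    annotated = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's rows are not modified
        new_row["pedia_score"] = pedia_scores.get(row["gene"])  # None if no score
        annotated.append(new_row)
    # Rows with a score come first, ordered from highest to lowest score;
    # rows without a score keep their relative order at the end.
    return sorted(
        annotated,
        key=lambda r: (r["pedia_score"] is None, -(r["pedia_score"] or 0.0)),
    )


if __name__ == "__main__":
    rows = [{"gene": "GENE_A"}, {"gene": "GENE_B"}, {"gene": "GENE_C"}]
    for row in add_pedia_column(rows, {"GENE_B": 2.0, "GENE_A": 0.5}):
        print(row)
```

Keeping unscored rows at the end rather than dropping them means the filter result itself is unchanged; only the ordering and the extra column differ.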

Describe alternatives you've considered We considered uploading images through the VarFish interface itself instead of redirecting to an external page. This is not a good solution, as VarFish should not be used for data management.

Additional context PEDIA (https://pedia-study.org/) GestaltMatcher (https://www.gestaltmatcher.org/)

The following tickets are sub-tasks of this feature:

  1. https://github.com/bihealth/varfish-server/issues/596
  2. https://github.com/bihealth/varfish-server/issues/1125

ahujameg commented 2 years ago

  1. VarFish will request the facial scores (Gestalt scores), along with the HPO terms, from the GestaltMatcher DB service.
  2. In the next step, VarFish will get the CADA scores from the CADA web service.
  3. Next, VarFish will send the CADD, CADA, and Gestalt scores to PEDIA, whose support vector machine is trained on these three scores.
  4. PEDIA will return the final PEDIA score, which will be shown to the user in the table and should also be exportable in the report. Additionally, we can show the user the progress of each step in the workflow.
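
The steps above could be orchestrated roughly as follows. This is only a sketch: the three `fetch_*` callables stand in for the GestaltMatcher, CADA, and PEDIA services, whose real request formats are not given in this issue, and it assumes that CADA is queried with the HPO terms obtained in step 1. All names are hypothetical.

```python
def run_pedia_workflow(case_id, cadd_scores, fetch_gestalt, fetch_cada,
                       fetch_pedia, on_progress=print):
    """Run the three remote steps in order and report progress after each.

    ASSUMPTIONS (not from the issue): the callables' signatures, and that
    fetch_gestalt returns a (gestalt_scores, hpo_terms) pair.
    """
    total = 3

    on_progress(f"[1/{total}] requesting Gestalt scores and HPO terms")
    gestalt_scores, hpo_terms = fetch_gestalt(case_id)

    on_progress(f"[2/{total}] requesting CADA scores")
    cada_scores = fetch_cada(hpo_terms)

    on_progress(f"[3/{total}] requesting PEDIA scores")
    pedia_scores = fetch_pedia(
        cadd=cadd_scores, cada=cada_scores, gestalt=gestalt_scores
    )
    return pedia_scores


if __name__ == "__main__":
    # Stub services so the sketch runs without any network access.
    scores = run_pedia_workflow(
        "case-1",
        {"GENE_A": 20.0},
        fetch_gestalt=lambda case_id: ({"GENE_A": 0.25}, ["HP:0000001"]),
        fetch_cada=lambda hpo_terms: {"GENE_A": 0.5},
        fetch_pedia=lambda cadd, cada, gestalt: {
            gene: cadd[gene] + cada[gene] + gestalt[gene] for gene in cadd
        },
    )
    print(scores)
```

Passing the service calls in as parameters keeps the workflow testable with stubs, and the `on_progress` callback is one way to surface the per-step progress mentioned in step 4.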

This figure shows the different requests and interactions: [image]

ahujameg commented 2 years ago

The CADA REST API endpoint is working and available at https://cada.gene-talk.de/api/process. There was an internal discussion here, and we are planning to integrate CADA first. I will open a separate ticket as a sub-task for this. Would it make more sense to include the CADA option under 'Phenotype Prioritization' on the left, in the Algorithm drop-down, or under 'Pathogenicity Prioritization', in the scoring drop-down?
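
For discussion, a request to that endpoint might be prepared like this. Only the URL comes from this thread; the JSON body (an `hpo_terms` list), the POST method, and the helper name `build_cada_request` are assumptions, since the request format is not described here. The sketch builds the request without sending it.

```python
import json
import urllib.request

# Endpoint as quoted in the comment above.
CADA_URL = "https://cada.gene-talk.de/api/process"


def build_cada_request(hpo_terms):
    """Prepare (but do not send) a JSON POST request for the CADA endpoint.

    ASSUMPTION: the payload shape {"hpo_terms": [...]} is illustrative only;
    the actual CADA API may expect different fields.
    """
    body = json.dumps({"hpo_terms": list(hpo_terms)}).encode("utf-8")
    return urllib.request.Request(
        CADA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_cada_request(["HP:0000001"])
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending it would be a matter of passing the request to `urllib.request.urlopen`, once the real payload format has been confirmed with the CADA maintainers.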