cBioPortal / GSoC

Documentation repository of Google Summer of Code (GSoC) project ideas for cBioPortal and related projects

Integrate AlphaMissense into Genome Nexus and cBioPortal #115

Open leexgh opened 8 months ago

leexgh commented 8 months ago

Background:

Goal:

  1. Integrate AlphaMissense into Genome Nexus API response.
  2. Show the AlphaMissense pathogenicity prediction and score on the Genome Nexus variant page, e.g. [screenshot]
  3. Add a new column to the cBioPortal results view mutation table that shows the AlphaMissense pathogenicity prediction and score. The column should support sorting, filtering, and download. [screenshot]
  4. Add AlphaMissense pathogenicity prediction and score into Genome Nexus annotation pipeline as two new columns in the annotation result file.
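
The column behaviour in goal 3 and the two extra file columns in goal 4 might be sketched as follows. This is a minimal TypeScript sketch, not cBioPortal or Genome Nexus code: the `AlphaMissense` and `MutationRow` types, their field names, and the three helper functions are hypothetical and exist only for illustration.

```typescript
// Hypothetical shape for one AlphaMissense annotation; names are illustrative.
interface AlphaMissense {
    pathogenicity: string; // prediction label, e.g. "likely_pathogenic"
    score: number;         // pathogenicity score
}

// Hypothetical mutation table row; real table rows carry many more fields.
interface MutationRow {
    proteinChange: string;
    alphaMissense?: AlphaMissense; // absent when no prediction is available
}

// Sorting: highest score first, rows without a score last.
function compareByScore(a: MutationRow, b: MutationRow): number {
    const sa = a.alphaMissense?.score;
    const sb = b.alphaMissense?.score;
    if (sa === undefined && sb === undefined) return 0;
    if (sa === undefined) return 1;
    if (sb === undefined) return -1;
    return sb - sa;
}

// Filtering: keep only rows whose prediction label matches.
function filterByPrediction(rows: MutationRow[], label: string): MutationRow[] {
    return rows.filter(r => r.alphaMissense?.pathogenicity === label);
}

// Download / annotation file: two tab-separated values, empty when missing.
function toColumns(row: MutationRow): string {
    const am = row.alphaMissense;
    const prediction = am ? am.pathogenicity : "";
    const score = am ? String(am.score) : "";
    return [prediction, score].join("\t");
}
```

Keeping the prediction and score as two separate values, rather than one formatted string, lets the same data drive sorting (by score), filtering (by label), and the two-column file output.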

Approach: Genome Nexus uses VEP as its primary annotation source. AlphaMissense is available as a VEP plugin: https://useast.ensembl.org/info/docs/tools/vep/script/vep_plugins.html#alphamissense The plan is to integrate AlphaMissense annotations through this plugin.

Required skills: Backend and frontend programming (Java, Spring Boot, React, TypeScript).

Possible mentors: @leexgh @zhx828
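
On the consuming side, the plugin-based approach could look roughly like the sketch below. It assumes the plugin's `am_class` and `am_pathogenicity` fields appear on each transcript consequence in VEP's JSON output. That placement is an assumption to verify against real VEP output, and the interface and function names are hypothetical (the actual Genome Nexus backend is Java).

```typescript
// Assumed subset of a VEP JSON transcript consequence. The am_* keys follow the
// AlphaMissense plugin's field names; where they sit in the JSON is an assumption.
interface VepTranscriptConsequence {
    transcript_id: string;
    am_class?: string;
    am_pathogenicity?: number | string;
}

// Hypothetical shape for what an API response might expose per transcript.
interface AlphaMissenseAnnotation {
    transcriptId: string;
    pathogenicity: string;
    score: number;
}

// Collect AlphaMissense annotations, skipping transcripts without usable values.
function extractAlphaMissense(
    consequences: VepTranscriptConsequence[]
): AlphaMissenseAnnotation[] {
    const result: AlphaMissenseAnnotation[] = [];
    for (const tc of consequences) {
        const raw = tc.am_pathogenicity;
        if (tc.am_class === undefined || raw === undefined) continue;
        // Accept the score as either a number or a numeric string.
        const score = typeof raw === "number" ? raw : parseFloat(raw);
        if (Number.isNaN(score)) continue;
        result.push({
            transcriptId: tc.transcript_id,
            pathogenicity: tc.am_class,
            score,
        });
    }
    return result;
}
```

Transcripts with no AlphaMissense values are skipped rather than given a default, so downstream code can distinguish "no prediction" from a low score.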

ZeeJJ123 commented 8 months ago

@leexgh Could you please provide guidance on how I can contribute to this project? While I have a strong background in backend and frontend development, I am unfamiliar with AlphaMissense, Genome Nexus API, and cBioPortal. Thank you.

JunheZoooo commented 8 months ago

@leexgh I have a background in Java, Spring Boot, React, and TypeScript. I'm also interested in biology. I have thought about integrating AlphaMissense. I have a few ideas and clarifications to discuss.

We could use a scheduled task or webhook to sync the data from AlphaMissense with Genome Nexus and cBioPortal. Is there an existing process that handles data inconsistencies arising from database updates? Or should we build a versioning system for genetic data to track changes over time?

I'm considering an asynchronous processing model for real-time computation. It could queue up intensive tasks to avoid blocking the main application thread. Would this fit in the current infrastructure? Or, is there a better way to handle slow operations?

I first plan to add an interactive tooltip or expandable section to cBioPortal. It will show the AlphaMissense pathogenicity scores. This should provide detailed information on demand without overwhelming the main interface. Does this align with the project's vision for maintaining a seamless user experience?

leexgh commented 8 months ago

@JunheZoooo Thanks for looking at it! Your proposal sounds good to me. Although our initial plan is to add AlphaMissense to the Genome Nexus API response through the VEP plugin (https://useast.ensembl.org/info/docs/tools/vep/script/vep_plugins.html#alphamissense), other plans are also welcome! An interactive tooltip or expandable section in cBioPortal is a good way to show the AlphaMissense scores. We also need it on the Genome Nexus website (and maybe the OncoKB website). If you are interested in this project, please join our public cBioPortal Slack channel to discuss it further: [slack.cbioportal.org](http://slack.cbioportal.org/)