cBioPortal / GSoC

Documentation repository of Google Summer of Code (GSoC) project ideas for cBioPortal and related projects

Similar Patient Discovery #116

Open inodb opened 3 months ago

inodb commented 3 months ago

Background: cBioPortal is an open-source platform that provides a web interface for exploring, visualizing, and analyzing cancer genomics data, and it is widely used by researchers and clinicians worldwide. The current interface provides comprehensive tools for exploring individual patient data, including mutations, copy number variations, and clinical information, as well as for cohort exploration, analytics, and cohort comparisons. A user can find similar patients by using the interface to look for patients who, for example, have the same cancer type, have similar mutations, or received the same treatment. Currently, however, similar patients are not suggested automatically; finding them requires many manual steps. Here, we propose to develop a new web service that recommends similar patients for a user to explore, given a patient's molecular and clinical profile. In oncology, where genetic mutations and biomarkers play critical roles in determining the most effective treatments, the ability to easily find and compare similar patient cases is invaluable. Moreover, a patient similarity function within cBioPortal would let users make more effective use of the large amount of data available in the portal. With integrated similarity search, users could identify cohorts of patients based on specific criteria, compare their genomic landscapes, and analyze their treatment outcomes.

Goal: Develop a REST API that provides patient similarity information given a patient's molecular and clinical profile. For the similarity scoring, we will use an existing algorithm.
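
To make the goal a bit more concrete, here is a minimal sketch of what such an endpoint could look like, using only the Python standard library. Everything specific in it is an assumption for illustration: the `/similar-patients` path, the `patientId` and `topK` query parameters, the response field names, and the toy patient data. The Jaccard score over mutated gene sets is only a stand-in; the project would call the existing similarity algorithm instead.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical toy data: patient ID -> set of mutated genes.
PATIENTS = {
    "P1": {"TP53", "KRAS", "EGFR"},
    "P2": {"TP53", "KRAS"},
    "P3": {"BRAF"},
    "P4": {"TP53", "EGFR", "PIK3CA"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets (placeholder for the real algorithm)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_patients(patient_id, top_k=3):
    """Return (HTTP status, JSON-serializable body) for a similarity query."""
    if patient_id not in PATIENTS:
        return 404, {"error": f"unknown patient: {patient_id}"}
    query = PATIENTS[patient_id]
    scored = [
        {"patientId": other, "score": round(jaccard(query, genes), 3)}
        for other, genes in PATIENTS.items()
        if other != patient_id
    ]
    # Highest score first; ties broken by patient ID for stable output.
    scored.sort(key=lambda r: (-r["score"], r["patientId"]))
    return 200, {"patientId": patient_id, "similar": scored[:max(top_k, 0)]}


class SimilarityHandler(BaseHTTPRequestHandler):
    """Serves GET /similar-patients?patientId=...&topK=... as JSON."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/similar-patients":
            status, body = 404, {"error": "not found"}
        else:
            params = parse_qs(url.query)
            patient_id = params.get("patientId", [""])[0]
            try:
                top_k = int(params.get("topK", ["3"])[0])
                status, body = similar_patients(patient_id, top_k)
            except ValueError:
                status, body = 400, {"error": "topK must be an integer"}
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Start the sketch server; not called automatically."""
    HTTPServer(("localhost", port), SimilarityHandler).serve_forever()
```

Calling `serve()` and requesting `/similar-patients?patientId=P1&topK=2` would return the two highest-scoring patients as JSON. Keeping the scoring in a plain function, separate from the HTTP handler, is one way to let the real algorithm be swapped in without touching the transport code.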

Approach: We will develop a backend web service for an existing Python-based algorithm that generates a model for identifying similar patients. This web service will provide a RESTful API that lets the cBioPortal frontend communicate with the patient similarity model. The endpoints will be designed to handle real-time data exchange, using JSON as the data format. To keep the patient similarity model up to date whenever new cBioPortal data is added to the system, we propose to use event-driven triggers. When new data enters the system, we rerun the pipeline to regenerate the model and redeploy the backend web service. Whenever a user visits the frontend page, it will use this new backend web service. This ensures that the frontend displays the most current data, improving the user experience when exploring patient similarities. Additionally, to keep the system efficient and prevent overload, it is crucial to tune the data payload size and update frequency based on user interaction and system capabilities.
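
The event-driven update described in the approach could be sketched roughly as follows. This is a simplification under stated assumptions: instead of redeploying the service, it swaps a rebuilt model in place inside one process, and the names `SimilarityModelStore`, `on_data_added`, and `build_gene_index` are hypothetical. The "model" here is just a gene index standing in for whatever the existing pipeline produces.

```python
import threading


class SimilarityModelStore:
    """Holds the current model and swaps in a rebuilt one on data events."""

    def __init__(self, build_model):
        # Hypothetical hook: callable(data) -> model, i.e. the pipeline.
        self._build_model = build_model
        self._lock = threading.Lock()
        self._model = None
        self._version = 0

    def on_data_added(self, data):
        """Event handler: rebuild outside the lock, then swap atomically."""
        new_model = self._build_model(data)
        with self._lock:
            self._model = new_model
            self._version += 1
            return self._version

    def current(self):
        """Return (version, model) as a consistent pair for request handlers."""
        with self._lock:
            return self._version, self._model


def build_gene_index(records):
    """Toy stand-in for the pipeline: group mutated genes by patient."""
    index = {}
    for patient_id, gene in records:
        index.setdefault(patient_id, set()).add(gene)
    return index


store = SimilarityModelStore(build_gene_index)
# Each call simulates a "new data imported" event with the full data set.
store.on_data_added([("P1", "TP53"), ("P2", "KRAS")])
store.on_data_added([("P1", "TP53"), ("P2", "KRAS"), ("P3", "BRAF")])
version, model = store.current()
```

Because the rebuild happens before the lock is taken, requests keep being served from the previous model until the new one is ready. A redeploy-based setup, as proposed above, would achieve the same effect at the service level.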

Needed skills: Understanding of RESTful APIs, familiarity with Python

Possible mentors: @Thahmina

domgor11 commented 3 months ago

Hi @inodb, I'm a backend software engineer with 2 years of experience, and I will be starting a master's in Health Data Science this fall. I have experience building cloud backend services with APIs that process large volumes of incoming data. The described approach sounds fairly straightforward to me, and I'm confident that I can build this backend service.

LinkedIn: https://www.linkedin.com/in/dominika-gorgosz Email: dominikagorgosz@icloud.com

It would be great to have a call or to discuss your precise expectations over email.

DininduChamikara commented 3 months ago

Hi @inodb, I am a fresh graduate and would like to contribute to this project as a GSoC 2024 participant. I have worked on some Natural Language Processing tasks, including work with clustering models. I would like to clarify some details about this project.

Email - dininduchamikara99@gmail.com

mdkintu commented 3 months ago

Hi @inodb, where do I submit the proposal?