cBioPortal / GSoC

Documentation repository of Google Summer of Code (GSoC) project ideas for cBioPortal and related projects

AI/LLM Generated gene alteration and expression based subtyping for each tumor type #114

Open inodb opened 3 months ago

inodb commented 3 months ago

Background:

Goal:

Approach:

Skills needed: Prompt Engineering, Python or similar scripting language

Possible mentors: @inodb

SelahattinAksoy commented 3 months ago

Dear @inodb,

During my nearly two years on the TCGA project, I worked on drug repurposing for cancer genes, drawing on more than three years of experience in ML/AI engineering. My master's thesis focused on the use of BRCA2 and EGFR for drug repurposing. With a background in Computer Science and while pursuing a master's program in bioinformatics, I have also contributed to LLM projects at Radboud University in the Netherlands, where I worked with models like BERT and RoBERTa, took part in chatbot development, and gained experience in API-based LLM querying.

My professional journey underscores a commitment to applying machine learning and deep learning to extract insights from scientific data, notably in bioinformatics. This journey encompasses roles such as a Machine Learning Engineer at Kedi Mobile App Inc., and a Research/Teaching Assistant at Mugla University, where I've contributed extensively to the field.

In addition to my academic and professional endeavors, I've contributed to journal publications and international poster presentations, showcasing my research contributions in areas like skeletal malocclusion prediction and virus bioinformatics. Furthermore, I've been involved in various projects, such as developing diagnostic models, early diagnosis systems, and bioinformatics platforms. Noteworthy projects include the development of a system for early diagnosis of skeletal orthodontic malocclusions and a web platform for drug repurposing analysis for cancer cohorts.

I am enthusiastic about the possibility of contributing to this project, as I believe it aligns seamlessly with my career goals. To express my interest further, I would like to inquire about the possibility of scheduling a brief discussion regarding my application. Could we coordinate a convenient time to connect? I am eager to explore the potential opportunity to discuss my application in greater detail.

LinkedIn: https://www.linkedin.com/in/selahattin-aksoy-873424189/ Mail: selahattin2322@gmail.com Publications: https://scholar.google.com/citations?user=FdnFJNAAAAAJ&hl=en

Steveolas commented 3 months ago

Super interesting. I don't know much about genetics, so I may be off here, but I think it may be worth your while to check out the Mamba model. Although it is a smaller model, it can be fine-tuned, and its architecture differs from the transformers used by most other LLMs in a way that makes it better for large-context problems. Genomic problems (from my understanding) can generally have large contexts.

Ilan

sohamchatterjee50 commented 3 months ago

Hey @inodb

I am currently pursuing my Master's in Artificial Intelligence at the University of Amsterdam. Prior to this, I worked at Amazon as an Applied Scientist, where I was responsible for fine-tuning models for reranking purposes. As part of both my professional and research activities, I have worked with LLMs and prompt engineering. Through the ongoing research at UvA, I am exposed to a lot of cutting-edge research in AI for Science. I would love to contribute to an area that involves both AI and biology, especially bioinformatics.

Mail: sohamchatterjee50@gmail.com LinkedIn: https://www.linkedin.com/in/soham-chatterjee-3410abb8/ It would be great if we could connect sometime on a call.

SumitdevelopAI commented 3 months ago

Hello @inodb

My name is Sumit Sharma. I am currently pursuing my B.Tech in AI at Parul University. Could you please explain how to apply for this project?

wuyuqing0327 commented 3 months ago

Hi @inodb

My name is Yuqing Wu, and I go by Chelsea. I'm currently a master's student in Data Science at the University of Chicago. Before this program, I worked as a machine learning engineer for over 5 years. I currently have a part-time job as a Research Assistant at the Institute for Population and Precision Health, where I apply machine learning algorithms to identify the impact of the microbiome and Bacteroides on blood pressure, with the aim of preventing disease.

I'm really interested in this project. I can leverage LLMs to process text-based information and identify genes that are relevant for the molecular classification of cancer subtypes.

LinkedIn: https://www.linkedin.com/in/chelsea-uchi0327 E-mail: wuyuqing0327@gmail.com / yuqingw1@uchicago.edu

RainieFu commented 3 months ago

Hi @inodb,

I'm Rainie, a third-year undergraduate student at the University of British Columbia, majoring in Computer Science and Statistics. Currently, I am doing a full-time internship as a Bioinformatics Research Assistant at Vancouver Prostate Cancer, where my focus lies in genomic and statistical analysis within the realm of Prostate Cancer. Here, I'm deeply engaged in identifying potential biomarkers to introduce new treatment arms in clinical trials, leveraging a diverse set of technologies encompassing both biological methodologies and data-driven analytics.

My academic journey has equipped me with a solid foundation in data mining and machine learning, as well as proficiency in scripting languages like Python and R, enabling me to grasp intricate machine learning algorithms and well-known libraries with ease. I am genuinely passionate about contributing my skills and knowledge to projects that make a tangible impact. This project is the perfect intersection of my academic interests and practical skills. I'm eager to join forces, brainstorm solutions, and ultimately contribute to advancements in cancer research.

I look forward to the possibility of working together on this exciting endeavor, and I am happy to discuss my application further. Feel free to connect with me via the following:

LinkedIn: https://www.linkedin.com/in/rainie-fu/ Email: rainiefu0813@gmail.com

inodb commented 3 months ago

Hi @SelahattinAksoy @Steveolas @sohamchatterjee50 @SumitdevelopAI @wuyuqing0327 @RainieFu!

Thanks so much for your interest! Unfortunately, I'm not able to meet with everyone, but want to encourage you all to try and submit a proposal for this project if you're interested. Make sure to submit it thru the https://summerofcode.withgoogle.com/ website before 4/2!

If you're able to share a proposal in a Google Doc as well before then, I'll do my best to provide some feedback. You can send it via a DM on https://slack.cbioportal.org

Thanks so much all!

manheraa commented 3 months ago

Hi @inodb, I am Sachetan Heralagi, a student currently pursuing my undergraduate degree at KLE Institute of Technology. In my academic journey, I have had the opportunity to explore various neural network architectures, from simple ANNs to more complex models like VGG19. One of the projects I take pride in is leading a team to develop an automated irrigation system using ANN technology for weather prediction and integration with Internet of Things (IoT) devices. Our project even made it to the finals of IEEE YESIST12, which was a significant achievement for us. I have also recently worked on ovarian carcinoma subtype classification, in which we used transfer learning (i.e., VGG19).

I am particularly drawn to this project because I have worked with natural language processing and deep learning, and I am enthusiastic about the prospect of collaborating with like-minded individuals. I am also proficient in several ML and DL libraries, such as TensorFlow, Keras, and PyTorch.

LinkedIn: www.linkedin.com/in/sachetan-heralagi Mail: manuheralagi4@gmail.com