nrnb / GoogleSummerOfCode

Main documentation site for NRNB GSoC project ideas and resources

Develop a Reactome Curation Support Tool that Maps Text to the Reactome Pathway Hierarchy #168

Closed: cannin closed this issue 2 years ago

cannin commented 3 years ago

Background

Reactome (https://reactome.org/) is a free, open-source, curated, and peer-reviewed pathway database. The project's curators continually add new interaction information from the scientific literature. They go through many papers, and it is not always obvious which pathway the information in a given paper belongs to.

Goal

During GSoC 2020 (https://github.com/cannin/enhance_nlp_interaction_network_gsoc2020), work was done to map publications to the Reactome hierarchy of pathways. This was done by building a vector embedding of MeSH terms (https://www.ncbi.nlm.nih.gov/mesh/) for each pathway. A cosine similarity calculation between these pathway embeddings and the vector embedding of a query's MeSH terms then identifies the most related pathway (MeSH terms for a text are provided by: https://ii.nlm.nih.gov/MTI/). The goal here is to build on this work so that the vector embedding for a publication given by a curator can be generated very quickly (from either the full text or the PubMed abstract), and to provide this as a callable service (potentially a Flask-based API).
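
To make the matching step concrete, here is a minimal sketch of ranking pathways by cosine similarity over MeSH terms. It is an illustration only, not the GSoC 2020 implementation: it assumes a simple bag-of-terms count vector in place of whatever embedding that project built, and the function names (`mesh_vector`, `cosine_similarity`, `rank_pathways`) and the toy pathway-to-term mapping are hypothetical.

```python
import math
from collections import Counter


def mesh_vector(terms):
    """Build a sparse bag-of-terms vector (MeSH term -> count).

    Assumption: a plain count vector stands in for the real embedding.
    """
    return Counter(terms)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_pathways(query_terms, pathway_terms):
    """Return (pathway, score) pairs sorted from most to least similar."""
    query = mesh_vector(query_terms)
    scores = [
        (name, cosine_similarity(query, mesh_vector(terms)))
        for name, terms in pathway_terms.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: pathway name -> MeSH terms associated with it.
PATHWAYS = {
    "Apoptosis": ["Apoptosis", "Caspases", "Mitochondria"],
    "Cell Cycle": ["Cell Cycle", "Cyclins", "Mitosis"],
}

# Hypothetical MeSH terms for a query publication (e.g. as returned by MTI).
ranked = rank_pathways(["Caspases", "Apoptosis", "Neoplasms"], PATHWAYS)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

A callable service, such as the Flask-based API mentioned above, could wrap a function like `rank_pathways` behind a single endpoint that accepts a list of MeSH terms and returns the ranked pathways.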

Difficulty Level 2

Conceptually easy, but depends on building on a previous student's work and producing a robust prototype.

Skills

Public Repository

Potential Mentors

Augustin Luna, Guanming Wu

ankits743 commented 3 years ago

Hello @cannin I find this issue interesting and would like to work on it. After going through the required resources, how should I proceed with the task? Also, what is the preferred method of contact?

cannin commented 3 years ago

@ankits743 Communication here on GitHub is best. For a proposal that you might submit, you should start preparing: 1) a CV, 2) a plan with specific places in the code that would need modification, and 3) demonstrated understanding of generating APIs (code samples always best).

anshalshukla commented 3 years ago

Hello @ankits743, are you still interested in working on this? I find my skill set aligned with this project and would like to work on it.

anshalshukla commented 3 years ago

Hey @cannin, should I share a link to my resume here? Also, is it fine if I work on a project already taken by someone else?

ankits743 commented 3 years ago

Hello @anshalshukla, yes, I'm still interested, and I'm working on the plan and code samples.

anshalshukla commented 3 years ago

Hello @ankits743, if you are fine with it, we can work on this together. I have experience working with Flask and REST APIs; if you want, I can help you with those.

AlexanderPico commented 3 years ago

Hi @anshalshukla. This is a proposed project idea for Google Summer of Code 2021. Right now, each of you should be focusing on preparing the best possible application. That involves independently researching the topic, engaging with the mentor (cannin), and following their guidance (see his 3 points above). This is a competitive program and only one student can be selected per project.

We all enjoy collaboration and sharing knowledge, but there are a few rules and restrictions in the case of GSoC-sponsored projects like these.

Please learn more about GSoC here: https://google.github.io/gsocguides/student/. And review the timeline here: https://summerofcode.withgoogle.com/how-it-works/#timeline

khanspers commented 2 years ago

Cleanup in preparation for GSoC 2022.