cannin closed this issue 3 years ago
Interesting. It seems the preprint hasn't been published yet? Would it be possible for a student to help update the preprint as a co-author after finishing the above goals and doing some additional analyses? @cannin
@chilampoon you are correct, the preprint is not published. If the summer code work is included as part of the final publication, it would be natural for the student to be included. I have written papers in the past with GSoC students.
@cannin Got it. What are the expectations for this project to get integrated into the final publication? I do have some thoughts on it, I'll contact you later on since it's still in the organization application period now.
@chilampoon there are others involved in the project, so I can only speak for myself. The code has to 1) add new features, 2) be tested, and 3) be documented.
@cannin I have gone through the algorithm. Great algorithm! I have a couple of questions. Currently the algorithm takes 3 input data types (exp, cna, mut). Which additional data types need to be added? A brief description would help. Also, what does the "mapping" in the project goal refer to?
@patelaryan7751 apologies, I never saw this message. 1) It is not so much about which additional types. exp, cna, and mut are actually handled in one of 2 ways: as discrete or as continuous data. It would be nice to abstract the code so that, for example, 5 data sets could be used (e.g., 3 discrete and 2 continuous). The biological details of the data types are unnecessary for a GSoC student. 2) The code generates similarity values between objects (i.e., cell lines and tumor samples). You could think of these objects as nodes in a graph, with an edge between two objects whenever their similarity is above a threshold (e.g., 0.5).
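To make point 1 concrete, here is a minimal Python sketch of what "abstracting the code" over any number of discrete and continuous data sets could look like. Everything in it is an assumption for illustration: the `Dataset` class, the two similarity functions (fraction of matching values for discrete data, rescaled Pearson correlation for continuous data), and the plain average used to combine them are hypothetical stand-ins, not TumorComparer's actual method.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Sequence


@dataclass
class Dataset:
    """Hypothetical container: one data set of a given kind."""
    name: str
    kind: str  # "discrete" or "continuous"
    profiles: Dict[str, Sequence[float]]  # sample id -> feature vector


def discrete_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Fraction of features with identical values (illustrative choice)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def continuous_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation rescaled from [-1, 1] to [0, 1] (illustrative choice)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.5  # correlation undefined for constant vectors; treat as neutral
    return (cov / sqrt(var_a * var_b) + 1) / 2


# Dispatch on the kind of data, not on a hard-coded data type name.
SIMILARITY_BY_KIND = {
    "discrete": discrete_similarity,
    "continuous": continuous_similarity,
}


def combined_similarity(datasets: List[Dataset], sample_a: str, sample_b: str) -> float:
    """Unweighted mean of per-data-set similarities (an assumption, not the real weighting)."""
    scores = [
        SIMILARITY_BY_KIND[d.kind](d.profiles[sample_a], d.profiles[sample_b])
        for d in datasets
    ]
    return sum(scores) / len(scores)


# Toy data: one discrete and one continuous data set; any number could be listed.
datasets = [
    Dataset("mut", "discrete", {"cell_line_1": [1, 0, 1, 0], "tumor_1": [1, 0, 0, 0]}),
    Dataset("exp", "continuous", {"cell_line_1": [1.0, 2.0, 3.0], "tumor_1": [2.0, 4.0, 6.0]}),
]
score = combined_similarity(datasets, "cell_line_1", "tumor_1")
```

Because the functions are looked up by `kind`, adding a fourth or fifth data set only means appending another `Dataset` to the list; no per-type code path is needed.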
This is an active GSoC 2021 project. The issue will be closed for the duration of GSoC since it is no longer available to other students.
Background
TumorComparer (https://www.biorxiv.org/content/10.1101/028159v1) is an algorithm for mapping experimental model systems (i.e., cell lines) to patient samples by examining various -omic profiles.
Goal
The goal is to 1) expand the utility of the algorithm to additional data types (it is currently hard-coded to 3 data types), 2) provide examples of patient-to-patient similarity, and 3) provide a network visualization (using igraph: https://igraph.org/) of the resulting mapping.
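As a rough illustration of goal 3, the sketch below turns pairwise similarity scores into the edge list of a network by keeping only pairs above a cutoff. The function name `similarity_edges`, the sample names, the scores, and the default threshold of 0.5 are all made up for this example; they are not taken from the TumorComparer code.

```python
from typing import Dict, List, Tuple


def similarity_edges(
    similarities: Dict[Tuple[str, str], float],
    threshold: float = 0.5,  # example cutoff, an assumption
) -> List[Tuple[str, str, float]]:
    """Keep one weighted edge per pair whose similarity is strictly above the threshold."""
    return sorted(
        (a, b, score)
        for (a, b), score in similarities.items()
        if a != b and score > threshold
    )


# Hypothetical similarity scores between cell lines and tumor samples.
similarities = {
    ("cell_line_1", "tumor_1"): 0.82,
    ("cell_line_1", "tumor_2"): 0.31,
    ("cell_line_2", "tumor_2"): 0.64,
    ("tumor_1", "tumor_2"): 0.50,  # exactly at the threshold, so no edge
}
edges = similarity_edges(similarities)
```

An edge list of `(source, target, weight)` tuples like this could then be handed to a graph library for layout and plotting, for instance via `Graph.TupleList(edges, weights=True)` in python-igraph.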
Difficulty Level 1
Much of the work will be refactoring existing code, which will serve as a guide for the student.
Skills
Public Repository
Potential Mentors
Augustin Luna