nrnb / GoogleSummerOfCode

Main documentation site for NRNB GSoC project ideas and resources

Expanding Community Detection Algorithms in netboxr #201

Closed cannin closed 2 years ago

cannin commented 2 years ago

Background

netboxr (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0234669) is an R package for the automated discovery of biological process modules by network analysis.

Goal

The goal is to add analysis features to netboxr, specifically:

  1. The ability to support weighted nodes in the module detection
  2. The ability to parameterize the sizes of returned modules

This will also involve using the module (community) detection algorithms in igraph (https://igraph.org/r/html/latest/communities.html).
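As a rough illustration of the two features listed above, the following Python sketch folds node weights into edge weights, runs a simple deterministic label propagation, and filters the resulting modules by size. It is not netboxr or igraph code: the function names (`edge_weights_from_nodes`, `label_propagation`, `filter_by_size`), the mean-of-endpoints weighting, and the toy graph are all assumptions made for illustration. An actual implementation would presumably call igraph's community detection functions from R instead.

```python
from collections import defaultdict


def edge_weights_from_nodes(edges, node_weight):
    """Fold node weights into edge weights.

    Assumed heuristic (not from netboxr): an edge weighs the mean of its endpoints.
    """
    return {(u, v): (node_weight[u] + node_weight[v]) / 2.0 for u, v in edges}


def label_propagation(weighted_edges, max_iter=100):
    """Weighted, asynchronous label propagation with deterministic tie-breaking.

    Nodes that appear in no edge are ignored.
    """
    adj = defaultdict(dict)
    for (u, v), w in weighted_edges.items():
        adj[u][v] = w
        adj[v][u] = w

    labels = {node: node for node in adj}  # every node starts in its own module
    for _ in range(max_iter):
        changed = False
        for node in sorted(adj):
            # Sum incident edge weight per neighbouring label.
            score = defaultdict(float)
            for nbr, w in adj[node].items():
                score[labels[nbr]] += w
            best = max(score.values())
            candidates = sorted(l for l, s in score.items() if s == best)
            # Keep the current label on ties; otherwise take the smallest best label.
            if labels[node] not in candidates:
                labels[node] = candidates[0]
                changed = True
        if not changed:
            break

    modules = defaultdict(set)
    for node, label in labels.items():
        modules[label].add(node)
    return sorted((sorted(m) for m in modules.values()), key=lambda m: m[0])


def filter_by_size(modules, min_size=1, max_size=None):
    """Keep only modules whose size lies in [min_size, max_size]."""
    return [
        m for m in modules
        if len(m) >= min_size and (max_size is None or len(m) <= max_size)
    ]


# Toy graph (hypothetical): two triangles joined by a bridge, plus a separate pair.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("D", "F"), ("E", "F"), ("G", "H")]
node_weight = {"A": 2, "B": 2, "C": 1, "D": 1, "E": 2, "F": 2, "G": 1, "H": 1}

modules = label_propagation(edge_weights_from_nodes(edges, node_weight))
kept = filter_by_size(modules, min_size=3)
```

Here `min_size` and `max_size` stand in for the kind of parameters that could control the sizes of returned modules; whether filtering after detection or tuning the algorithm itself is the better approach is left open.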

Time permitting, a further goal is to process large cancer datasets from TCGA using netboxr. See this publication (https://pubmed.ncbi.nlm.nih.gov/29625050/) for a listing of the ~20 available TCGA datasets and information on the available data. The data itself will be retrieved from cBioPortal through the datahub repository (https://github.com/cBioPortal/datahub/tree/master/public).
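One possible preprocessing step for such data is turning a mutation file into a gene list. The Python sketch below assumes a tab-delimited, MAF-style file with `Hugo_Symbol` and `Tumor_Sample_Barcode` columns and optional `#` comment lines; the helper `genes_from_maf`, the `min_samples` threshold, and the inline toy data are hypothetical and only illustrate the idea.

```python
import csv
import io
from collections import defaultdict


def genes_from_maf(handle, min_samples=1):
    """Return sorted genes mutated in at least `min_samples` distinct samples.

    Assumes MAF-style columns `Hugo_Symbol` and `Tumor_Sample_Barcode`.
    """
    # Skip comment lines such as a leading "#version" header.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")

    samples_by_gene = defaultdict(set)
    for row in reader:
        samples_by_gene[row["Hugo_Symbol"]].add(row["Tumor_Sample_Barcode"])

    return sorted(g for g, s in samples_by_gene.items() if len(s) >= min_samples)


# Toy stand-in for a mutation file (made-up sample identifiers).
toy_maf = io.StringIO(
    "#version 2.4\n"
    "Hugo_Symbol\tTumor_Sample_Barcode\n"
    "TP53\tS1\n"
    "TP53\tS2\n"
    "EGFR\tS1\n"
    "PTEN\tS3\n"
    "EGFR\tS1\n"
)
gene_list = genes_from_maf(toy_maf, min_samples=2)
```

Counting distinct samples rather than rows keeps a gene with several mutations in one sample from being over-counted.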

Getting Started

Eventually, you'll need to write a proposal. Elements of this proposal should include:

  1. Results from trying out netboxr on 1-2 different datasets from cBioPortal, following the instructions in the tutorial
  2. An understanding of the igraph community detection methods

Getting Help

If you encounter bugs in the tutorial, post an issue here: https://github.com/mil2041/netboxr/issues

Difficulty Level: Easy

Navigating the documentation for the Galaxy platform may pose some difficulty.

Size and Length of Project

Size: 175 hours

Length: 12 weeks

Skills

Public Repository

Potential Mentors

Augustin Luna, Eric Minwei Liu

pramitsahoo commented 2 years ago

Hi @cannin, I am new to open source and would like to work on this project. Can you please guide me?

cannin commented 2 years ago

@pramit026 I'm closing this issue in favor of #193. Check out the section on "Getting Started" on #193 to begin.