nrnb / GoogleSummerOfCode

Main documentation site for NRNB GSoC project ideas and resources

Supporting ssGSEA and Pathway Commons in clusterProfiler #222

Closed cannin closed 1 year ago

cannin commented 1 year ago

Background

Pathway Commons

Pathway Commons (http://pathwaycommons.org/) is an aggregated database containing millions of molecular interactions. The data is collected from approximately 20 source databases. Data from Pathway Commons is accessible here.
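Pathway Commons distributes gene sets as GMT files (see "How to Start"). In the GMT format each line is one gene set: a name, a description, and then the member genes, all separated by tabs. The following is a minimal base R sketch of turning such lines into a two-column table with one row per gene set and gene. The function name `read_gmt_lines`, the pathway names, and the gene lists are hypothetical and chosen only for illustration; for a real file, `readLines()` would supply the lines.

```r
# Parse GMT text: one gene set per line, tab-separated as
# <set name> <description> <gene 1> <gene 2> ...
read_gmt_lines <- function(lines) {
  fields <- strsplit(lines, "\t", fixed = TRUE)
  sets <- lapply(fields, function(f) f[-(1:2)])        # drop name and description
  names(sets) <- vapply(fields, `[`, character(1), 1)  # first field is the set name
  sets
}

# Toy input standing in for a downloaded file (hypothetical pathways and genes)
gmt_lines <- c(
  "PathwayA\tdescription A\tTP53\tMDM2\tCDKN1A",
  "PathwayB\tdescription B\tEGFR\tKRAS"
)
sets <- read_gmt_lines(gmt_lines)

# Long format: one row per (gene set, gene) pair
term2gene <- data.frame(
  ont  = rep(names(sets), lengths(sets)),
  gene = unlist(sets, use.names = FALSE),
  stringsAsFactors = FALSE
)
print(term2gene)
```

Building the table in one step with `rep()` and `unlist()` avoids growing a data frame row by row, which matters for files with thousands of gene sets.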

clusterProfiler

clusterProfiler (https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html) provides a universal interface for gene functional annotation from a variety of sources and thus can be applied in diverse scenarios. It provides a tidy interface to access, manipulate, and visualize enrichment results to help users achieve efficient data interpretation.
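To make "enrichment results" concrete, here is a small base R sketch of the hypergeometric over-representation test, the kind of statistic that over-representation analysis relies on. It is an illustration on made-up data, not clusterProfiler's own code; the gene names and set sizes are hypothetical.

```r
# Hypothetical toy data: a background of 20 genes, one gene set, one query list
universe <- paste0("g", 1:20)
gene_set <- paste0("g", 1:5)             # genes annotated to the pathway
query    <- c("g1", "g2", "g3", "g10")   # genes of interest

N <- length(universe)                        # background size
M <- length(intersect(gene_set, universe))   # pathway genes in the background
n <- length(intersect(query, universe))      # query genes in the background
k <- length(intersect(query, gene_set))      # overlap between query and pathway

# P(X >= k) when drawing n genes without replacement from N genes,
# M of which belong to the pathway
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)
print(p_value)
```

A small p-value means the overlap is larger than expected by chance. In practice this test is repeated for every gene set and the p-values are adjusted for multiple testing.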

Goal

There are two goals: supporting Pathway Commons data in clusterProfiler and supporting ssGSEA in clusterProfiler.

How to Start

  1. Examine the Pathway Commons GMT datasets here: https://www.pathwaycommons.org/archives/PC2/v12/. The code example below may help in understanding the data. It uses readGmt() from paxtoolsr (https://bioconductor.org/packages/release/bioc/html/paxtoolsr.html) and enricher() from clusterProfiler.
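library(paxtoolsr)        # provides readGmt()
library(clusterProfiler)  # provides enricher()
# Read the Reactome gene sets as a named list: pathway name -> gene symbols
pc <- readGmt("PathwayCommons.12.Reactome.GSEA.hgnc.gmt", removePrefix = TRUE)
# Build the two-column (ont, gene) table passed to enricher() as TERM2GENE
pcGmt <- data.frame(ont=character(0), gene=character(0))
for(x in names(pc)) {
  pcGmt <- rbind(pcGmt, data.frame(ont=x, gene=pc[[x]]))
}
# saveRDS() returns NULL, so its result must not be assigned back to pcGmt
saveRDS(pcGmt, "pcGmt.rds")
# genes: a user-supplied character vector of HGNC symbols of interest
egmt <- enricher(genes, TERM2GENE=pcGmt)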
  2. Examine how other datasets are included in clusterProfiler: https://github.com/YuLab-SMU/clusterProfiler/blob/master/R/wikiPathways.R
  3. Examine ssGSEA functionality and how GSEA was implemented for clusterProfiler: https://github.com/YuLab-SMU/clusterProfiler/blob/master/R/gseAnalyzer.R
  4. Write your proposal for addressing the goals
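As background for the ssGSEA step above, the following is a simplified base R sketch of the idea behind a single-sample enrichment score: rank the genes within one sample, walk down the ranking, and sum the difference between the weighted running fraction of gene-set members and the running fraction of non-members. It is an assumption-laden illustration, not the implementation used by clusterProfiler or any other package. The function name `ssgsea_score`, the toy expression values, and the weight `alpha = 0.25` are chosen for illustration, and normalization steps are omitted.

```r
# Simplified single-sample enrichment score (illustrative sketch only)
ssgsea_score <- function(expr, gene_set, alpha = 0.25) {
  # expr: named numeric vector holding one sample's expression values
  r <- rank(expr)                         # higher expression -> higher rank
  r <- r[order(r, decreasing = TRUE)]     # walk from the top-ranked gene down
  in_set <- names(r) %in% gene_set
  stopifnot(any(in_set), any(!in_set))    # need both members and non-members

  step_hit  <- ifelse(in_set, r^alpha, 0) # members contribute rank-weighted steps
  step_miss <- ifelse(in_set, 0, 1)       # non-members contribute equal steps
  p_hit  <- cumsum(step_hit)  / sum(step_hit)
  p_miss <- cumsum(step_miss) / sum(step_miss)

  sum(p_hit - p_miss)                     # sum of the running difference
}

# Hypothetical sample with six genes
expr <- c(a = 10, b = 8, c = 6, d = 4, e = 2, f = 1)
score_high <- ssgsea_score(expr, c("a", "b"))  # set of highly expressed genes
score_low  <- ssgsea_score(expr, c("e", "f"))  # set of lowly expressed genes
print(c(high = score_high, low = score_low))
```

A gene set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom yields a negative score. Unlike standard GSEA, which takes the maximum deviation of the running sum, this variant sums the deviation over the whole ranking.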

Difficulty Level: Medium

Size and Length of Project

175 hours, 12 weeks

Skills

R

Public Repository

Potential Mentors

harshagr70 commented 1 year ago

Hi @cannin,

I have worked with R before and would like to work on this project. Can I start drafting my GSoC proposal for it?

cannin commented 1 year ago

@harshagr70 If you are still interested, yes you can draft a proposal. Helpful links:

GSoC contributor guide, NRNB project proposal template, Eligibility requirements, Full program timeline

Jigyasa-G commented 1 year ago

Hi @cannin ! I am interested in drafting a proposal for this issue. Kindly guide me on the next steps to be taken.

cannin commented 1 year ago

@Jigyasa-G thanks, take a look at "How to Start".

Jigyasa-G commented 1 year ago

Sure thing, Thank you!

Jigyasa-G commented 1 year ago

@GuangchuangYu @cannin I have a couple of ideas for the above issue. Should I include all of them in the proposal and justify which one would be better suited (with room for flexibility), or just pick one?

cannin commented 1 year ago

@Jigyasa-G The proposal should cover the two main goals: Pathway Commons and ssGSEA. If you have multiple ideas for either of these, I would include only your strongest, most detailed one; otherwise your application might not be clear enough for proposal reviewers.

Jigyasa-G commented 1 year ago

Thank you for mentioning this; I will work accordingly.

Jigyasa-G commented 1 year ago

Hey @cannin @GuangchuangYu! I have submitted a proposal. Should I start on any PRs in the meantime, if you have suggestions?

cannin commented 1 year ago

@Jigyasa-G Currently the proposals are being reviewed.

Jigyasa-G commented 1 year ago

Thank you for allowing me the opportunity to work on this project! I am super excited to move further. Kindly guide me about the next steps @cannin @GuangchuangYu

khanspers commented 1 year ago

This project is an active GSoC 2023 project. Closing this issue because it is no longer available for other contributors/students.

Jigyasa-G commented 1 year ago

@GuangchuangYu @cannin [image] I tried using the enricher() function from clusterProfiler, but it seems the function isn't available. I went through the docs but couldn't find out why it is still unsupported.

GuangchuangYu commented 1 year ago

@Jigyasa-G Did you have the clusterProfiler package installed and loaded properly?

Jigyasa-G commented 1 year ago

It's working now with the updated R version.

Jigyasa-G commented 1 year ago

To work with actual biological data, I need resources that explain the underlying biology so that I can work out the methods. @GuangchuangYu @cannin Can you please suggest some?

Jigyasa-G commented 1 year ago

[image] @GuangchuangYu @cannin All the testthat tests have passed; however, I am unable to resolve this error when running BiocCheck. Kindly help.

cannin commented 1 year ago

@Jigyasa-G please make issues/ask questions on the actual project and not here; please repost there and we'll answer them.