AlexsLemonade / OpenPBTA-analysis

The analysis repository for the Open Pediatric Brain Tumor Atlas Project

add Ns and percents to co-occurrence tables for paper #1188

Closed jharenza closed 3 years ago

jharenza commented 3 years ago

Purpose/implementation Section

What scientific question is your analysis addressing?

Added summary stats (Ns and percents) to the co-occurrence tables for ease of manuscript writing.

What was your approach?

I added the calculations described below to the cooccur_functions.R script and reran the code to add this information to the results tables.

What GitHub issue does your pull request address?

NA

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Updates in cooccur_functions.R

Is there anything that you want to discuss further?

Added comment within the function R script:

#' @return A data frame summarizing the co-occurrence of pairs of genes in the
#'   gene list with columns `gene1`; `gene2`; counts of mutations in
#'   each category of sharing (`mut11`: both mutated; `mut10`: mutated in
#'   the first but not second gene, etc.) and summary (`n_mutated_gene1` and
#'   `n_mutated_gene2`: number of samples with gene1 or gene2 mutated);
#'   `odds_ratio` for co-occurrence; `cooccur_sign`: 1 if co-occurrence is greater than expected by chance,
#'   -1 if less frequent than expected; `p`: the Fisher's exact test p value; and a
#'   `cooccur_score` calculated as `cooccur_sign * -log10(p)`. Overall percent
#'   of samples mutated denoted as `perc_mutated_gene1` and `perc_mutated_gene2`.
#'   Percent of gene1-mutated samples in which gene2 mutations are co-occurring or mutually exclusive
#'   denoted as `perc_cooccur_or_mutexcl`.
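
For reviewers who want to sanity-check the new columns, here is a minimal sketch of how they could be derived from the four sharing counts for a single gene pair. It is not the code in cooccur_functions.R: `summarize_pair()` is a hypothetical helper, `mut01` and `mut00` are assumed to follow the same naming pattern as `mut11` and `mut10`, and the percent denominators (all samples for `perc_mutated_gene1`/`perc_mutated_gene2`, gene1-mutated samples for `perc_cooccur_or_mutexcl`) are one reading of the description above.

```r
# Hypothetical helper for illustration only; not the function in cooccur_functions.R.
# mut11: both genes mutated; mut10: gene1 only; mut01: gene2 only; mut00: neither.
summarize_pair <- function(mut11, mut10, mut01, mut00) {
  n_total <- mut11 + mut10 + mut01 + mut00
  n_mutated_gene1 <- mut11 + mut10
  n_mutated_gene2 <- mut11 + mut01

  # 2x2 table: rows = gene1 mutated (yes/no), columns = gene2 mutated (yes/no)
  tab <- matrix(c(mut11, mut10, mut01, mut00), nrow = 2, byrow = TRUE)
  p <- fisher.test(tab)$p.value

  # Sample odds ratio; Inf or NaN if an off-diagonal count is zero
  odds_ratio <- (mut11 * mut00) / (mut10 * mut01)
  cooccur_sign <- ifelse(odds_ratio > 1, 1, -1)

  # Assumption: among gene1-mutated samples, report the share that also carry
  # a gene2 mutation when the pair co-occurs (sign 1), or the share that lack
  # one when the pair trends toward mutual exclusivity (sign -1).
  n_relevant <- ifelse(cooccur_sign == 1, mut11, mut10)

  data.frame(
    n_mutated_gene1 = n_mutated_gene1,
    n_mutated_gene2 = n_mutated_gene2,
    perc_mutated_gene1 = 100 * n_mutated_gene1 / n_total,
    perc_mutated_gene2 = 100 * n_mutated_gene2 / n_total,
    odds_ratio = odds_ratio,
    cooccur_sign = cooccur_sign,
    p = p,
    cooccur_score = cooccur_sign * -log10(p),
    perc_cooccur_or_mutexcl = 100 * n_relevant / n_mutated_gene1
  )
}

# Made-up counts for 100 samples, purely to exercise the arithmetic
res <- summarize_pair(mut11 = 8, mut10 = 2, mut01 = 4, mut00 = 86)
print(res)
```

The counts in the usage line are invented and do not come from the OpenPBTA results tables. Pairs with a zero in `mut10` or `mut01` would need extra handling, since the odds ratio is then infinite or undefined.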

Is this clear?

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes

Results

What types of results are included (e.g., table, figure)?

Tables

What is your summary of the results?

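Appended 5 columns to the results tables (`n_mutated_gene1`, `n_mutated_gene2`, `perc_mutated_gene1`, `perc_mutated_gene2`, and `perc_cooccur_or_mutexcl`). All other values and plots remain unchanged.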

Reproducibility Checklist

Documentation Checklist

jharenza commented 3 years ago

thanks @jashapiro!