ShixiangWang / sigminer

🌲 An easy-to-use and scalable toolkit for genomic alteration signature (a.k.a. mutational signature) analysis and visualization in R https://shixiangwang.github.io/sigminer/reference/index.html
https://shixiangwang.github.io/sigminer/

get_sig_feature_association() error #375

Closed xiw588 closed 3 years ago

xiw588 commented 3 years ago

Hi, I am trying to apply get_sig_feature_association() to my dataset but got the following error. My understanding is that this function can be applied to any data frame, right?

get_sig_feature_association(ct_grps,ct_grps$group,ct_grps$enrich_sig)
Error: Can't subset columns that don't exist.
x Columns 1, 1, 1, 1, 1, etc. don't exist.
Run rlang::last_error() to see where the error occurred.

Thank you.

ShixiangWang commented 3 years ago

@xiw588 Yeah. But you should follow the function documentation.

get_sig_feature_association(
  data,
  cols_to_sigs,
  cols_to_features,
  type = "ca",
  method_co = c("spearman", "pearson", "kendall"),
  method_ca = stats::wilcox.test,
  min_n = 0.01,
  verbose = FALSE,
  ...
)

Arguments

data
a data.frame containing signature exposures and other features

cols_to_sigs
colnames for signature exposure

cols_to_features
colnames for other features

type
a character vector containing 'ca' for categorical variable and 'co' for continuous variable; it must have the same length as cols_to_features.

method_co
method for continuous variable, default is "spearman", could also be "pearson" and "kendall".

method_ca
method for categorical variable, default is "wilcox.test"

min_n
a minimal fraction (e.g. 0.01) or an integer number (e.g. 10) for filtering some variables with few positive events. Default is 0.01.

verbose
if TRUE, print extra messages.

...
other arguments passed to test functions, like cor.test.

The second and third arguments should be colnames of the data.frame provided by the first argument.
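
The difference between a column's name and a column's values can be sketched in base R. The toy data frame and its column names below are hypothetical and are not the reporter's data. In the original call, ct_grps$group passes the values of the column, whereas the documentation asks for its name as a string.

```r
# Hypothetical toy data; the column names are illustrative only.
df <- data.frame(s1 = c(0.2, 0.5, 0.3), f1 = c(1.4, 2.0, 1.6))

# Column names: a character vector that can be used to select columns.
cols_by_name <- c("s1", "f1")
selected <- df[, cols_by_name, drop = FALSE]

# Column values: df$s1 is the numeric content of the column, not its name.
col_values <- df$s1

is.character(cols_by_name)  # TRUE
is.character(col_values)    # FALSE
```

Arguments such as cols_to_sigs and cols_to_features expect the first form.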

xiw588 commented 3 years ago

Hi Shixiang,

Yes, the second and third arguments are indeed columns of the data frame. I tried different columns, but it failed every time.

ShixiangWang commented 3 years ago

Could you share the data, or an example dataset, that reproduces this?

xiw588 commented 3 years ago

[Screenshot: Screen Shot 2021-07-27 at 10 19 36 PM]

xiw588 commented 3 years ago

Here is a preview of the dataset. Do you need me to send you the whole dataset? Thank you for your help!

ShixiangWang commented 3 years ago

@xiw588 This is not a valid input for this function.

An example:

library(sigminer)

set.seed(1234)
a <- rnorm(100)
set.seed(2345)
b <- rnorm(100)

df <- data.frame(
  s1 = a,
  s2 = b,
  f1 = 2 * a + 1,
  f2 = 3 * b - a,
  f3 = rnorm(100)
)

asso <- get_sig_feature_association(df, c("s1", "s2"), c("f1", "f2", "f3"), type = "co")
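
For intuition about what a continuous ("co") association measures on this example data, the following base-R sketch computes a Spearman correlation for every signature/feature pair with cor.test, the test function named in the documentation. It is an illustration under that assumption and not sigminer's implementation; the `pairs` table is a hypothetical helper and does not reflect the structure of the object the package returns.

```r
# Rebuild the example data from the comment above.
set.seed(1234)
a <- rnorm(100)
set.seed(2345)
b <- rnorm(100)

df <- data.frame(
  s1 = a,
  s2 = b,
  f1 = 2 * a + 1,
  f2 = 3 * b - a,
  f3 = rnorm(100)
)

sigs <- c("s1", "s2")
feats <- c("f1", "f2", "f3")

# One row per signature/feature combination.
pairs <- expand.grid(signature = sigs, feature = feats, stringsAsFactors = FALSE)

# Spearman's rho for each pair, selecting columns by name.
pairs$rho <- mapply(
  function(s, f) unname(cor.test(df[[s]], df[[f]], method = "spearman")$estimate),
  pairs$signature, pairs$feature,
  USE.NAMES = FALSE
)

pairs
```

Because f1 is a strictly increasing function of s1, the rho for that pair is 1.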

I think what you want is https://shixiangwang.github.io/sigminer/reference/get_group_comparison.html.

You can always read the reference list: https://shixiangwang.github.io/sigminer/reference/index.html

Functions are grouped/placed closely if they are used to work together.

xiw588 commented 3 years ago

Hi Shixiang,

Thank you very much for your quick and detailed response. Actually, I want to reproduce the figures from your paper, shown below, which you also showed for show_sig_feature_corrplot() in the tutorial. So in my case, I want to use this group_comparison set. Also, a quick follow-up question about the figures in your paper: I am curious how you defined the mutated pathways. Is it a binary definition, in which a sample with certain genes of a specific pathway mutated is counted as mutated in that pathway?

[Screenshots: Screen Shot 2021-07-27 at 10 33 34 PM, Screen Shot 2021-07-27 at 10 33 43 PM]

Thank you.

ShixiangWang commented 3 years ago

I am closing this issue as another issue is open to discuss the potential problem.