ShixiangWang / sigminer

🌲 An easy-to-use and scalable toolkit for genomic alteration signature (a.k.a. mutational signature) analysis and visualization in R https://shixiangwang.github.io/sigminer/reference/index.html
https://shixiangwang.github.io/sigminer/

Add annotations for exposure profile show_sig_exposure(mt_sig) #433

Closed xiw588 closed 1 year ago

xiw588 commented 1 year ago

Hi, I want to ask whether it is possible to include annotations by clinical variable at the bottom of the exposure profile, like the example figures I show here.

[Image: Screenshot 2023-05-16 at 1 46 11 PM]

ShixiangWang commented 1 year ago

Hi @xiw588, there is an option groups for setting one group variable (https://shixiangwang.github.io/sigminer/reference/show_sig_exposure.html). However, sigminer does not support multiple variables. As the returned plot is a ggplot object, I think it should not be hard to align the exposure profiles with clinical variables using tools like cowplot or patchwork. Or you can export the exposure and visualize it with other tools. I will not support this, as it's too customized and outside the current focus of sigminer.
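
For readers who want to try the patchwork route mentioned above, here is a minimal sketch. It does not call sigminer at all: the exposure plot is a plain ggplot2 stand-in built from made-up data, and the names expo_df, clin_df, p_expo and p_anno are hypothetical. The only assumption carried over is that the exposure plot is a ggplot object with one bar per sample, so an annotation track that shares the same sample order can be stacked underneath it.

```r
library(ggplot2)
library(patchwork)

# Made-up sample IDs; the factor levels fix the left-to-right order in BOTH plots
samples <- paste0("S", 1:6)

# Stand-in for an exposure profile (hypothetical numbers, not sigminer output)
expo_df <- data.frame(
  sample = factor(rep(samples, each = 2), levels = samples),
  signature = rep(c("Sig1", "Sig2"), times = 6),
  exposure = c(10, 5, 8, 12, 3, 9, 15, 2, 7, 7, 4, 11)
)

# Hypothetical clinical variables, one row per sample and variable
clin_df <- data.frame(
  sample = factor(rep(samples, times = 2), levels = samples),
  variable = rep(c("sex", "stage"), each = 6),
  value = c("F", "M", "F", "F", "M", "M", "I", "II", "II", "III", "I", "III")
)

# Stacked bars, one per sample, with the x axis text hidden
p_expo <- ggplot(expo_df, aes(sample, exposure, fill = signature)) +
  geom_col(width = 1) +
  labs(x = NULL) +
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())

# Annotation track: one tile per sample and clinical variable
p_anno <- ggplot(clin_df, aes(sample, variable, fill = value)) +
  geom_tile() +
  labs(x = NULL, y = NULL)

# Stack the two plots vertically and give the exposure panel most of the height
combined <- p_expo / p_anno + plot_layout(heights = c(4, 1))
combined
```

With a real exposure plot, the key point is the same: the sample factor levels in the annotation data must match the sample order used in the exposure plot, otherwise the columns will not line up.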

xiw588 commented 1 year ago

Hi Shixiang, thank you for your response! This is already very helpful! I have a follow-up question: I have ~2000 samples, and this makes the figure all black. Do you know if there is any way to decrease the width of each sample so that the expected colors are shown?

ShixiangWang commented 1 year ago

@xiw588 Yeah. You can check the options that start with rm_, especially rm_space (set rm_space = TRUE). The figure looks black because the border color of the rectangles is drawn.

xiw588 commented 1 year ago

Hi Shixiang, thank you for your helpful comments! I have one related question: do you know in what order the samples are plotted by show_sig_exposure()? I ordered the samples before passing them in, but it doesn't seem to work as expected.

ShixiangWang commented 1 year ago

@xiw588 Thanks for your question, I will check this.

ShixiangWang commented 1 year ago

Hi @xiw588, please install the latest version of sigminer from GitHub. An option samps has been added to show_sig_exposure() to filter or sort samples.

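# Install the development version from GitHub
remotes::install_github("ShixiangWang/sigminer")
library(sigminer)

# Load the toy Signature object sig2 shipped with the package
load(system.file("extdata", "toy_mutational_signature.RData",
  package = "sigminer", mustWork = TRUE
))

# Show signature exposure without gaps between samples
p1 <- show_sig_exposure(sig2, rm_space = TRUE)
p1

# Sort samples by total exposure using the samps option
expo <- sig_exposure(sig2)
show_sig_exposure(expo,
  rm_space = TRUE,
  samps = colnames(expo)[order(colSums(expo))]
)
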
xiw588 commented 1 year ago

Thanks so much Shixiang, it works!!

ShixiangWang commented 1 year ago

Happy to hear that.

xiw588 commented 1 year ago

Thank you for your help, Shixiang. I have a follow-up question. The dimension of mt_tally$nmf_matrix is 1186 × 96, but after running sig_extract() as below, I only got 1105 samples (the dimension of mt_sig$Exposure is 3 × 1105). Do you know the reason, and is there any way to estimate mutational signature exposures for all 1186 samples? Thanks!!

mt_sig <- sig_extract(mt_tally$nmf_matrix, n_sig = 3, nrun = 5)

xiw588 commented 1 year ago

Also, I got values > 1 for samples in mt_sig$Exposure. Can you please clarify why the proportions of mutational signatures are > 1, and how to avoid this? Thanks!

ShixiangWang commented 1 year ago

@xiw588 There are two types of values for representing the activity of mutational signatures: one is absolute exposure and the other is relative exposure. By default, sigminer prefers absolute exposure. You can check the other elements of mt_sig, or the functions that extract data from mt_sig, to see how to get relative exposures.
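
To make the absolute/relative distinction concrete, here is a minimal base R sketch. It assumes an exposure matrix laid out like mt_sig$Exposure in this thread (signatures in rows, samples in columns); the matrix expo_abs and its numbers are made up for illustration. sigminer may already provide relative exposures directly, for example through an argument of sig_exposure() or another element of the Signature object, but that is an assumption to verify against the package documentation.

```r
# Toy absolute exposure matrix: rows = signatures, columns = samples.
# matrix() fills column by column, so S1 = (10, 30, 60), S2 = (5, 5, 10), S3 = (0, 20, 20).
expo_abs <- matrix(
  c(10, 30, 60,
     5,  5, 10,
     0, 20, 20),
  nrow = 3,
  dimnames = list(c("Sig1", "Sig2", "Sig3"), c("S1", "S2", "S3"))
)

# Relative exposure: divide each sample (column) by its total exposure
expo_rel <- sweep(expo_abs, 2, colSums(expo_abs), "/")

# Every column now sums to 1, so no value can exceed 1
colSums(expo_rel)
```

Absolute exposures are estimated mutation counts attributed to each signature, which is why values above 1 are expected there; only the relative form is a proportion.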

xiw588 commented 1 year ago

Thanks Shixiang, I will look into that. Can you help me with the other question: why do some of the samples have no estimated signature exposure from sig_extract()?

ShixiangWang commented 1 year ago

Did you get a message like "The follow samples dropped due to null catalogue"? Otherwise, the result should contain data for all samples.

xiw588 commented 1 year ago

Yes, I received this message! This doesn't mean that there are zeros in the matrix for these samples that prevent the estimation, right? Can you please explain a bit more about the meaning of null catalogue? Or is there any documentation or reference I can look into?

Thanks!

ShixiangWang commented 1 year ago

Hi @xiw588, this means that the dropped samples have no mutations. You can double-check the data for these samples. You can also read the function that processes the data (https://github.com/ShixiangWang/sigminer/blob/HEAD/R/sig_extract.R); it is not hard to read.

https://github.com/ShixiangWang/sigminer/blob/a13159215326dd8046315cf3c0a8625daadaa1fd/R/sig_extract.R#L45-L52
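
A quick way to double-check which samples have no mutations is to look for all-zero rows in the catalogue matrix. The sketch below uses a made-up matrix named nmf_matrix and assumes the same layout as mt_tally$nmf_matrix in this thread (samples in rows, components in columns). It illustrates the idea only; the exact check inside sig_extract() may differ.

```r
# Toy catalogue matrix: rows = samples, columns = mutation components (made-up counts)
nmf_matrix <- matrix(
  c(3, 0, 2,
    0, 0, 0,
    1, 4, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3"), c("C1", "C2", "C3"))
)

# Total number of mutations per sample
total_counts <- rowSums(nmf_matrix)

# Samples whose whole catalogue is zero ("null catalogue")
dropped <- names(total_counts)[total_counts == 0]
dropped

# Samples that still carry at least one mutation
kept <- nmf_matrix[total_counts > 0, , drop = FALSE]
dim(kept)
```

A sample with zeros in some components is fine; only a sample whose entire row is zero has nothing for the decomposition to work with.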

xiw588 commented 1 year ago

Thanks Shixiang!! I got it, this is very helpful!