AlexandrovLab / SigProfilerExtractor

SigProfilerExtractor allows de novo extraction of mutational signatures from data generated in a matrix format. The tool identifies the number of operative mutational signatures, their activities in each sample, and the probability for each signature to cause a specific mutation type in a cancer sample. The tool makes use of SigProfilerMatrixGenerator and SigProfilerPlotting.
BSD 2-Clause "Simplified" License

Plot produced with sig. mutations at 0 #92

Closed natir closed 3 years ago

natir commented 3 years ago

Hi,

We use SigProfiler in our cancer genome pipeline. For some samples, decomposition in verbose mode produces this output:

                                        ################ Sample 1 #################
############################# Initial Composition ####################################
   SBS1  SBS5  SBS96A
0   1.0   1.0   68.0
L2%:  0.055660557522576835
############################## Composition After Initial Remove ###############################
   SBS1  SBS5  SBS96C
0   0.0   0.0   70.0
L2%:  0.05906191827907873
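
For readers of the log above: the `L2%` lines are read here as a relative L2 error, that is, the L2 norm of the difference between the observed mutation counts and the counts reconstructed from the assigned signatures, divided by the L2 norm of the observed counts. That reading is an assumption and is not confirmed in this thread. A minimal sketch with made-up vectors (the function name `relative_l2_error` is hypothetical, not part of SigProfilerExtractor):

```python
import math


def relative_l2_error(observed, reconstructed):
    """Relative L2 error: ||observed - reconstructed||_2 / ||observed||_2.

    Assumption: this is what the "L2%" line in the verbose log reports.
    """
    diff = math.sqrt(sum((o - r) ** 2 for o, r in zip(observed, reconstructed)))
    norm = math.sqrt(sum(o ** 2 for o in observed))
    return diff / norm


# Toy vectors, not real mutation counts.
observed = [3.0, 4.0]
reconstructed = [3.0, 3.0]
error = relative_l2_error(observed, reconstructed)
print(error)
```

Under this reading, a lower value means the assigned signatures reproduce the sample's mutation profile more closely.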

Our sample matches SBS96C, but the figure produced also contains plots for SBS1 and SBS5, which is confusing. Moreover, the number of sig. mutations indicated for these signatures is 0.

Code used to produce this figure:

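    from SigProfilerExtractor import decomposition as decomp

    # output_dir is defined earlier in the pipeline (not shown)
    signatures = (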
        output_dir
        + "SBS96/Suggested_Solution/De_Novo_Solution/De_Novo_Solution_Signatures_SBS96.txt"
    )
    activities = (
        output_dir
        + "SBS96/Suggested_Solution/De_Novo_Solution/De_Novo_Solution_Activities_SBS96.txt"
    )
    samples = output_dir + "SBS96/Samples.txt"
    output = output_dir + "Deconvolution_SB96_DeNovo"
    decomp.decompose(
        signatures, activities, samples, output, genome_build="GRCh38", verbose=True
    )

mdbarnesUCSD commented 3 years ago

Hi @natir, could you please provide your log file as well as the inputs used to generate the figures?

Thanks!

natir commented 3 years ago

Hi @mdbarnesUCSD,

We work with real human medical data, so we can't provide the VCF files, only the mutational matrix. Is that enough?

Thanks

mdbarnesUCSD commented 3 years ago

Yes, that would be great!

mdbarnesUCSD commented 3 years ago

Please reopen this issue if you are still encountering this problem.

natir commented 3 years ago

Hello,

Sorry for taking so long to answer. We deal with a lot of samples, and I no longer have the data from the original problem, but I managed to keep the data from another, similar case.

Which files exactly do you need?

natir commented 3 years ago

I can't reopen this issue.

mdbarnesUCSD commented 3 years ago

Please provide the following files so that I can reproduce the plot above:

    signatures = (
        output_dir
        + "SBS96/Suggested_Solution/De_Novo_Solution/De_Novo_Solution_Signatures_SBS96.txt"
    )
    activities = (
        output_dir
        + "SBS96/Suggested_Solution/De_Novo_Solution/De_Novo_Solution_Activities_SBS96.txt"
    )
    samples = output_dir + "SBS96/Samples.txt"

natir commented 3 years ago

You can find all of this here: https://drop.infini.fr/r/xHsmyAxFLm#8sQTh/rXU10b9lgYyerBtK85gBYd1h5qmkbCNQUigHM=

Thanks for your help

mdbarnesUCSD commented 3 years ago

Hi @natir,

This is actually the default behavior. The tool tries to introduce signatures SBS1 and SBS5 in all samples. If those signatures are not assigned, they still appear, with 0 mutations attributed. If you have any more questions, please reach out.

natir commented 3 years ago

How can we change this behavior?

mdbarnesUCSD commented 3 years ago

We will be looking in the future to make updates to the decomposition module. We will look into modifying this behavior when we do so.
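
Until such an option exists, one possible workaround is to post-filter the activities table so that signatures with 0 attributed mutations are dropped before reporting. The sketch below is only an illustration: it assumes a tab-separated table with a `Samples` column followed by one column per signature, and the helper name `drop_unassigned_signatures` is hypothetical, not part of SigProfilerExtractor. It does not change the plots that `decompose` writes.

```python
import csv
import io


def drop_unassigned_signatures(rows):
    """Map each sample to its signatures with a non-zero activity.

    `rows` is an iterable of dicts with a "Samples" key and one key per
    signature (assumed layout of the activities table).
    """
    filtered = {}
    for row in rows:
        sample = row["Samples"]
        filtered[sample] = {
            sig: float(value)
            for sig, value in row.items()
            if sig != "Samples" and float(value) > 0
        }
    return filtered


# Toy table mirroring the final composition in the verbose log of this issue.
toy_table = "Samples\tSBS1\tSBS5\tSBS96C\nSample_1\t0\t0\t70\n"
reader = csv.DictReader(io.StringIO(toy_table), delimiter="\t")
assigned = drop_unassigned_signatures(reader)
print(assigned)
```

To apply this to a real run, replace the `io.StringIO` object with an opened activities file, after checking that its layout matches the assumption above.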

oliverartz commented 2 years ago

I have a similar question. I understand that SBS1 and SBS5 might appear in the decomposition plots even though 0 mutations correspond to those signatures. In that case, I would expect their contribution to the final signature to be 0%. However, I sometimes observe that SBS1 or SBS5 contributes 0 mutations but around 2-3% to the final signature. Is there an explanation for that? Where do the 2-3% come from if no mutations contribute?

Thanks for your help and the development! Great tool!