AlexandrovLab / SigProfilerClusters

A tool for analyzing the inter-mutational distances between SNV-SNV and INDEL-INDEL mutations. It separates mutations into clustered and non-clustered groups on a sample-dependent basis.
BSD 2-Clause "Simplified" License

rainfallPlots is missing. #12

Closed xysj1989 closed 2 years ago

xysj1989 commented 2 years ago

Dear developer,

I tried to test the example code for SigProfilerClusters at https://osf.io/qpmzw/wiki/5.%20Quick%20Start%20Example/

My input file is BRCA-UK_SP116353.snv_mnv.vcf

My code is as follows:

from SigProfilerMatrixGenerator.scripts import SigProfilerMatrixGeneratorFunc as matGen
from SigProfilerMatrixGenerator import install as genInstall
from SigProfilerSimulator import SigProfilerSimulator as sigSim
from SigProfilerClusters import SigProfilerClusters as hp

Output_Dir = "/Users/z-1-Test/SigProfilerClusters-test"

sigSim.SigProfilerSimulator("BRCA", Output_Dir, "GRCh37", contexts = ['288'], chrom_based=True, simulations=100)
hp.analysis("BRCA", "GRCh37", "96", ["96"], Output_Dir)

However, in my output Dir (/Users/z-1-Test/SigProfilerClusters-test/plots), I can only get BRCA_intradistance_plots_288_corrected.pdf.

There is no rainfallPlots_clustered_project_corrected.pdf

Is there something wrong with the example code or the current package? Or are there extra parameters needed for the rainfall plots?

Thanks

ebergstr commented 2 years ago

Hi,

You will need to set subClassify=True. The rainfall plot relies on the subclassifications of clustered events, so without these categories, the rainfall plot is not produced.

I will close this issue, but please feel free to reopen if you experience further issues!

xysj1989 commented 2 years ago

Thanks for your reply. This time I tried the following commands:

hp.analysis("BRCA", "GRCh37", "96", ["96"], Output_Dir, subClassify=True)

hp.analysis("BRCA", "GRCh37", "96", ["288"], Output_Dir, analysis="all", sortSims=True, subClassify=True, correction=True, calculateIMD=True, max_cpu=4, TCGA=True, sanger=False)

But the rainfall plot is still missing.

Here is the output from my terminal, with no errors reported.

======================================
Beginning SigProfilerClusters Analysis

Calculating mutational distances...Completed!
Determining sample-dependent intermutational distance (IMD) cutoff...Completed!

Analyzing clustered mutations...
Starting matrix generation for SNVs and DINUCs...Completed! Elapsed time: 2.06 seconds.
Matrices generated for 1 samples with 0 errors. Total of 2257 SNVs, 701 DINUCs, and 0 INDELs were successfully analyzed.

Analyzing non-clustered mutations...
Starting matrix generation for SNVs and DINUCs...Completed! Elapsed time: 1.83 seconds.
Matrices generated for 1 samples with 0 errors. Total of 1198 SNVs, 0 DINUCs, and 0 INDELs were successfully analyzed.

Plotting SigProfilerClusters Results...Completed!

MousumyCSE commented 2 years ago

Hi, could you add a trailing forward slash (/) to the output directory path and run it again? For example: Output_Dir = "/Users/z-1-Test/SigProfilerClusters-test/"

I hope that solves your problem. Let me know if you have any further issues.

xysj1989 commented 2 years ago

The code works now! Thanks very much for your reply! This is really an interesting point. Without the slash, everything goes well, except for subclassification.
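
One plausible explanation for this behaviour, which nothing in this thread confirms, is that some output paths are built by plain string concatenation, so a missing trailing slash fuses the directory name with whatever comes next. The Python sketch below only illustrates that failure mode and a defensive workaround. The helper with_trailing_slash and the "output/" suffix are hypothetical names chosen for illustration and are not part of SigProfilerClusters.

```python
def with_trailing_slash(path: str) -> str:
    """Return path ending in exactly one forward slash (hypothetical helper)."""
    return path if path.endswith("/") else path + "/"


output_dir = "/Users/z-1-Test/SigProfilerClusters-test"

# Assumed failure mode: naive concatenation fuses the directory name
# with the next path component when the trailing slash is missing.
broken = output_dir + "output/"

# Normalizing the directory first keeps the components separate.
fixed = with_trailing_slash(output_dir) + "output/"

print(broken)
print(fixed)
```

Normalizing Output_Dir this way before passing it to sigSim.SigProfilerSimulator and hp.analysis should make the script insensitive to how the path was typed, assuming the trailing slash is indeed what the tool expects.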