Closed Marozi2 closed 3 years ago
Hi,
Thanks for your question. I am not familiar with the data and results uploaded to the ICGC portal. However, I have recent results from analyzing the PCAWG dataset with our current tool, and I have uploaded the results extracted from Biliary-AdenoCA. Please check whether they match the results you extracted.
Biliary-AdenoCA.zip
Thanks, Mishu
Hi,
Thank you very much. Could you please send me the input file and the exact SigProfilerExtractor command you used, with all options, so that I can try to reproduce your result? My results are not the same. Here is my result: Biliary.AdenoCA.zip
Thank you
JOB_METADATA.txt
I have attached the parameters I used; you will find them in the JOB_METADATA.txt file. Alternatively, you can send me your JOB_METADATA.txt file.
Thanks, Mishu
I know this issue is closed, but just throwing this out: I think the number of signatures you are running is likely to be part of the problem.
I have not done exactly this kind of analysis, but I have done a lot of related ones. Changing the number of signatures will definitely change the loadings a fair amount.
Hi,
I'm trying to reproduce some results with SigProfiler, but I haven't succeeded. For each cancer type, I built mutational catalogs from the file
WGS_PCAWG.96.csv
downloaded from https://dcc.icgc.org/releases/PCAWG/mutational_signatures/Input_Data_PCAWG7_23K_Spectra_DB/Mutation_Catalogs_--_Spectra_of_Individual_Tumours/WGS_PCAWG_2018_02_09.zip
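For readers following along, here is a minimal sketch of how the per-cancer-type catalogs could be built. It assumes that the CSV has two annotation columns (`Mutation type`, `Trinucleotide`) followed by sample columns named `CancerType::SampleID`, and that SBS96 labels of the form `A[C>A]A` are wanted. The function name, the layout, and the demo sample IDs are illustrative assumptions, not something confirmed in this thread.

```python
import csv
import io
from collections import defaultdict


def split_catalog_by_cancer_type(csv_text):
    """Split a combined SBS96 catalog into one catalog per cancer type.

    Assumed layout (not confirmed here): two annotation columns, then
    sample columns named "CancerType::SampleID".
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    # Group sample column indices by the prefix before "::".
    columns_by_type = defaultdict(list)
    for idx, name in enumerate(header[2:], start=2):
        columns_by_type[name.split("::", 1)[0]].append(idx)

    catalogs = {
        ctype: {"samples": [header[i] for i in idxs], "rows": []}
        for ctype, idxs in columns_by_type.items()
    }

    for row in reader:
        mutation, trinuc = row[0], row[1]
        # Build a label such as "A[C>A]A" from "C>A" and "ACA".
        label = f"{trinuc[0]}[{mutation}]{trinuc[2]}"
        for ctype, idxs in columns_by_type.items():
            counts = [int(row[i]) for i in idxs]
            catalogs[ctype]["rows"].append((label, counts))
    return catalogs


# Tiny made-up example; the sample IDs are hypothetical.
demo = """Mutation type,Trinucleotide,Biliary-AdenoCA::SP1,Biliary-AdenoCA::SP2,Liver-HCC::SP3
C>A,ACA,10,7,1
C>A,ACC,3,0,2
"""
catalogs = split_catalog_by_cancer_type(demo)
```

Each per-type catalog could then be written out as a tab-separated matrix, which is the kind of file the "matrix" input type appears to expect; check the SigProfilerExtractor documentation for the exact header it requires.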
I ran SigProfilerExtractor on each cancer type, extracting between 1 and 7 signatures, with a refit to COSMIC signatures version 3.2. I was expecting to get results very close to
SigProfilier_PCAWG_WGS_probabilities_SBS.csv
downloaded from https://dcc.icgc.org/releases/PCAWG/mutational_signatures/Attributions_to_Each_Mutational_Class/SP_Attributions_to_Each_Mutational_Class/SigProfilier_PCAWG_WGS_probabilities_SBS.csv
Command used:
sig.sigProfilerExtractor("matrix", outputfile, inputcatalog, seeds="random", reference_genome="GRCh37", opportunity_genome="GRCh37", matrix_normalization="gmm", cosmic_version=3.2, resample=True, context_type="SBS96", exome=False, minimum_signatures=1, maximum_signatures=7, nmf_test_conv=1000, nmf_replicates=10, clustering_distance="cosine", min_nmf_iterations=3000, refit_denovo_signatures=True, nmf_init="random", nnls_add_penalty=0.05, nnls_remove_penalty=0.01, initial_remove_penalty=0.05, make_decomposition_plots=True, get_all_signature_matrices=False, cpu=cpu)
I compared the output file
Decomposed_Mutation_Probabilities.txt
from my run with SigProfilier_PCAWG_WGS_probabilities_SBS.csv
but unfortunately the results differ (different percentages in different signatures). Another issue is that 3 or 4 of the 7 signatures are often left unmatched to COSMIC signatures after the refit, and these "new" signatures usually account for most of the mutation probabilities. I also tried running SigProfilerSingleSample on the same data to avoid the problem of unmatched signatures, still hoping to get results close to SigProfilier_PCAWG_WGS_probabilities_SBS.csv. Again, the results are really different from that file.
Could you please explain where this non-reproducibility comes from? Does SigProfilier_PCAWG_WGS_probabilities_SBS.csv correspond to the SigProfilerExtractor output for WGS_PCAWG.96.csv? Are there any additional steps between the SigProfilerExtractor output and SigProfilier_PCAWG_WGS_probabilities_SBS.csv?
Thank you.
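To put a number on "different percentages in different signatures", the two probability tables could be compared row by row. The sketch below assumes both files have already been parsed into dictionaries keyed by (sample, mutation type), each mapping signature names to probabilities. The parsing step, the function names, and the demo values are hypothetical. Cosine similarity is used here only as one convenient agreement score.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two {signature: probability} dicts.

    Signatures missing from one side are treated as probability 0.
    """
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def compare_tables(mine, reference):
    """Score every (sample, mutation type) key present in both tables."""
    shared = sorted(set(mine) & set(reference))
    return {key: cosine_similarity(mine[key], reference[key]) for key in shared}


# Made-up values: one row agrees, one row is attributed entirely to an
# unmatched de novo signature (named "SBS96A" here for illustration).
mine = {
    ("SP1", "A[C>A]A"): {"SBS1": 0.5, "SBS5": 0.5},
    ("SP1", "A[C>A]C"): {"SBS96A": 1.0},
}
reference = {
    ("SP1", "A[C>A]A"): {"SBS1": 0.5, "SBS5": 0.5},
    ("SP1", "A[C>A]C"): {"SBS5": 1.0},
}
scores = compare_tables(mine, reference)
```

Rows where an unmatched de novo signature carries most of the probability score near 0 against the reference, which makes it easy to see how much of the disagreement comes from unmatched signatures rather than from shifted weights among shared COSMIC signatures.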