Closed Marozi2 closed 3 years ago
I seem to be having the same issue. Did you manage to fix this?
Hello,
No, sorry. I gave up trying to fix this. When needed, I do the refitting on the 2013 signatures by creating a mutational catalog myself instead of using a VCF. Please let me know if you find a solution.
Hi Marozi2, I just started working with SigProfilerExtractor and I was wondering how to create a mutational catalogue of my samples. Could you help me out, please? Thanks in advance.
Hi Sonia-Bedi, I'm not sure there is a public tool to convert a VCF into a mutational catalog. I did it myself. I start with the output VCF from my SNP caller. I first isolate the SNPs and get the trinucleotide context for each SNP. Then I make a file with 3 columns: Trinucleotide, REF, ALT. From that file, I remove indels, then convert purine bases to pyrimidines, because the tool only handles pyrimidines (well explained here). After that, I count the occurrences of each SNP in each trinucleotide context, and then it's a matter of formatting the file to get a proper mutational catalog for the tool.
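The steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not a SigProfiler function: the names to_channel, all_channels and build_catalog are hypothetical, the input is assumed to be (Trinucleotide, REF, ALT) rows with the trinucleotide centered on the mutated base and on the same strand as REF, and the A[C>T]G label style is assumed to match what the tool expects.

```python
from collections import Counter

# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BASES = "ACGT"
# The six pyrimidine-centered substitution classes.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def to_channel(trinucleotide, ref, alt):
    """Return a label such as 'A[C>T]G', or None for rows that are not usable SNVs."""
    tri, ref, alt = trinucleotide.upper(), ref.upper(), alt.upper()
    # Drop indels and malformed rows.
    if len(tri) != 3 or len(ref) != 1 or len(alt) != 1:
        return None
    if any(base not in BASES for base in tri + alt):
        return None
    # The middle base of the context must be the reference base.
    if ref == alt or tri[1] != ref:
        return None
    # Purine reference: switch to the opposite strand so REF becomes C or T.
    if ref in "AG":
        tri = reverse_complement(tri)
        ref = ref.translate(COMPLEMENT)
        alt = alt.translate(COMPLEMENT)
    return f"{tri[0]}[{ref}>{alt}]{tri[2]}"


def all_channels():
    """List the 96 channels: 6 substitutions x 4 five-prime x 4 three-prime bases."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five in BASES
        for three in BASES
    ]


def build_catalog(rows):
    """Count mutations per channel for one sample, including zero-count channels."""
    counts = Counter()
    for tri, ref, alt in rows:
        channel = to_channel(tri, ref, alt)
        if channel is not None:
            counts[channel] += 1
    return {channel: counts[channel] for channel in all_channels()}
```

Calling build_catalog on the rows of one sample gives one column of the catalog. Writing one such column per sample, with the channel labels as the first column, is the formatting step mentioned above; the exact header and row order should be checked against the tool's documentation.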
Thank you @Marozi2 for your reply. This sounds like a big task. I am still searching for a way to transform my VCF file either into a format that SigProfilerExtractor accepts or into a mutational catalogue. If I don't find any solution, I will just follow what you did.
Hi @Marozi2 and @Sonia-Bedi,
You can generate a mutational catalog using your preferred reference genome by using SigProfilerMatrixGenerator. You can find the details in the corresponding README, as well as in this useful Wiki page.
On the other hand, regarding the refitting analysis with custom signature files that you mentioned @Marozi2, I would like to have a closer look into it. Could you please share the signatures file that you are using? Also, is this bug happening using both vcfs and a pandas dataframe for the input mutational catalog matrix?
Thanks a lot for your interest in our tool and sorry for the late reply.
Best, Marcos
Hello,
Sorry for the late response, I was away for the last two weeks.
Here is the file (signatures_forSigProfiler.txt) I use for refitting on the 2013 signatures. I made it from this file (signatures.txt), which I took from ftp://ftp.sanger.ac.uk/pub/cancer/AlexandrovEtAl.
The bug happens when I give a VCF as input, but not when I give a mutational catalog.
Hello @Marozi2,
Please use the option check_rules=False. That should solve your issue for now. Please let me know if that's not the case or if you have any other questions.
We are currently working on a major upgrade of the tool, where we will address this issue for sure. Thanks for letting us know.
By the way, you should always download the different versions of the COSMIC signatures from https://cancer.sanger.ac.uk/signatures/downloads/. Your signatures.txt matrix can lead to some issues, since the rows have been reordered in the newest SigProfiler versions. Please check your results carefully in this regard.
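One way to check for a row-order mismatch is to compare the mutation type labels of the signature matrix with those of the catalog before refitting. The sketch below is only an illustration and not part of SigProfiler: align_rows is a hypothetical helper, and it assumes the signature matrix has already been loaded into a dict that maps a mutation type label (for example A[C>A]A) to its list of signature weights.

```python
def align_rows(signatures, catalog_order):
    """Return the signature rows as (label, weights) pairs in catalog order.

    signatures: dict mapping a mutation type label to a list of weights.
    catalog_order: list of labels in the order used by the mutational catalog.
    Raises ValueError if the two sets of labels differ.
    """
    wanted = set(catalog_order)
    missing = [label for label in catalog_order if label not in signatures]
    extra = [label for label in signatures if label not in wanted]
    if missing or extra:
        raise ValueError(
            f"label mismatch: missing={missing[:5]}, unexpected={extra[:5]}"
        )
    # Same labels on both sides, so reordering is safe.
    return [(label, signatures[label]) for label in catalog_order]
```

If the labels match but the order differs, the returned list gives the rows in the catalog's order; if the labels themselves differ, the error points to a formatting problem rather than an ordering one.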
Hope that helps and thanks again for your interest!
Hello @marcos-diazg
Indeed, it works with the option check_rules=False. Could you explain what this option implies?
Yes, I noticed that signatures.txt was not correctly formatted; that's why I created signatures_ForSigProfiler.txt.
Thank you for your help!
The check_rules option controls the application of the biological-based rules in the signature assignment process (as described in Extended Data Fig. 8b from Alexandrov et al. 2020 Nature). These rules are based on the COSMIC v3 reference mutational signatures, described in the same study and used as default by SigProfilerSingleSample. If you use a custom set of reference signatures, these rules cannot be used. As I mentioned, we are working on a major upgrade of the tool that will take this into account.
Happy to help and please reopen the issue if you have any other problems. Thanks!
Hi,
I'm currently using SigProfilerSingleSample to refit mutational catalogs from my samples against the signatures from 2013. I'm now trying to do the same with a VCF instead of a mutational catalog. I'm able to do it when I refit against the 2020 signatures, that is, the ones used by default by the tool.
My problem is that the tool stops after the creation of decomposition profile.csv if I try to refit the VCF against the signatures from 2013. The decomposition profile.csv is empty, as is the .err file.
Is it a bug, or is it not possible to use SigProfilerSingleSample with a VCF to refit against signatures other than the ones used by default?