Closed. MikeACG closed this issue 10 months ago.
UPDATE: Actually, the program no longer complained about missing simulations, but it then hit an error. Here is the log:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/mike/.local/lib/python3.8/site-packages/SigProfilerClusters/hotspot.py", line 707, in calculateSampleIMDs
    regions = densityCorrection(densityMuts, densityMutsSim, windowSize)
  File "/home/mike/.local/lib/python3.8/site-packages/SigProfilerClusters/hotspot.py", line 544, in densityCorrection
    sims = random.sample(list(densityMutsSim.keys()), 10)
  File "/usr/lib/python3.8/random.py", line 363, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mike/.local/lib/python3.8/site-packages/SigProfilerClusters/SigProfilerClusters.py", line 669, in analysis
    regions, imds = hotspot.hotSpotAnalysis(project, genome, contexts, simContext, ref_dir, windowSize, processors, plotIMDfigure, exome, chromLengths, binsDensity, original, signature, percentage, firstRun, clustering_vaf, calculateIMD, chrom_based, correction)
  File "/home/mike/.local/lib/python3.8/site-packages/SigProfilerClusters/hotspot.py", line 1059, in hotSpotAnalysis
    r.get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
ValueError: Sample larger than population or is negative
The call was:
hp.analysis("BRCA_sigprofclust", "GRCh37", "96", ["288"], "../producedData/BRCA_sigprofclust/", correction=True, includedVAFs=False, exome=True)
Maybe the program is indeed untested for custom-range simulations, and that is why the exome argument was omitted from the documentation on purpose?
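For anyone reading the traceback: the ValueError comes from Python's random.sample, which refuses to draw more items than the population holds. The sketch below reproduces that in isolation and shows one possible defensive pattern (capping the sample size). The dictionary contents and the helper name sample_simulations are made up for illustration; this is not the SigProfilerClusters code and not necessarily how the maintainers fixed it.

```python
import random


def sample_simulations(density_muts_sim, k=10):
    """Pick up to k simulation keys, never more than are available.

    Hypothetical helper for illustration only; not part of SigProfilerClusters.
    """
    keys = list(density_muts_sim.keys())
    return random.sample(keys, min(k, len(keys)))


# Assumed stand-in for densityMutsSim: only 3 simulations are available.
few_sims = {f"sim_{i}": [] for i in range(3)}

# Asking for 10 items out of 3 raises the same error seen in the log.
try:
    random.sample(list(few_sims.keys()), 10)
    message = None
except ValueError as err:
    message = str(err)

# The capped version returns all 3 keys instead of raising.
picked = sample_simulations(few_sims)
```

Whether silently using fewer simulations is acceptable for the density correction is a separate question; the sketch only shows why the call on line 544 of hotspot.py can fail when fewer than 10 simulations are found.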
Hi @MikeACG ,
Thanks for reaching out. We are currently working on this issue and will update you soon.
Best, Mousumy
Hi MikeACG,
Apologies for the delay! Currently, the clusters tool does not support custom bed files; however, as long as the simulated files have the proper "exome" suffix included, in theory this will work.
The newest error that you reported had to do with running the cluster tool with exome=True and correction=True together. We have updated the tool to fix this issue (v1.1.2). I suggest upgrading your package and rerunning your current analysis. I will close this issue, but please reopen it if you are still experiencing problems.
Best, Erik
Hey there, I was recently trying to run the program on some SigProfilerSimulator simulations that I created by providing a custom BED file in the exome parameter. When I ran SigProfilerClusters on those simulations, the program complained that there were no simulations for the project. After digging around in the code, I found that the main function also has an exome argument (default false) that is not described in the wiki documentation. From what I saw, this argument is only used to decide whether "_exome" should be appended to the path where the program expects to find the simulations from SigProfilerSimulator. Setting it to true in the call finally made the program work. I was wondering whether this was left out of the documentation by mistake, or whether SigProfilerClusters is currently not meant to be run on custom-range simulations. Another possibility is that the program is supposed to detect automatically that the simulation was custom-range, but it's not actually doing so at the moment.
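To make the behaviour I am describing concrete, here is a rough sketch of what I understand the flag to do. The function name expected_simulation_path and the example directory are invented for illustration; the real path layout inside SigProfilerClusters may well differ, and only the "_exome" suffix itself comes from what I saw in the code.

```python
def expected_simulation_path(base_path, exome=False):
    """Return the path where simulations would be looked up.

    Hypothetical illustration: append "_exome" only when exome is True.
    """
    suffix = "_exome" if exome else ""
    return base_path + suffix


# Assumed base path, purely for demonstration.
base = "simulations/BRCA_sigprofclust_96"

default_path = expected_simulation_path(base)            # exome=False
exome_path = expected_simulation_path(base, exome=True)  # exome=True
```

If the simulations were written with the suffix but the lookup uses the default, the two paths do not match, which would explain the "no simulations for the project" complaint.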