AlexandrovLab / SigProfilerSimulator

SigProfilerSimulator allows realistic simulations of mutational patterns and mutational signatures in cancer genomes. The tool can be used to simulate signatures of single point mutations, double point mutations, and insertions/deletions. It also makes use of SigProfilerMatrixGenerator and SigProfilerPlotting.
BSD 2-Clause "Simplified" License

Chromosome based error #6

Status: Closed (cutleraging closed this issue 1 month ago)

cutleraging commented 10 months ago

Hello,

I am running your tool using:

sigSim.SigProfilerSimulator(name, \
  vcf_dir, \
  "GRCh37", \
  contexts=["96", "ID"], \
  exome=None, \
  simulations=1000, \
  updating=False, \
  bed_file=bed, \
  overlap=False, \
  gender='female', \
  chrom_based=True, \
  seed_file=None, \
  noisePoisson=False, \
  cushion=100, \
  region=None, \
  vcf=True)

But I am getting the following error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/gs/gsfs0/users/rcutler/.conda/envs/SigProfilerSimulator/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/gs/gsfs0/users/rcutler/.conda/envs/SigProfilerSimulator/lib/python3.12/site-packages/SigProfilerSimulator/mutational_simulator.py", line 973, in simulator
    random_sample = random.sample(list(mutation_tracker[context]),1)[0]
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/gs/gsfs0/users/rcutler/.conda/envs/SigProfilerSimulator/lib/python3.12/random.py", line 430, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/gs/gsfs0/shared-lab/vijg-lab/2023-Ronnie/231009_multiple_ENU_analysis/SigProfilerSimulator/merged/runSigProfilerSimulator.py", line 10, in <module>
    sigSim.SigProfilerSimulator(name, \
  File "/gs/gsfs0/users/rcutler/.conda/envs/SigProfilerSimulator/lib/python3.12/site-packages/SigProfilerSimulator/SigProfilerSimulator.py", line 479, in SigProfilerSimulator
    r.get()
  File "/gs/gsfs0/users/rcutler/.conda/envs/SigProfilerSimulator/lib/python3.12/multiprocessing/pool.py", line 774, in get
    raise self._value
ValueError: Sample larger than population or is negative
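
For readers following the traceback: this ValueError is what Python's random.sample raises when it is asked for more items than the population contains, which includes asking for 1 item from an empty list. The sketch below reproduces the message with a made-up dictionary standing in for mutation_tracker (its keys and values are invented for illustration, not taken from the library), and shows one possible guard. The helper pick_position is hypothetical and is not part of SigProfilerSimulator.

```python
import random

# Hypothetical stand-in for the library's mutation_tracker; contents are invented.
mutation_tracker = {
    "A[C>A]A": [],          # a context with nothing left to draw from
    "A[C>T]G": [101, 202],  # a context with two candidate entries
}


def pick_position(tracker, context):
    """Return one random entry for `context`, or None if none are available.

    Hypothetical helper: it only illustrates guarding the call that fails
    in the traceback above.
    """
    candidates = list(tracker[context])
    if not candidates:
        return None
    return random.sample(candidates, 1)[0]


# Same call shape as the failing line, applied to the empty context.
try:
    random.sample(list(mutation_tracker["A[C>A]A"]), 1)
    message = ""
except ValueError as err:
    message = str(err)

print(message)
print(pick_position(mutation_tracker, "A[C>A]A"))
```

Whether returning None (skipping the draw) is the right behaviour inside the simulator is a separate question; the sketch only shows why an empty population produces this exact error.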

When I run the tool with chrom_based=False, I get results, so I think the error is specific to simulating mutations per chromosome. Some of my samples have very few mutations, so I suspect it is caused by some chromosomes having 0 mutations. Any help with this?
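
One way to check the zero-mutation hypothesis before rerunning is to count input records per chromosome for each sample. The following is a minimal sketch, assuming the input files are standard tab-delimited VCFs with the chromosome in the first column and header lines starting with "#". The function names, the expected chromosome list, and the example records are all illustrative assumptions, not part of SigProfilerSimulator.

```python
from collections import Counter


def count_by_chromosome(vcf_lines):
    """Count data records per chromosome from an iterable of VCF lines."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        chrom = line.split("\t", 1)[0]
        # Assumption: treat "chr1" and "1" as the same chromosome name.
        counts[chrom.removeprefix("chr")] += 1
    return counts


def missing_chromosomes(counts, expected):
    """Return the expected chromosomes that have no records at all."""
    return [chrom for chrom in expected if counts.get(chrom, 0) == 0]


# Assumption: a female GRCh37 sample is expected to cover 1-22 and X.
EXPECTED = [str(i) for i in range(1, 23)] + ["X"]

# Invented example records, only to exercise the functions.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT",
    "1\t12345\t.\tC\tT",
    "chr1\t67890\t.\tG\tA",
    "X\t555\t.\tT\tC",
]

counts = count_by_chromosome(example)
missing = missing_chromosomes(counts, EXPECTED)
print(counts)
print(missing)
```

For a real file, the same function can be called on an open file handle, for example `count_by_chromosome(open(path))`, where `path` points to one of the VCFs in the input directory.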

Thanks, Ronnie

mdbarnesUCSD commented 10 months ago

Hi Ronnie,

Could you please send over an input file to reproduce this issue?

Thanks!