popsim-consortium / stdpopsim

A library of standard population genetic models
GNU General Public License v3.0
122 stars 87 forks source link

Error with generic `StrAga` contig in msprime engine, due to average GC tract length #1405

Open nspope opened 1 year ago

nspope commented 1 year ago

With:

import stdpopsim
sp = stdpopsim.get_species("StrAga")
contig = sp.get_contig(length=100000)
engine = stdpopsim.get_engine("msprime")
model = stdpopsim.PiecewiseConstantSize(sp.population_size)
engine.simulate(model, contig, {"pop_0":2}) 

I get:

Traceback (most recent call last):
  File "/home/natep/projects/stdpopsim/ploidy-pop-size.py", line 6, in <module>
    engine.simulate(model, contig, {"pop_0":2})
  File "/home/natep/.miniconda3/envs/eucpru-sim/lib/python3.10/site-packages/stdpopsim/engines.py", line 286, in simulate
    ts = msprime.sim_ancestry(
  File "/home/natep/.miniconda3/envs/eucpru-sim/lib/python3.10/site-packages/msprime/ancestry.py", line 1207, in sim_ancestry
    sim = _parse_sim_ancestry(
  File "/home/natep/.miniconda3/envs/eucpru-sim/lib/python3.10/site-packages/msprime/ancestry.py", line 1010, in _parse_sim_ancestry
    return Simulator(
  File "/home/natep/.miniconda3/envs/eucpru-sim/lib/python3.10/site-packages/msprime/ancestry.py", line 1318, in __init__
    super().__init__(
msprime._msprime.InputError: Input error in set_gene_conversion_tract_length: Bad parameter value provided

It works fine with contig = sp.get_contig("1") -- so may have to do with how mean GC tract length is calculated from per-chromosome GC tract lengths? But, the snippet above works fine for generic EscCol contigs, which should involve very similar calculations ...

petrelharp commented 1 year ago

It's because the GC length for StrAga is 120000; so a contig length shorter than this throws an error. The error should be more informative, however.

nspope commented 1 year ago

Oh that makes sense. I'll leave this open as a reminder.