popsim-consortium / stdpopsim

A library of standard population genetic models
GNU General Public License v3.0

add optional species, genome, and demographic features to stdpopsim #848

Open fbaumdicker opened 3 years ago

fbaumdicker commented 3 years ago

Some aspects of species and genomes can be simulated in SLiM but are not captured in msprime. It would still be reasonable to simulate such species in both simulators.

A concrete example we have in mind is the circular genome of bacteria. In msprime we simply ignore circularity, so the two ends of the genome evolve as if they were not connected to each other. In SLiM this can be included.

It might be nice to have a way in stdpopsim to define such genome properties at the species level, and maybe also at the demographic model level. In msprime the optional parameter would be ignored, while in SLiM it would enable the corresponding flag or parameter of the simulation.

There might also be examples in the other direction, where a feature of msprime is not useful or not possible in SLiM.

jeromekelleher commented 3 years ago

I guess one thing we could do is attach some attributes to each of the chromosomes, which could cover things like sex chromosomes, circularity, etc. For example, we could have a flags value for each chromosome, with bitwise flags corresponding to the various properties a chromosome might have (say AUTOSOME, SEX_X, CIRCULAR, ...). This could be declared in the species.py file.
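A minimal sketch of what the bitwise-flags idea could look like in Python, using the standard library's `enum.Flag`. The flag names come from the comment above; the `ChromosomeFlags` class and the stand-in `Chromosome` dataclass are hypothetical and are not stdpopsim's actual API.

```python
import enum
from dataclasses import dataclass


class ChromosomeFlags(enum.Flag):
    """Hypothetical per-chromosome property flags."""
    AUTOSOME = enum.auto()
    SEX_X = enum.auto()
    CIRCULAR = enum.auto()


@dataclass
class Chromosome:
    """Stand-in for a chromosome record; not stdpopsim's real class."""
    id: str
    length: int
    flags: ChromosomeFlags = ChromosomeFlags.AUTOSOME


# Properties combine with bitwise OR and are tested with bitwise AND.
# The id and length below are made-up illustration values.
bacterial = Chromosome(
    "chr", 4_000_000, ChromosomeFlags.AUTOSOME | ChromosomeFlags.CIRCULAR
)
is_circular = bool(bacterial.flags & ChromosomeFlags.CIRCULAR)
is_sex_x = bool(bacterial.flags & ChromosomeFlags.SEX_X)
```

An engine that cannot model circularity could simply never test the `CIRCULAR` bit, which matches the "ignored in msprime" behaviour suggested earlier in the thread.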

Or have I been writing too much C, and we should add a bunch of boolean attributes to the Chromosome class?

petrelharp commented 3 years ago

> Or have I been writing too much C, and we should add a bunch of boolean attributes to the Chromosome class?

Or maybe an .attributes dict that could serve the same purpose as .flags but be more readable?

jeromekelleher commented 3 years ago

Yeah, if we could restrict the values in that dict to a fixed set of keys.
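One way the "dict with a fixed set of keys" idea could be sketched is a small validation step that rejects unknown keys. The key names and the `validate_attributes` helper are assumptions made for illustration; nothing here is taken from stdpopsim itself.

```python
# Hypothetical fixed set of permitted attribute keys.
ALLOWED_ATTRIBUTES = frozenset({"circular", "sex_chromosome"})


def validate_attributes(attributes):
    """Return a copy of `attributes`, raising if any key is not permitted."""
    unknown = set(attributes) - ALLOWED_ATTRIBUTES
    if unknown:
        raise ValueError(f"Unknown chromosome attributes: {sorted(unknown)}")
    return dict(attributes)


attrs = validate_attributes({"circular": True})
```

Compared with bitwise flags, this keeps the readability of named keys while still catching typos such as `"circlar"` when the species is defined.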

petrelharp commented 3 years ago

Another thing we'd like to add is e.g. "selfing rate", which would be a number.

petrelharp commented 3 years ago

As discussed with @jeanrjc, for bacteria we need:

petrelharp commented 3 years ago

As mentioned in #861, we should add ploidy.

petrelharp commented 3 years ago

Attributes of Species are attributes, and can be optional (at least, they can have defaults). So, I propose just adding more attributes, with defaults. Here's a proposal:

and for Chromosome:

The main question is what to do with things that aren't implemented - for instance, if someone tries to simulate a circular chromosome, do we... ignore this flag? Throw a warning? Or, maybe we wait to add options until the functionality in the engines is also added? Most of these will be very easy to add in SLiM - they are just initialization functions (and, I think we should structure these to resemble SLiM's interface closely...).
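To make the ignore / warn / fail options concrete, here is a rough sketch of how an engine could report features it does not implement. It is not the proposal referred to above. The support table, the feature name `"circular"`, and the `resolve_features` helper are all hypothetical, and the table only mirrors what this thread says about msprime and SLiM.

```python
import warnings

# Hypothetical support table: which optional features each engine can honour.
# The entries mirror this thread (circular genomes in SLiM, not in msprime).
ENGINE_SUPPORT = {
    "msprime": frozenset(),
    "slim": frozenset({"circular"}),
}


def resolve_features(engine, requested, on_unsupported="warn"):
    """Return the requested features that `engine` will actually apply.

    on_unsupported is one of "ignore", "warn", or "error".
    """
    if on_unsupported not in {"ignore", "warn", "error"}:
        raise ValueError(f"bad policy: {on_unsupported!r}")
    requested = set(requested)
    unsupported = requested - ENGINE_SUPPORT[engine]
    if unsupported:
        message = f"{engine} does not support: {sorted(unsupported)}"
        if on_unsupported == "error":
            raise NotImplementedError(message)
        if on_unsupported == "warn":
            warnings.warn(message, UserWarning)
    # Only the supported features are passed on to the engine.
    return requested - unsupported
```

With the default policy, `resolve_features("msprime", {"circular"})` emits a `UserWarning` and returns an empty set, while the same call with `"slim"` returns `{"circular"}` silently.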

petrelharp commented 3 years ago

I think we also really need a description field. For instance, see these notes.

apragsdale commented 3 years ago

> I think we also really need a description field. For instance, see these notes.

Agreed - having those as a description instead of comments would be helpful (and less liable to be accidentally deleted over time)

grahamgower commented 3 years ago

> • type (default="A" or "autosome", other options="X", "Y", "Z", "W", "mitochondria", "chloroplast"... ?)

I think we might need to break this down a bit more, to consider the functionality required for these various types, rather than just having their names. E.g., the discussion of sex chromosomes in #383 suggests X needs a parameter p (male/female ratio) in order to adjust Ne. Y, mitochondria and chloroplast might simply need a ploidy parameter.
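As an illustration of the kind of Ne adjustment meant here, the textbook effective-size formulas for unequal numbers of breeding males and females can be written as below. This is only a sketch: the function names are made up, and how the parameter p from #383 would map onto the two counts is an assumption left open.

```python
def autosomal_ne(n_males, n_females):
    """Textbook autosomal effective size with unequal sex numbers."""
    return 4 * n_males * n_females / (n_males + n_females)


def x_linked_ne(n_males, n_females):
    """Textbook X-linked effective size with unequal sex numbers."""
    return 9 * n_males * n_females / (4 * n_males + 2 * n_females)


# With equal numbers of each sex, the X-linked value is 3/4 of the autosomal one.
ratio = x_linked_ne(500, 500) / autosomal_ne(500, 500)
```

A chromosome "type" of X would then need access to the sex ratio, not just a label, which is the point about breaking types down by required functionality.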

Similarly, I think the reproduction_mode might need to be broken down by the required functionality.

jeromekelleher commented 3 years ago

So there are a lot of things being proposed at once here, ranging from the fairly simple and uncontroversial (Species.ploidy) to really quite hard and complicated (chromosome type). How about we break this into a separate issue for each, so that we can address them independently?

fbaumdicker commented 3 years ago

Yes, it makes sense to discuss the more complicated stuff separately. But I think adding a description field might be a simple first solution for many of these issues.