tskit-dev / pyslim

Tools for dealing with tree sequences coming to and from SLiM.
MIT License

Documentation: confusion between pedigree and genetic ancestry? #126

Open hyanwong opened 3 years ago

hyanwong commented 3 years ago

In the intro part of the documentation, lots of the explanation about removing individuals discusses the genealogy. But I usually take the word genealogy to mean the pedigree rather than the genetic ancestry. E.g. if the example were to be simulated in SLiM with a 0 recombination rate, wouldn't a lot more of the nodes be removed? For this reason, I find it a bit odd to use the term "genealogical", and the diagram is not entirely clear on the distinction.
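To make the distinction concrete, here is a minimal sketch in plain Python. It does not use SLiM, pyslim, or tskit, and every function name in it is hypothetical. It assumes a toy diploid population of constant size with random mating and a single non-recombining genome, then counts how many individuals are pedigree ancestors of two samples versus how many actually passed genetic material on to them.

```python
import random


def simulate_pedigree(n_individuals, n_generations, rng):
    """Build a random diploid pedigree, going backwards in time.

    pedigree[g][i] holds two (parent, parent_copy) pairs for individual i in
    generation g: entry 0 is the genome copy inherited from the first parent,
    entry 1 the copy inherited from the second. Parents live in generation g + 1.
    """
    pedigree = []
    for _ in range(n_generations):
        generation = []
        for _ in range(n_individuals):
            first, second = rng.sample(range(n_individuals), 2)
            # With no recombination, each parent passes on exactly one of
            # its two genome copies, whole.
            generation.append(
                ((first, rng.randrange(2)), (second, rng.randrange(2)))
            )
        pedigree.append(generation)
    return pedigree


def pedigree_ancestors(pedigree, samples):
    """Sets of individuals that are pedigree ancestors, one set per generation back."""
    current = set(samples)
    result = []
    for generation in pedigree:
        # Every parent of every current ancestor is a pedigree ancestor.
        current = {parent for i in current for (parent, _copy) in generation[i]}
        result.append(current)
    return result


def genetic_ancestors(pedigree, samples):
    """Sets of individuals carrying a genome copy ancestral to the samples."""
    # Track (individual, copy) lineages, starting from both copies of each sample.
    lineages = {(i, copy) for i in samples for copy in (0, 1)}
    result = []
    for generation in pedigree:
        # Each lineage follows the single parental copy it was inherited from.
        lineages = {generation[i][copy] for (i, copy) in lineages}
        result.append({individual for (individual, _copy) in lineages})
    return result


rng = random.Random(1)
pedigree = simulate_pedigree(n_individuals=50, n_generations=10, rng=rng)
samples = [0, 1]
ped_sets = pedigree_ancestors(pedigree, samples)
gen_sets = genetic_ancestors(pedigree, samples)
for g, (p, q) in enumerate(zip(ped_sets, gen_sets), start=1):
    print(f"{g} generations back: {len(p)} pedigree ancestors, {len(q)} genetic ancestors")
```

In this toy model the number of genetic ancestors can never exceed the number of sampled genome copies (four here), while the set of pedigree ancestors can grow until it covers the whole population. That gap is the one I mean: with zero recombination, most pedigree ancestors contribute no genetic material to the samples.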

hyanwong commented 3 years ago

For example, instead of

SLiM can read and write tree sequences, which store genealogical data of entire populations. These can be used to efficiently store both the state of the population at various points during a simulation as well as its genealogical history. Furthermore, SLiM can “load” a saved tree sequence file to recreate the exact state of the population at the time it was saved. To do this, SLiM has added several additional types of information to the basic tree sequence file.

could we say

SLiM can read and write tree sequences, which store the genetic ancestry of entire populations. These can be used to efficiently store both the state of the population at various points during a simulation as well as its full genetic history. Furthermore, SLiM can “load” a saved tree sequence file to recreate the exact state of the population at the time it was saved. To do this, SLiM has added several additional types of information to the basic tree sequence file.

petrelharp commented 3 years ago

Good point, but saying "genetic history" doesn't necessarily connote that we're recording actual genealogical relationships (although not necessarily all of them, as you say). For instance, a new arrival might read "genetic history" to mean that, like, we're recording which of some ancestral populations everyone is descended from. That's why I've gone with "genealogical", as I think it's more precise as to what we're actually doing. (And, as written, it says it stores "genealogical data", but does not say it stores all the genealogical data.)

I'm sure I've not said things in the best possible way, so suggestions welcome! But this one isn't doing it for me...

hyanwong commented 3 years ago

It's all tricky stuff! I guess we could also use "inheritance" or something. Or genetic (or "gene") genealogy? Perhaps we should put it in a Google Doc so that we can both edit it?