gtsambos / 2022-ts-workshops

Notebooks for tree sequence workshops in 2022.

Phrasing in first notebook, re alleles #1

Open hyanwong opened 2 years ago

hyanwong commented 2 years ago

You say:

"The tree consists of nodes, which represent the alleles held by different chromosomes in the history of the sample, and edges, which represent genealogical relationships between the alleles."

But I think this could be read as saying that the nodes represent alleles (which they don't, of course). We normally say that the nodes represent genomes, I think? The genomes can carry mutations, which result in alleles being observed.

hyanwong commented 2 years ago

The same goes for other mentions of "alleles" in that notebook, e.g. "Suppose we have a sample of 4 alleles": I think you mean genomes here. The current phrasing could be read as meaning 4 alleles at a single site.

gtsambos commented 2 years ago

Thanks @hyanwong! I'm trying to introduce them first to the idea of a tree at a single point in the genome. Would 'nucleotide' be a better choice? (I know the sites can be other things, but I don't want to introduce that level of abstraction just yet.) After that I basically say "this single tree might also represent the genealogy at some neighbouring sites, but at some point you'll move along the genome and see a different tree, and that's why it is a tree sequence".

hyanwong commented 2 years ago

Hmm, well an allele is something different, right? Two sample nodes can contain the same allele at a locus. The node is not an allele itself.
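To make the distinction concrete, here is a minimal sketch in plain Python. It is a toy model, not the tskit API, and the tree shape, node IDs and single mutation are all made up for illustration. Each node stands for a genome, and the allele a sample carries is worked out from the mutations above it in the tree:

```python
# Toy model (NOT the tskit API): nodes are genomes; alleles are derived
# from mutations placed above nodes. All IDs and states are hypothetical.

# child -> parent mapping for one tree with 4 sample genomes (nodes 0-3).
parent = {0: 4, 1: 4, 2: 5, 3: 5, 4: 6, 5: 6, 6: None}
samples = [0, 1, 2, 3]

# A single site: ancestral state "A", with one mutation to "T" above node 4.
ancestral_state = "A"
mutations = {4: "T"}  # node -> derived state


def allele_of(node):
    """Walk towards the root; the first mutation met determines the allele."""
    while node is not None:
        if node in mutations:
            return mutations[node]
        node = parent[node]
    return ancestral_state


alleles = {u: allele_of(u) for u in samples}
print(alleles)  # {0: 'T', 1: 'T', 2: 'A', 3: 'A'}
print(len(samples), "sample genomes,", len(set(alleles.values())), "distinct alleles")
```

Here there are 4 sample genomes but only 2 alleles at the site, and samples 0 and 1 share the same one, which is why "a sample of 4 alleles" is ambiguous.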

Could you just say something like "the nodes represent genomes or regions of genomes; here we've picked a short region, and so the relationship is just a simple tree"?

gtsambos commented 2 years ago

Yes, I guess I could say "a region spanned by a single nucleotide base"

hyanwong commented 2 years ago

Or "a short region of genome, perhaps just a single base"?