tskit-dev / tutorials

A set of tutorials for msprime and tskit.
Creative Commons Attribution 4.0 International

Tutorial on edge_diffs / incremental algorithms #233

Open hyanwong opened 1 year ago

hyanwong commented 1 year ago

There are a lot of idioms, and some complicated code, involved in implementing incremental algorithms, such as those that use the edge_diffs() iterator. For example, it's a common idiom to iterate jointly over trees and edge diffs:

for tree, diffs in zip(ts.trees(), ts.edge_diffs()):
    ...

When going through the edge diffs, it is also often useful to keep track of which nodes come into and go out of the tree (and hence maintain a list of "active" nodes). For example, I think this is what I need for https://github.com/tskit-dev/tskit/discussions/2718. I've done this sort of thing before, but have forgotten the code that I used to do it. A tutorial might be a good place to put example code for people to modify. I assume it would be linked to from the "Fundamental operations" tutorial mentioned in #203.
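
As a possible starting point, the node bookkeeping could be sketched roughly like this. It is a minimal sketch, not code taken from tskit or tsdate: `node_diffs` is a hypothetical helper name, and it assumes a node counts as "active" when at least one edge in the current tree touches it (so isolated sample nodes are never reported). It relies only on `ts.edge_diffs()` yielding `(interval, edges_out, edges_in)` and on each edge having `parent` and `child` attributes.

```python
from collections import Counter


def node_diffs(ts):
    """
    Hypothetical helper: for each tree, yield (interval, entered, exited),
    where `entered` and `exited` are the sets of node IDs that became
    active / inactive relative to the previous tree.

    Assumption: a node is "active" if at least one edge in the current
    tree has it as a parent or a child.
    """
    # node ID -> number of edges touching that node in the current tree
    degree = Counter()
    for interval, edges_out, edges_in in ts.edge_diffs():
        # For every node touched by this diff, remember whether it was
        # active *before* the diff, so that a node which loses one edge
        # and gains another is not reported as having left and re-entered.
        was_active = {}
        for sign, edges in ((-1, edges_out), (+1, edges_in)):
            for edge in edges:
                for u in (edge.parent, edge.child):
                    was_active.setdefault(u, degree[u] > 0)
                    degree[u] += sign
        entered = {u for u, was in was_active.items() if not was and degree[u] > 0}
        exited = {u for u, was in was_active.items() if was and degree[u] == 0}
        for u in exited:
            del degree[u]  # keep only currently active nodes in the counter
        yield interval, entered, exited
```

Since this yields one item per tree, it can be zipped with `ts.trees()` in the same way as the idiom above. A running set of active nodes can then be maintained with `active |= entered` followed by `active -= exited`.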

hyanwong commented 1 year ago

Ah yes, I remember: this sort of tallying of nodes when passing over edge_diffs (in that case, counting the number of samples under them) was what I did when implementing the spans_by_samples function in tsdate.
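
For illustration, counting the samples under each node while passing over the edge diffs might look something like the sketch below. This is not the tsdate `spans_by_samples` code: `sample_counts_per_tree` is a hypothetical name, and the approach shown is the usual pattern of keeping a parent array and pushing each change up to the root. It assumes `ts.num_nodes`, `ts.samples()` and `ts.edge_diffs()` behave as in tskit.

```python
def sample_counts_per_tree(ts):
    """
    Hypothetical helper: for each tree, yield (interval, count), where
    count[u] is the number of samples at or below node u in that tree.

    The same `count` list is updated in place and yielded each time, so
    callers should copy it if they need to keep it beyond one iteration.
    """
    NULL = -1
    parent = [NULL] * ts.num_nodes
    count = [0] * ts.num_nodes
    for u in ts.samples():
        count[u] = 1  # every sample counts itself

    for interval, edges_out, edges_in in ts.edge_diffs():
        for edge in edges_out:
            # Detach the child, then remove its samples from all ancestors
            parent[edge.child] = NULL
            u = edge.parent
            while u != NULL:
                count[u] -= count[edge.child]
                u = parent[u]
        for edge in edges_in:
            # Attach the child, then add its samples to all ancestors
            parent[edge.child] = edge.parent
            u = edge.parent
            while u != NULL:
                count[u] += count[edge.child]
                u = parent[u]
        yield interval, count
```

Each edge change costs time proportional to the height of the tree rather than its size, which is the point of doing this incrementally instead of recounting from scratch for every tree.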