TheEvergreenStateCollege / smarty-plants

Plant genome sequencing

Select optimal portion of suffix tree to offload to database #33

Open learner-long-life opened 2 months ago

learner-long-life commented 2 months ago

When writing part of a suffix tree to a database, how can we select the optimal subtree to offload, such that the future deferred operations on it are balanced with the operations performed on the other parts of the suffix tree (the parent tree and sibling subtrees)?

Grapevine Hypothesis

This assumes that the "Grapevine Hypothesis" is true.

If we prune a sub-vine off the parent vine and continue growing it separately, then when we graft it back onto the parent vine, the entire combined vine will look exactly as if the sub-vine had never been pruned. This holds if the sub-vine doesn't need any information from "higher up" the vine (its parent or siblings) to decide how to grow.
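
To make the hypothesis concrete, here is a minimal sketch in Python that checks it on a toy case. It is not this repository's implementation: it assumes an uncompressed suffix trie stored as nested dicts and naive suffix-by-suffix insertion, and the names `insert`, `build`, and `build_with_offload` are hypothetical. Under those assumptions, a suffix only ever touches the subtree under its first character, so a pruned subtree can keep growing without looking at its parent or siblings. Whether the same holds for constructions that rely on suffix links, which can point across subtrees, is not shown here.

```python
def insert(node, s):
    """Insert string s below node in a nested-dict trie."""
    for ch in s:
        node = node.setdefault(ch, {})


def build(text):
    """Reference build: insert every suffix into one in-memory trie."""
    root = {}
    for i in range(len(text)):
        insert(root, text[i:])
    return root


def build_with_offload(text, branch, split):
    """Insert the first `split` suffixes, prune the subtree under `branch`,
    keep growing both pieces separately, then graft the subtree back."""
    root = {}
    for i in range(split):
        insert(root, text[i:])

    # Prune: detach the sub-vine (None if that branch does not exist yet).
    detached = root.pop(branch, None)

    for i in range(split, len(text)):
        suffix = text[i:]
        if suffix[0] == branch:
            # Deferred operation: only the detached subtree is touched.
            if detached is None:
                detached = {}
            insert(detached, suffix[1:])
        else:
            # All other suffixes go to the parent tree and its siblings.
            insert(root, suffix)

    # Graft: reattach the sub-vine where it was cut.
    if detached is not None:
        root[branch] = detached
    return root


# The grafted tree should equal the tree that was never pruned.
text = "banana$"
assert build_with_offload(text, "a", 2) == build(text)
```

In this sketch the detached subtree stands in for the part written to the database. The equality check holds for any choice of `branch` and `split` here, because routing by first character means no insertion needs information from outside its own subtree.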

learner-long-life commented 2 months ago

@AbyssalRemark feedback is welcome on this description of the problem, and on a general scheme for dividing up a suffix tree so that it can be updated in pieces that fit in memory while remaining eventually consistent and correct.