TheEvergreenStateCollege / bioinformatics

Plant genome sequencing
2 stars 2 forks source link

Continue explaining Ukkonen's algorithm #34

Closed learner-long-life closed 3 months ago

learner-long-life commented 3 months ago

This is a problem to help us as a community understand suffix trees for genomic analysis, in particular, those constructed with Ukkonen's algorithm.

Pre-requisites

  1. Read through and understand this stackoverflow explanation of Ukkonen's algorithm to construct a suffix tree.

https://stackoverflow.com/a/9513423

  1. Use of the Ukkonen visualizer, and taking screenshots at various stages

https://brenden.github.io/ukkonen-animation/

The Task

Use your understanding to extend (write an appendix to) the answer above, but this time for adding a new character e, in Markdown, as a response to this issue.

Paste screenshots of the Ukkonen visualizer, at appropriate points in your explanation to make your points.

That is, your final suffix tree should be for the input string abcabxabcde.

Explain where appropriate when operations can be done in constant time because of the auxiliary data kept around the tree (active length, active edge, remainder, etc.)

learner-long-life commented 3 months ago

Done, this has been added to the wiki page

https://github.com/TheEvergreenStateCollege/smarty-plants/wiki/Suffix-Tree-Using-Ukkonen's-Algorithm

Thanks @Lex-Med