As discussed, this is the separate PR for the AVL tree implementation/suggestion. I've made some changes to the BST implementation and added the AVL tree, which "sits" on top of the BST.
Since the changes to the BST affect its general structure, I'd love to get some feedback.
The following gives a high-level description of the changes. I think the details are best discussed at the specific code lines...
As mentioned, please regard the current implementation as a suggestion/MVP and a basis for further improvements/extensions.
General
As already hinted at in the intro, the AVL tree builds on the existing BST (which in my opinion makes sense, since many operations can be reused).
For the AVL tree to work some changes were required in the BST:
Struct for BST nodes
Comparable protocol for complex node values
Hooks for the new, insert and delete operations
BST Node
A BST node consists of:
value: Now accepts any kind of value (structs, ints...)
left: Left child
right: Right child
augmentation: Allows the BST to be used as the base for other kinds of tree structures (like AVL or Red-Black trees ;)). The idea is to use this field for storing the values the tree structure itself requires (e.g. the AVL tree stores the balance factor and the height as a separate struct in that field). The field is set during the various tree operations (insert, delete...). To keep it focused, only tree structure values should be stored there. Augmentations that belong to the stored value itself belong in the struct stored in the value field.
The idea behind the augmentation field is related to the general procedure for augmenting data structures (as described, for example, in CLRS, 3rd edition, Chapter 14).
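To make the node layout concrete, here is a minimal sketch of such a struct. The module name `SketchNode` and the `new/2` helper are made up for illustration and are not the names used in this PR; only the four fields come from the description above.

```elixir
defmodule SketchNode do
  # Hypothetical node struct with the four fields listed above.
  defstruct value: nil, left: nil, right: nil, augmentation: nil

  # Convenience constructor for a childless node (illustrative helper).
  def new(value, augmentation \\ nil) do
    %__MODULE__{value: value, augmentation: augmentation}
  end
end

# The value can be any term: an integer, a map, another struct...
int_node = SketchNode.new(42)
map_node = SketchNode.new(%{name: "alice", age: 30})

# Tree-structure data (here a made-up height/bf pair) goes into
# :augmentation, not into the stored value.
avl_leaf = SketchNode.new(7, %{height: 0, bf: 0})
```

Keeping the structural data in its own field means the stored value never has to know which kind of tree it lives in.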
Comparable protocol
For the BST to work with any kind of data structure as value (structs as well as "simple" types), a BST Comparable protocol was introduced that the value types stored in the BST are required to implement (as described in this Google discussion, there doesn't seem to be a "generic" comparable protocol that could be implemented instead).
The BST already contains a fallback implementation for the Any type (falling back to Kernel).
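As a rough sketch of how such a protocol with an Any fallback can look: the names `SketchComparable`, `less/2`, `greater/2` and `SketchPerson` are assumptions for illustration and may not match the actual code in this PR.

```elixir
defprotocol SketchComparable do
  # Hypothetical protocol; values without their own implementation
  # are dispatched to the Any implementation below.
  @fallback_to_any true

  @doc "Returns true if a sorts before b."
  def less(a, b)

  @doc "Returns true if a sorts after b."
  def greater(a, b)
end

defimpl SketchComparable, for: Any do
  # Fallback: rely on Kernel's comparison operators.
  def less(a, b), do: a < b
  def greater(a, b), do: a > b
end

defmodule SketchPerson do
  # Made-up value type that should be ordered by age only.
  defstruct name: "", age: 0

  def new(name, age), do: %__MODULE__{name: name, age: age}
end

defimpl SketchComparable, for: SketchPerson do
  # Custom ordering: compare the age field, ignore the name.
  def less(a, b), do: a.age < b.age
  def greater(a, b), do: a.age > b.age
end

young = SketchPerson.new("Zoe", 20)
old = SketchPerson.new("Adam", 30)

simple_result = SketchComparable.less(1, 2)
struct_result = SketchComparable.less(young, old)
```

Without the dedicated implementation, the two structs would be compared by Kernel's structural term ordering, which is rarely the order you want for domain values.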
Hook methods
To avoid traversing the complete tree a second time just to add the AVL augmentations (balance factor and height), some kind of hook was required to perform these calculations directly during the BST new, insert and delete operations.
These hooks are called processors, and the BST currently supports a pre- and a postprocessor.
The preprocessor is executed on the BST node before the tree operation (new does not support it since the node does not exist yet).
The postprocessor is executed on the BST node after the tree operation (and is supported by all operations that change the tree structure).
If no pre- or postprocessor is provided, a standard identity processor is executed instead.
I considered whether to pass the hooks as a dedicated module or as a keyword list of functions. I went with the latter since it seems to be the more idiomatic Elixir approach (but as I'm fairly new to Elixir I could be mistaken about that).
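The processor idea can be sketched roughly as follows. This is a simplified stand-in, not the code from this PR: the module names `HookNode` and `HookBST`, the `:pre`/`:post` keys and the plain `<`/`>` comparison are all assumptions made for the example.

```elixir
defmodule HookNode do
  # Hypothetical node struct for this example.
  defstruct value: nil, left: nil, right: nil, augmentation: nil
end

defmodule HookBST do
  # Identity processor, used when no hook is supplied.
  def identity(node), do: node

  # new/2 only runs the postprocessor: there is no node to preprocess yet.
  def new(value, processors \\ []) do
    post = Keyword.get(processors, :post, &identity/1)
    post.(%HookNode{value: value})
  end

  def insert(tree, value, processors \\ [])

  def insert(nil, value, processors), do: new(value, processors)

  def insert(node, value, processors) do
    pre = Keyword.get(processors, :pre, &identity/1)
    post = Keyword.get(processors, :post, &identity/1)
    node = pre.(node)

    updated =
      cond do
        value < node.value -> %{node | left: insert(node.left, value, processors)}
        value > node.value -> %{node | right: insert(node.right, value, processors)}
        true -> node
      end

    # Runs bottom-up for every node on the path back to the root.
    post.(updated)
  end
end

# Example postprocessor: cache the node height in :augmentation (leaf = 0).
height_of = fn
  nil -> -1
  node -> node.augmentation
end

set_height = fn node ->
  %{node | augmentation: 1 + max(height_of.(node.left), height_of.(node.right))}
end

tree =
  Enum.reduce([2, 1, 3], nil, fn value, acc ->
    HookBST.insert(acc, value, post: set_height)
  end)
```

Because the postprocessor runs as the recursion unwinds, every node on the insertion path is updated in the same pass and no second traversal is needed.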
AVL Tree
The AVL tree implementation leverages the described BST extensions for performing the required operations.
Augmentation
It introduces an augmentation struct:
height: Stores the height of the node in the tree, starting at zero (for leaf nodes)
bf: Balance factor of the node; its absolute value must be <= 1 for the AVL tree to be balanced. It is calculated as left.height - right.height (the order of the operands makes no difference, as long as it is used consistently)
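A small sketch of how these two values can be derived from the children. The module name `SketchAug` and its functions are hypothetical, and nodes are represented as plain maps here to keep the example short; treating an empty subtree as height -1 is an assumption that makes a leaf come out at height 0.

```elixir
defmodule SketchAug do
  # Hypothetical augmentation struct with the two fields described above.
  defstruct height: 0, bf: 0

  # An empty subtree counts as height -1 so that a leaf gets height 0.
  def height(nil), do: -1
  def height(%{augmentation: %__MODULE__{height: h}}), do: h

  # Recompute a node's augmentation from its (already augmented) children.
  def augment(node) do
    lh = height(node.left)
    rh = height(node.right)
    %{node | augmentation: %__MODULE__{height: 1 + max(lh, rh), bf: lh - rh}}
  end

  # A node is balanced if the subtree heights differ by at most one.
  def balanced?(%{augmentation: %__MODULE__{bf: bf}}), do: abs(bf) <= 1
end

blank = fn value, left, right ->
  %{value: value, left: left, right: right, augmentation: nil}
end

# A left-leaning chain 3 -> 2 -> 1, augmented bottom-up.
leaf = SketchAug.augment(blank.(1, nil, nil))
parent = SketchAug.augment(blank.(2, leaf, nil))
grandparent = SketchAug.augment(blank.(3, parent, nil))
```

In this chain the leaf is balanced, the parent has a balance factor of 1, and the grandparent has a balance factor of 2, so it would need a rotation.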
Postprocessor
A postprocessor is added which performs the required rotations as well as the node augmentations.
augmentation: Calculates the augmentation values
rotation: Performs the required rotations depending on whether the tree is left, right, left-right or right-left heavy
After every rotation the augmentation values are "updated".
The summarized steps are:
Perform the standard BST operation (like insert or delete)
For all nodes on the path to the node (implicitly done by the BST operation)
Check if the node is balanced
Rotate if required
Augment
Although in the case of insert it would be sufficient to just check the parent of the newly inserted node, the immutable nature of Elixir/Erlang data requires all nodes on the path from the root to the affected node to be traversed.
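The steps above can be sketched as one self-contained module. This is an illustration under simplifying assumptions, not the code from this PR: nodes are plain maps, the height is cached directly on the node instead of in an augmentation struct, values are compared with `<`/`>`, and all names (`SketchAVL`, `post/1`, ...) are made up.

```elixir
defmodule SketchAVL do
  def leaf(value), do: %{value: value, left: nil, right: nil, height: 0}

  def height(nil), do: -1
  def height(node), do: node.height

  # Balance factor: left height minus right height.
  def bf(node), do: height(node.left) - height(node.right)

  defp augment(node) do
    %{node | height: 1 + max(height(node.left), height(node.right))}
  end

  # The left child becomes the new root of this subtree.
  defp rotate_right(node) do
    pivot = node.left
    augment(%{pivot | right: augment(%{node | left: pivot.right})})
  end

  # The right child becomes the new root of this subtree.
  defp rotate_left(node) do
    pivot = node.right
    augment(%{pivot | left: augment(%{node | right: pivot.left})})
  end

  # Postprocessor: augment first, then rotate if the node is out of balance.
  def post(node) do
    node = augment(node)

    cond do
      # left-right heavy: rotate the left child left, then the node right
      bf(node) > 1 and bf(node.left) < 0 ->
        rotate_right(%{node | left: rotate_left(node.left)})

      # left heavy
      bf(node) > 1 ->
        rotate_right(node)

      # right-left heavy: rotate the right child right, then the node left
      bf(node) < -1 and bf(node.right) > 0 ->
        rotate_left(%{node | right: rotate_right(node.right)})

      # right heavy
      bf(node) < -1 ->
        rotate_left(node)

      true ->
        node
    end
  end

  # Standard BST insert; post/1 runs for every node on the way back up.
  def insert(nil, value), do: leaf(value)

  def insert(node, value) do
    cond do
      value < node.value -> post(%{node | left: insert(node.left, value)})
      value > node.value -> post(%{node | right: insert(node.right, value)})
      true -> node
    end
  end
end

build = fn values -> Enum.reduce(values, nil, &SketchAVL.insert(&2, &1)) end

right_heavy = build.([1, 2, 3])
left_right = build.([3, 1, 2])
```

Both example sequences would degenerate into a chain in a plain BST; here each ends up with 2 at the root and height 1.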
Tests
Tests for the BST updates as well as the AVL tree operations have been added.
The tests dedicated to the AVL tree should cover the different rotation cases (please let me know in case I've missed any).
Other
I had some trouble getting dialyzer to pass. It seems to have issues with structs as well as protocols. Therefore a dialyzer.ignore-warnings file has been added to get a green build again ;). I'd love to hear if someone knows how to avoid these warnings.
I've found an Elixir talk about that issue, and it looks like this is some kind of bug. However, the talk is from 2014, so things may have changed since then.
Also worth mentioning: I do not get the issue on my local machine, it only appears during the Travis build.
The displayed credo issue is just a warning regarding the @spec definitions for the greater and less implementations of the Comparable protocol.
In case I have missed anything else please let me know ;)