Open czgdp1807 opened 4 years ago
Hi, can I start working on this issue as a part of RGSoC20?
Hi, @vibhu18116 Please start by designing an API for m-ary trees. Please let us know if you face any problem/queries.
Thank you for allowing me to work on this.
I believe the API design would closely follow that of binary trees. Hence these are a few APIs I think should form a basis:
__new__(cls, m_value = 2, array_flag = 0, key=None, root_data=None, comp=None)
This would initialise a new m-ary tree. m_value will be used to determine the maximum value of m (the number of children of a node), default to 2 (binary tree in that case). array_flag can be set to indicate that an array implementation is required and it can remain 0 for pointer implementation.
Rest of the parameters will be same as binary tree implementation.
insert(self, key, data)
delete(self, key, **kwargs)
search(self, key, **kwargs)
__str__(self)
The above four will behave in the same way as in the binary tree. Implementation of the __str__
function will vary depending upon if the tree is implemented through array or pointers.
height(self)
Will return the height of the tree.
width(self)
Will return the width of the tree.
convert_to_binary(self, array_flag = 0)
This will convert an m-ary tree to binary tree and return the pointer to root if array_flag is not set and the array if array_flag is set.
Kindly check if these APIs work.
__new__(cls, m_value = 2, array_flag = 0, key=None, root_data=None, comp=None)
IMO, using only array based implementation will be better here as pointers are somewhat unsafe, so avoid them where ever possible. Instead of m_value
, a better name would be max_children
. This suggests a nice way to implement using arrays.
Can you please explain the use of the values returned by height
and width
methods?
convert_to_binary(self, array_flag = 0)
IMO, just to_binary_tree(self)
would better.
You may refer, https://github.com/codezonediitj/pydatastructs/blob/master/pydatastructs/trees/binary_trees.py while implementing the above APIs.
Note that m-ary tree is not a search tree, so we don't need to worry about the manner in which nodes are stored. We need a creative way to avoid the exponential space complexity of array based implementation. May be a new type of node inheriting TreeNode
can be made which will be having an array storing the indices of it's children node in the array of m-ary trees. Please let me know if you are unable to get my point.
I thought height
and width
of a tree can provide additional functionality to the user, however other than from feature point of view, I don't find any suitable use case.
So, if you suggest I would drop them.
I would update the names of functions as suggested.
You may refer, https://github.com/codezonediitj/pydatastructs/blob/master/pydatastructs/trees/binary_trees.py while implementing the above APIs.
I checked this implementation, but a lot of functions like insert, delete are not implemented, probably because they depend on comparision key. Since the insert, delete API would follow the same pattern, should I first implement them?
Actually, binary trees are quite abstract, hence their insert and delete operations are left unimplemented. What logic should be followed while making insertions to a binary tree? Both the operations are well defined for binary search trees. IMO, we can keep insert, delete operations undefined for m-ary trees. We can a complementary m-ary search tree inheriting m-ary tree with insert and delete well defined for them. Here is defined for these kind of trees. You can do the work in two separate PRs for m-ary and m-ary search trees, one after the previous one.
I get your point. I will try to come up with actual implementation and try to create a PR in a day or two for m-ary tree.
Can I keep to_binary_tree
method abstract in m-ary tree, or a concrete implementation is required?
If a concrete implementation is required, then should I modify the m-ary tree itself or create a new binary tree with nodes of type TreeNode?
Currently, I have made a new node type extending TreeNode as suggested earlier which form nodes of m-ary tree.
Can I keep
to_binary_tree
method abstract in m-ary tree, or a concrete implementation is required?
We can keep it as abstraction in m-ary tree but a concrete implementation can be added in m-ary search tree.
If a concrete implementation is required, then should I modify the m-ary tree itself or create a new binary tree with nodes of type TreeNode?
IMO, a new tree should returned.
I created a PR. However, the Travis checks failed due to some trailing whitespaces. Can you please guide how do I ensure there are no trailing white spaces?
Also can you review the code once to see if it is the desired implementation?
It's time to make a PR for MArySearchTree
and the relevant tests.
Okay. I will start work on it soon :)
And I hope you are working on your RGSoC application. The deadline is 30th March according to the latest information I have. See, https://railsgirlssummerofcode.org/students/application/#apply
Yes. I will start working on application as well.
It's time to make a PR for
MArySearchTree
and the relevant tests.
I was looking through MArySearchTree on the link you referenced earlier.
However, I don't see a determinate way of insertion.
Say, if the nodes can have 4 children, and root has currently 3 only as a data item, then on trying to insert 5, it can become a part of the root, or go in a child node to the right of 3.
Can you clarify this point once?
I see, the current implementation in m_ary_trees.py
doesn't align with the description given here. It should contain a group of keys but it just contains a single key. A minor update is needed on it.
I was looking through MArySearchTree on the link you referenced earlier.
However, I don't see a determinate way of insertion.
Some research is needed.
I see, the current implementation in
m_ary_trees.py
doesn't align with the description given here. It should contain a group of keys but it just contains a single key. A minor update is needed on it.
I think adding a method to insert a value in the node in MAryTreeNode
should work. The value will be added if currently the number of values in the node are less than (max_children - 1), else, error can be thrown.
Does this work?
I think adding a method to insert a value in the node in
MAryTreeNode
should work.
How it will help in having multiple keys in the node?
So, the node can store the (data, key) as a tuple in a dynamically sized array which can grow up to (m-1) size where m is the maximum number of children a node can have.
The information about the children nodes is already stored in the children array.
Here's what I have thought of,
MAryTreeNode
SinglyLinkedList
with key, data in LinkedListNode
. The framework for linked lists is in linear_data_structures
. We can keep array as well but for consistency, tuple
might not be a good idea instead, I'd suggest to factor out the key
and data
part to the abstract node class as almost every node has these two members and then just use the abstract Node class as common type for array of keys.key_list[i]
and key_list[i-1]
.The children
array is already there so only 1
needs to be added.
This works. But the array could have been subjected to binary search to search a particular element in that node. However, this won't be possible in the linked list implementation. So, what is your opinion about this?
But the array could have been subjected to binary search to search a particular element in that node.
Well, the maximum size of the key_list won't be very large so, O(log(m)) or O(m) will not make much of a difference. Over optimisation can be avoided here. If m
is large then MAryTree
will not be of any practical use.
If
m
is large thenMAryTree
will not be of any practical use.
Though if you are aware of any use case where m
is very large then we can think of using arrays.
I didn't find any specifics about the node size in different applications. So, if you say, I can proceed with a linked list.
Sure, feel free to proceed.
@czgdp1807 I hope this tree has already been implemented why not close this one!
Description of the problem
Adding m-ary trees will be a nice idea. May be a array based(allows printing the trees) or pointer(better space utilisation) or both would be better.
Example of the problem
References/Other comments
.. [1] https://en.wikipedia.org/wiki/M-ary_tree