caesar0301 / treelib

An efficient implementation of tree data structure in python 2/3.
http://treelib.readthedocs.io/en/latest/
Other
801 stars 185 forks source link

How to order by data? #65

Open daphnawegner opened 7 years ago

daphnawegner commented 7 years ago

Is there a good way to order the dictionary by a property found in the data properties?

I am trying to order the children of each node according to the volume field in the data dict

 "Root": {
            "data": {
                "volume": 88829.0,
                "display_name": "Root",
            },
            "children": [
                {
                    "softw": {
                        "data": {
                            "volume": 82229.0,
                            "display_name": "software",
                        },
                        "children": [
                            {
                                "best": {
                                    "data": {
                                        "volume": 139.0,
                                        "display_name": "best",
                                    },
                                    "children": [
                                        {
                                            "songwrit_139": {
                                                "data": {
                                                    "volume": 139.0,
                                                    "display_name": "songwrite", 
                                                }
                                            }
                                        }
                                    ]
                                }
                            },
daphnawegner commented 7 years ago

I have tried doing something like this but it didn't change the order of the nodes in the tree:

def sort_tree(tree,node):
    if len(tree.children(node.identifier)) == 0:
        return None
    else:
        children = tree.children(node.identifier)
        for child in children:
            children.sort(key=lambda node: node.data["volume"], reverse=True)
            sort_tree(tree,child)

    return None
caesar0301 commented 7 years ago

You know the tree nodes are stored as Python dict, which only reserves hashed order. It cann't be ordered on the tree per se. I think there is a small mistake in ur code. children is a local variable as a list returned by tree.children. Its sorted and then dropped before entering next iteration.

caesar0301 commented 7 years ago

A potential solution. Node children are stored as a list in its _fpointer. You can implement a tree.sort_children() to get sorted _fpointer permanently.

daphnawegner commented 7 years ago

yes i see what you mean but doing this doesn't sort either

def sort_tree(tree,node):
    if len(tree.children(node.identifier)) == 0:
        return None
    else:
        children = tree.children(node.identifier)
        for child in children:
            tree.children(node.identifier).sort(key=lambda node: node.data["volume"], reverse=True)
            sort_tree(tree,child)

    return None

I am actually entering the nodes in the order I want them to be but for some reason when I print the tree I see its alphabetically ordered, is that intentional?

BebeSparkelSparkel commented 7 years ago

@daphnawegner could you please explain why you are sorting the tree. Do you just want the children sorted when you reference them? Or, is it more complicated than that?

daphnawegner commented 7 years ago

I need to pass a sorted json representation of the tree

BebeSparkelSparkel commented 7 years ago

Since you only need a json representation you probably don't need to sort the actual tree and only need to sort the json dictionary.

def sort_tree(tree_dict, key_value_function, reverse=None):

  def child_key_value(child_tag):
    if isinstance(child_tag, dict): child_tag = next(iter(child_tag.keys()))
    return key_value_function(next(tree.filter_nodes(lambda node: node.tag == child_tag)))

  if isinstance(tree_dict, dict):
    tree_dict[next(iter(tree_dict.keys()))]['children'].sort(key=child_key_value, reverse=reverse)
    for child in tree_dict[next(iter(tree_dict.keys()))]['children']:
      sort_tree(key_value_function, key_value_function)

  return tree_dict

You can then create your sort function lambda node: node.data['volume'].
Lastly, just take the sorted dictionary object and convert it to a json:

import json
json.dumps(sort_tree(
     tree.to_dict(),
    lambda node: node.data['volume']
))

Here is the test file that I created this in if you want it:

from treelib import Node, Tree
from pprint import pprint
import json

tree = Tree()
tree.create_node("Harry", "harry")  # root node
tree.create_node("Jane", "jane", parent="harry")
tree.create_node("Bill", "bill", parent="harry")
tree.create_node("Diane", "diane", parent="jane")
tree.create_node("Mary", "mary", parent="diane")
tree.create_node("Mark", "mark", parent="jane")

def sort_tree(tree_dict, key_value_function, reverse=None):

  def child_key_value(child_tag):
    if isinstance(child_tag, dict): child_tag = next(iter(child_tag.keys()))
    return key_value_function(next(tree.filter_nodes(lambda node: node.tag == child_tag)))

  if isinstance(tree_dict, dict):
    tree_dict[next(iter(tree_dict.keys()))]['children'].sort(key=child_key_value, reverse=reverse)
    for child in tree_dict[next(iter(tree_dict.keys()))]['children']:
      sort_tree(key_value_function, key_value_function)

  return tree_dict

pprint(tree.to_dict())
print()
print(json.dumps(sort_tree(tree.to_dict(), lambda node: node.tag, True)))
daphnawegner commented 7 years ago

Thanks! is filter_nodes on the latest version? or its not released yet?

daphnawegner commented 7 years ago

Running your example I get the following error:

return key_value_function(next(tree.filter_nodes(lambda node: node.tag == child_tag)))
TypeError: list object is not an iterator
BebeSparkelSparkel commented 7 years ago

You must be using python 2.
I don't think filter_nodes is released yet on pypy so I switched to use filter.

This example should work for python 2 now.

from treelib import Node, Tree
from pprint import pprint
import json

tree = Tree()
tree.create_node("Harry", "harry")  # root node
tree.create_node("Jane", "jane", parent="harry")
tree.create_node("Bill", "bill", parent="harry")
tree.create_node("Diane", "diane", parent="jane")
tree.create_node("Mary", "mary", parent="diane")
tree.create_node("Mark", "mark", parent="jane")

def sort_tree(tree_dict, key_value_function, reverse=None):

  def child_key_value(child_tag):
    if isinstance(child_tag, dict): child_tag = next(iter(child_tag.keys()))
    return key_value_function(tree.filter_nodes(lambda node: node.tag == child_tag)[0])
    return key_value_function(filter(lambda node: node.tag == child_tag, tree.nodes)[0])

  if isinstance(tree_dict, dict):
    tree_dict[next(iter(tree_dict.keys()))]['children'].sort(key=child_key_value, reverse=reverse)
    for child in tree_dict[tree_dict.keys()[0]]['children']:
      sort_tree(key_value_function, key_value_function)

  return tree_dict

pprint(tree.to_dict())
print()
print(json.dumps(sort_tree(tree.to_dict(), lambda node: node.tag, True)))
daphnawegner commented 7 years ago

yes i am on python 2, it resolved that problem however it seems like node.tag is breaking now

return key_value_function(filter(lambda node: node.tag == child_tag, tree.nodes)[0])
AttributeError: 'str' object has no attribute 'tag'
BebeSparkelSparkel commented 7 years ago

Unfortunately I not getting that error when I run the example in python 2.7 or 2.6, so I'm not sure what is going wrong on your system.

Did you tweak the example at all?

I've added my test file to https://github.com/BebeSparkelSparkel/treelib/blob/ordered_tree/order_tree.py so try running that file.

daphnawegner commented 7 years ago

The only thing I tweaked is removing the first return line:

def child_key_value(child_tag):
    if isinstance(child_tag, dict): child_tag = next(iter(child_tag.keys()))
    **return key_value_function(tree.filter_nodes(lambda node: node.tag == child_tag)[0])**
    return key_value_function(filter(lambda node: node.tag == child_tag, tree.nodes)[0])
BebeSparkelSparkel commented 7 years ago

Ah. That would be the problem. I've removed the second in the file and it works. So, put back the first and remove the second.

daphnawegner commented 7 years ago

thanks a lot it works now!

BebeSparkelSparkel commented 7 years ago

You're welcome. Good luck.

daphnawegner commented 7 years ago

It seems to work for the first level but not recursively on all children, running this:

from treelib import Node, Tree
from pprint import pprint
import json

tree = Tree()
tree.create_node("Harry", "harry",data={"volume":570}) # root node
tree.create_node("Jane", "jane", parent="harry",data={"volume":270})
tree.create_node("Bill", "bill", parent="harry" ,data={"volume":300})
tree.create_node("Diane", "diane", parent="jane",data={"volume":120})
tree.create_node("Mary", "mary", parent="diane",data={"volume":10})
tree.create_node("Mark", "mark", parent="jane",data={"volume":150})

def sort_tree(tree_dict, key_value_function, reverse=None):

  def child_key_value(child_tag):
    if isinstance(child_tag, dict): child_tag = next(iter(child_tag.keys()))
    return key_value_function(tree.filter_nodes(lambda node: node.tag == child_tag)[0])
    # return key_value_function(filter(lambda node: node.tag == child_tag, tree.nodes)[0])

  if isinstance(tree_dict, dict):
    tree_dict[next(iter(tree_dict.keys()))]['children'].sort(key=child_key_value, reverse=reverse)
    for child in tree_dict[tree_dict.keys()[0]]['children']:
      sort_tree(key_value_function, key_value_function)

  return tree_dict

pprint(tree.to_dict())
print()
print(json.dumps(sort_tree(tree.to_dict(with_data=True), lambda node: node.data['volume'], True)))

sorts the children of "Harry" but not "Jane" children

Ailothaen commented 4 years ago

Hello, is there any news on this? I am using treelib to order Reddit comments (who have a tree-like structure), and I would like to be able to order them by date, by number of votes... Being able to sort nodes on the same level by a key in the nodes would be an awesome feature.

Duoquote commented 3 years ago

For now, I guess the best chance is putting something like "{:0>5}_".format(i) in front of every id in your desired order. That way the default sorter takes 00001...00015 as order. Not quite nice but it works.

salman0149 commented 2 years ago

For me tree.show(key=False) , worked for tree view, it wont sort alphabetically.

tomol012 commented 2 years ago

@salman0149 thank you so much! How is this not in the documentation? :( I had to use zfill() to get it to sort properly which was turned out to be a nightmare when I was inserting new nodes at a later stage. Only after a long while I figured out that the tree actually has proper sorting, just the .show() methods returns alphabetical order.

@caesar0301 would you be able to upgrade the documentation? Treelib is a fantastic library but there's so much one has to figure out on their own to make good use of it.

liamlundy commented 2 years ago

@salman0149 thank you so much! How is this not in the documentation? :( I had to use zfill() to get it to sort properly which was turned out to be a nightmare when I was inserting new nodes at a later stage. Only after a long while I figured out that the tree actually has proper sorting, just the .show() methods returns alphabetical order.

@caesar0301 would you be able to upgrade the documentation? Treelib is a fantastic library but there's so much one has to figure out on their own to make good use of it.

Documentation PRs are always welcome! :)