stefankoegl / kdtree

A Python implementation of a kd-tree
ISC License

How do I keep track of data alongside my points? #34

Open matburnham opened 7 years ago

matburnham commented 7 years ago

Sorry, this is a bit of a question rather than an issue. However....

I'd like to use the namedtuple example to store my points in the tree, but would also like to store some data alongside the points. Is there an easy way to do this? Can it be added to the examples?

I've tried the following:

Point = collections.namedtuple('Point', 'x y name')
point2 = Point(lat, lng, name)
tree.add(point2)

But running this code fails:

ValueError: All Points in the point_list must have the same dimensionality

I guess I could make my own object which 'supports indexing' but as I don't yet know how to do that it seems like a bit of a faff.
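For reference, an object that "supports indexing" can be quite small. The sketch below assumes the tree only needs len() and integer indexing from a point, which this thread suggests but which has not been checked against the library source. The LabeledPoint class and its attribute names are hypothetical.

```python
class LabeledPoint(object):
    """Hypothetical 2-D point that carries a name alongside its coordinates."""

    def __init__(self, x, y, name):
        self.coords = (x, y)
        self.name = name

    def __len__(self):
        # Report only the coordinates, so a dimensionality check sees 2.
        return len(self.coords)

    def __getitem__(self, i):
        # Indexing reaches the coordinates only; the name stays out of it.
        return self.coords[i]

    def __repr__(self):
        return 'LabeledPoint({}, {}, {!r})'.format(
            self.coords[0], self.coords[1], self.name)


p = LabeledPoint(51.5, -0.12, 'home')
x, y = p[0], p[1]      # coordinates by index
dims = len(p)          # 2, regardless of the extra name
```

The name lives in an ordinary attribute, so it never counts towards the point's dimensionality.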

stefankoegl commented 7 years ago

There is currently an open pull request, #32, that contains an example. I agree that this is a bit unwieldy, so I am planning to modify the check that is failing for you. It should probably require at least the specified dimensionality, instead of requiring it exactly.

I did not have much time to work on kdtree recently, so I don't have a timeline for that yet. If you'd consider writing a pull request, I'd make sure to review and integrate it promptly.
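To illustrate the difference between the two kinds of check, here is a rough sketch. It is not the library's actual code: check_exact and check_at_least are hypothetical names, and only the first error message is taken from this thread.

```python
def check_exact(points, dimensions):
    """Strict check: every point must have exactly `dimensions` entries."""
    for p in points:
        if len(p) != dimensions:
            raise ValueError(
                'All Points in the point_list must have the same dimensionality')


def check_at_least(points, dimensions):
    """Relaxed check: extra trailing fields (such as a name) are tolerated."""
    for p in points:
        if len(p) < dimensions:
            raise ValueError(
                'All points must have at least {} dimensions'.format(dimensions))


points = [(1.0, 2.0, 'home'), (3.0, 4.0, 'work')]

try:
    check_exact(points, 2)
    strict_ok = True
except ValueError:
    strict_ok = False       # the 3-field points are rejected

check_at_least(points, 2)   # passes: each point has at least 2 entries
```

A relaxed check on its own would presumably not be enough: any code that compares or measures points would also have to read only the first `dimensions` entries.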

matburnham commented 7 years ago

Ah cool, thanks for the pointer to the example. In the meantime, I've bodged this workaround together:

import collections

class Point(collections.namedtuple('Point', 'x y')):
    # The name is kept outside the tuple, so len(point) stays 2.
    name = ''

    def __str__(self):
        return '{} ({}, {})'.format(self.name, self.x, self.y)

It's good enough for what I wanted to do.
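A short sketch of why this workaround behaves as intended: the subclass does not define __slots__, so each instance can hold extra attributes, while the tuple itself still has only two fields. The coordinates and names below are made up for illustration.

```python
import collections


class Point(collections.namedtuple('Point', 'x y')):
    # No __slots__ here, so each instance gets a __dict__ and can carry
    # extra attributes; the tuple itself still has exactly two fields.
    name = ''

    def __str__(self):
        return '{} ({}, {})'.format(self.name, self.x, self.y)


home = Point(51.5, -0.12)
home.name = 'home'            # stored on the instance, not in the tuple

work = Point(51.5, -0.12)
work.name = 'work'

dimensions = len(home)        # the name does not count as a coordinate
same_coords = (home == work)  # tuple equality ignores the name
label = str(home)
```

One thing to watch: because equality and hashing come from the tuple, two points with the same coordinates but different names compare equal.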

matburnham commented 7 years ago

Just started looking at putting together a pull request for the fix as suggested. But in starting with a test to reproduce my issue, I found this existing code in test.py which seems to do the job:

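# Excerpt from test.py; random_points is a helper the test relies on
# and is not shown here.
import unittest
from itertools import islice

import kdtree


class PayloadTests(unittest.TestCase):
    """ test tree.add() with payload """

    def test_payload(self, nodes=100, dimensions=3):
        points = list(islice(random_points(dimensions=dimensions), 0, nodes))
        tree = kdtree.create(dimensions=dimensions)

        # tree.add() returns the new node, so a payload can be attached to it
        for i, p in enumerate(points):
            tree.add(p).payload = i

        # the nearest neighbour of each point should carry that point's payload
        for i, p in enumerate(points):
            self.assertEqual(i, tree.search_nn(p)[0].payload)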

It looks like tree.add(p).payload = i is doing what I intended - adding a payload to a node - without needing the extra class defined. It seems far clearer to do it that way than how I proposed; in my case tree.add((x, y)).data = foobar. The only downside is that the visualisation doesn't display the payload data, but that's part of the tradeoff of defining your own class.

It looks like this just needs documenting so that newbies like me can quickly identify the easy way to do it.