tidwall / rtree

An R-tree implementation for Go
MIT License
315 stars 30 forks source link

Performing deep copy? #9

Closed jfberry closed 1 year ago

jfberry commented 1 year ago

I have a potentially expensive rtree query operation. Since the rtree is not thread safe I currently use a mutex to perform this. Since the rtree is being continually updated this has a measured performance impact. Is it possible to get a deep copy of the rtree, or some other 'thread safe' access to a fixed view of the tree for searching? Or is this what the shallow copy actually achieves?

tidwall commented 1 year ago

For quick operations a RWMutex is usually good enough.

Is it possible to get a deep copy of the rtree, or some other 'thread safe' access to a fixed view of the tree for searching?

There isn't a deep copy function per se, but the Scan() function is a pretty quick way to retrieve every item in the tree.


items := make([]Item, 0, tr.Len())

mu.RLock()
tr.Scan(function(min, max []float64, item Item) bool {
        items = append(items, item);
        return;
})
mu.RUnlock()

// now 'items' has every item in the tree.

Though that may be too slow if you have a ton of items.

Or is this what the shallow copy actually achieves?

Yes, the Copy() function performs an instant copy using a copy-on-write technique.


mu.Lock()         // write lock needed
tr2 := tr.Copy(). // instant copy
mu.Unlock()

// now 'tr2' is a copy of 'tr'
jfberry commented 1 year ago

I have > 1million items in my tree; and want to use the geo scanning for Scan. In your second example, can I take tr2 and iterate that (with Scan) without having a threading problem against tr which will be still deleting and inserting values?

jfberry commented 1 year ago

(My Scan operation with my processing to selectively look at the items takes a few thousands milliseconds which is why I wish to try and find the best solution for holding the mutex). It's not the rtree iteration which is slow but my decision making.

tidwall commented 1 year ago

In your second example, can I take tr2 and iterate that (with Scan) without having a threading problem against tr which will be still deleting and inserting values?

Correct. There will be no threading problems.

The Copy() function is more-or-less an instance snapshot of the rtree. So any changes to the original rtree will not be available to the copied rtree, and vice-versa.

The only side-effect is due the nature of copy-on-write, where any changes (writes) to the copy or original will trigger internal copying of the nodes that are changed. But this is usually pretty quick (though not free).

jfberry commented 1 year ago

In your second example, can I take tr2 and iterate that (with Scan) without having a threading problem against tr which will be still deleting and inserting values?

Correct. There will be no threading problems.

The Copy() function is more-or-less an instance snapshot of the rtree. So any changes to the original rtree will not be available to the copied rtree, and vice-versa.

The only side-effect is due the nature of copy-on-write, where any changes (writes) to the copy or original will trigger internal copying of the nodes that are changed. But this is usually pretty quick (though not free).

Ok, I will give that a go :-)

jfberry commented 1 year ago

This appears to be working correctly :-) Thank you for a great library

tidwall commented 1 year ago

You’re welcome and I’m happy to hear it working for you.