Closed jmarca closed 13 years ago
I think you can implement that straightforwardly on your own just by defining a filter function. You can even use parameters in your filter that you set when requesting the continuous feed, which lets you pass in the bbox.
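As a rough illustration of that suggestion, a filter along the following lines would do the per-document check. This is only a sketch: the `doc.geometry` field holding a GeoJSON point, the `bbox=west,south,east,north` parameter format, and the name `bboxFilter` are all assumptions for the example, not something GeoCouch defines for filters.

```javascript
// Hypothetical filter for the "filters" field of a design document.
// Assumes (not stated in this thread) that each document stores a GeoJSON
// Point in doc.geometry and that the client passes
// bbox=west,south,east,north as a query parameter on the _changes request.
function bboxFilter(doc, req) {
  // Skip documents without a point geometry (including deletion stubs).
  if (!doc.geometry || doc.geometry.type !== 'Point') {
    return false;
  }
  // Query parameters arrive as strings in req.query.
  var parts = String(req.query.bbox || '').split(',').map(Number);
  if (parts.length !== 4 || parts.some(isNaN)) {
    return false;
  }
  var x = doc.geometry.coordinates[0];
  var y = doc.geometry.coordinates[1];
  // Four number comparisons per changed document.
  return x >= parts[0] && x <= parts[2] && y >= parts[1] && y <= parts[3];
}
```

With the function stored under a hypothetical design document `geo` as the filter `bbox`, the feed would be requested with something like `GET /db/_changes?feed=continuous&filter=geo/bbox&bbox=-118.5,33.7,-117.5,34.3`.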
I don't think it is straightforward to access the geospatial index from within the filter function. Of course I can hand-check whether each element in the changes feed is in or out of the bounding box, but this is horrendously inefficient, and avoiding that is the whole reason for this awesome GeoCouch work. I might be missing something though, so I will look at the code and see if there is a way to hook into the spatial index.
A filter function is applied to each new or changed item, since each one is a candidate for the feed. So the filter function only has to run a few number comparisons on a per-item basis. This is different from querying a large number of items using an index, where an R-tree-backed index improves efficiency substantially.
Suppose I have 100 users all asking for the changes since yesterday, each with a different bounding box. Suppose I have 2 million points scattered over a huge geographic region, and they all change every 30 seconds.
If the filter function runs once per change, there is only a small benefit to using spatial indexing. If it runs once per request for "changes" (which is my understanding of how it works, or else how would it be able to read the bbox attribute of the request), then there is a huge efficiency gain to be had. Searching for a bounding box overlap with a spatial index is fast; comparing x and then y to a bounding box for each record is not fast.
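To put rough numbers on the scenario above, here is a back-of-the-envelope count. It assumes, as described earlier in the thread, that every feed evaluates its filter once per changed document; the variable names are only illustrative.

```javascript
// Scenario from the comment above: 100 feeds, 2 million points,
// every point changing once per 30-second cycle.
var users = 100;
var points = 2000000;
var cycleSeconds = 30;

// Assumption: each feed runs its filter once for every changed document.
var filterCallsPerCycle = users * points; // 200,000,000

// Averaged over the cycle, rounded to the nearest whole call.
var filterCallsPerSecond = Math.round(filterCallsPerCycle / cycleSeconds); // 6,666,667

console.log(filterCallsPerCycle, filterCallsPerSecond);
```

Each call is cheap, but the total grows with the product of feeds and changes, which is the cost a spatial lookup per request would avoid.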
jmarca: the use case you explained last makes sense. There are several problems, though. One is that indexes are normally updated on request, so for a useful _changes feed they would need to be updated immediately after an insert. To cut the answer short: it's a lot of work, don't expect it to be done :)
vmx: yes, it is a long way off, as I said in the beginning. I think the first step is to integrate regular CouchDB indices with _changes, then this becomes a special case of that.
Maybe I'll use a spatial index to store references to CouchDBs, one per detector, so the work done by each _changes feed stays manageable and I only ask for what I need.
As to my original request, unfortunately, the most I can contribute is to recognize the problem. Perhaps this issue should be closed for now?
I'm closing this one as "won't fix".
I'm sure this is a long way off, but it would be nice to have access to bounding box queries in _changes filters.
For example, if I have a database of traffic conditions, a continuous changes feed with a bbox parameter would return only the conditions that are changing within the bounding box.