biodiv / anycluster

Server-side clustering of map markers for (Geo)Django
MIT License

ST_WITHIN #21

Closed lmorroni closed 10 years ago

lmorroni commented 10 years ago

Hi, I am trying to get getViewportContent working properly for some reporting I am doing. I can get back markers and I can customize the template. That all works fine. The problem for me appears to be with the database call. It doesn't return an accurate set of data based on the polygon I am passing. Have you ever had any experiences like this? It seems as though the SELECT statement inside views.loadAreaContent returns unreliable data. I think this may be a database issue on my end but before I dig too deep, I wanted to see if you have tested this function and if you had any suggestions. Thanks, Larry

biodiv commented 10 years ago

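Hi Larry, thanks for reporting. To narrow down the possible causes:

- Does your viewport contain clusters or markers or both?
- Did you count the markers on the viewport (clusters + single markers) and compare this with the count from ST_WITHIN (or how did you notice that there is an error)?
- How far off is the ST_WITHIN result?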

In general, there is one phenomenon: The area used for clustering is always a bit larger than the viewport as it uses a fixed grid, otherwise the clusters would move a lot if you pan the map. Imagine the following scenario:

You have one grid cell that has a lot of markers inside your viewport and some outside, so the geometric center of the cell's markers is inside the viewport (and the cluster is visible). This cluster can contain markers that are outside your viewport and thus won't be included in ST_WITHIN. The number of those markers shouldn't be high: if there were a lot of markers outside the viewport but within the cluster cell, the geometric center would be outside the viewport as well.
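To make the scenario concrete, here is a minimal sketch in plain Python. It is not anycluster's actual code: the grid size, the point coordinates, and the helper names `grid_clusters` and `within` are made up for illustration, and `within` only mimics ST_WITHIN for an axis-aligned rectangle.

```python
from collections import defaultdict


def grid_clusters(points, cell_size):
    """Group points into fixed grid cells; return (centroid, count) per cell."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    clusters = []
    for members in cells.values():
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        clusters.append(((cx, cy), len(members)))
    return clusters


def within(point, viewport):
    """Rough stand-in for ST_WITHIN against a rectangular viewport."""
    x, y = point
    min_x, min_y, max_x, max_y = viewport
    return min_x < x < max_x and min_y < y < max_y


# Hypothetical data: the viewport ends at x = 10, the grid cell spans x in [8, 12).
viewport = (0.0, 0.0, 10.0, 10.0)
points = [(1.0, 1.0), (5.0, 5.0), (9.0, 5.0), (9.5, 6.0), (10.5, 5.5)]

clusters = grid_clusters(points, cell_size=4.0)

# What the user counts on the map: every cluster whose centroid is visible.
shown_on_map = sum(count for centroid, count in clusters if within(centroid, viewport))

# What a viewport query returns: only points strictly inside the rectangle.
inside_viewport = sum(1 for p in points if within(p, viewport))

print(shown_on_map, inside_viewport)
```

The cluster at the right edge has its centroid inside the viewport, so all three of its markers are counted on the map, while the viewport query only sees two of them.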

lmorroni commented 10 years ago

On Jun 11, 2014, at 7:02 AM, biodiv notifications@github.com wrote:

> Does your viewport contain clusters or markers or both?

Both.

> Did you count the markers on the viewport (clusters + single markers) and compare this with the count from ST_WITHIN (or how did you notice that there is an error)?

Yes, I counted and compared. In some cases the correct number was down to the right of the viewport. It was hard to tell exactly which off-screen markers were in the results.

> How far off is the ST_WITHIN result?

It varies, but in general it was off for every request I tested. I also tried playing with the map size on the page, but that had no effect. As I said, it looked like the bounds being passed to ST_WITHIN were accurate, but the results were skewed.

biodiv commented 10 years ago

I just checked the demo application. To test the validity of ST_WITHIN, I would suggest zooming in until only a few markers and a few clusters are visible on the map, so one can count quickly. On my test system with the demo app it counted correctly. However, for some markers the following phenomenon can be observed:

I zoomed in and placed one cluster (containing 2) at the right bound so one half of it was inside the viewport and one half outside. I could count 14 markers (12 single, one 2-cluster) on the map. getViewPortContent returned 12. The 2 "missing" markers:

If we want 100% congruence between the viewport and the numbers on the clusters, we would have to bring the viewport into the calculation of the clusters. Whenever the map is panned, the numbers on the clusters would then have to change (or the clusters would have to be placed again at different coordinates).
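A rough sketch of that trade-off, again in plain Python and not taken from anycluster: if points are clipped to the viewport before they are counted per grid cell, the count shown for the same cell changes as soon as the map is panned. The function name `clipped_cell_counts` and all coordinates are hypothetical.

```python
from collections import Counter


def clipped_cell_counts(points, cell_size, viewport):
    """Count points per grid cell, considering only points inside the viewport."""
    min_x, min_y, max_x, max_y = viewport
    visible = [
        (x, y) for x, y in points
        if min_x < x < max_x and min_y < y < max_y
    ]
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in visible)


# Three hypothetical markers that all fall into grid cell (2, 1).
points = [(9.0, 5.0), (9.5, 6.0), (10.5, 5.5)]

before_pan = clipped_cell_counts(points, 4.0, (0.0, 0.0, 10.0, 10.0))
after_pan = clipped_cell_counts(points, 4.0, (1.0, 0.0, 11.0, 10.0))

print(before_pan[(2, 1)], after_pan[(2, 1)])
```

Panning the viewport one unit to the right brings the third marker into view, so the number on that cluster would jump from 2 to 3. This is the instability that the fixed grid avoids.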

By the way: my quick research turned up a bug in ST_WITHIN: http://trac.osgeo.org/postgis/ticket/884, but it is old and already fixed.

lmorroni commented 10 years ago

I have not had any luck here yet. I decided to put this on the back burner while I focus on the reporting and polygon bounding boxes. I'll let you know once I get a better handle on the issue. I did verify that my PostGIS and kmeans are all the latest versions. Larry

lmorroni commented 10 years ago

I think I figured this out. My geom column was generated with a different version of PostGIS. I have been using the same database for testing for a while. I just cleared out my database and reimported all my points, which in turn generated new ST_GEOMETRY values for my geom field. I am now getting more accurate clustering results as well as better reporting. Nice!