mapbox / supercluster

A very fast geospatial point clustering library for browsers and Node.
ISC License

Possible aggregating issue with map/reduce. #150

Closed by fabiovalse 4 years ago

fabiovalse commented 4 years ago

Hi, I'm trying to use the map and reduce options to group my points by category. There are three categories: A, B and C. This is a sample point:

  {
    "type": "Feature",
    "properties": {
      "category": "A"
    },
    "geometry": {
      "type": "Point",
      "coordinates": ["28.28917", "-26.14584"]
    }
  }

My code is this:

const start = async () => {
  // Load points
  const points = await fetch('./points.json')
    .then(response => response.json());

  // Create super cluster index
  const clusterIndex = new Supercluster({
    radius: 140,
    maxZoom: 17,
    map: (item) => ({
      categories: {
        [item.category]: 1,
      },
    }),
    reduce: (acc, cur) => {
      for (const category in cur.categories) {
        acc.categories[category] = (acc.categories[category] || 0) + cur.categories[category];
      }
    },
  });

  clusterIndex.load(points);

  // Get clusters given a bounding box and a zoom level
  const clusters = clusterIndex.getClusters([-7.416159899999997, -49.88668752867185, 51.910011975, -4.991106132448181], 4);
  console.log(clusters);
}

start();

The output of getClusters is:

[{
  "type": "Feature",
  "id": 2004,
  "properties": {
    "categories": {
      "A": 1090,
      "B": 248,
      "C": 660
    },
    "cluster": true,
    "cluster_id": 2004,
    "point_count": 1410,
    "point_count_abbreviated": "1.4k"
  },
  "geometry": {
    "type": "Point",
    "coordinates": [28.202503718769, -26.472790528244175]
  }
}, {
  "type": "Feature",
  "id": 2068,
  "properties": {
    "categories": {
      "B": 128,
      "A": 136,
      "C": 324
    },
    "cluster": true,
    "cluster_id": 2068,
    "point_count": 588,
    "point_count_abbreviated": 588
  },
  "geometry": {
    "type": "Point",
    "coordinates": [18.482380755300586, -33.92539672163689]
  }
}]

The sum of the category counts in the second cluster (id = 2068) matches its point_count (128 + 136 + 324 = 588), but that is not true for the first one (id = 2004), where the counts add up to 1998 while point_count is 1410. Is there something wrong in my map and reduce functions?
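For anyone who wants to run the same consistency check on their own output, here is a minimal sketch. The helper name `categoryTotal` is made up for illustration, and the two objects are trimmed copies of the clusters above that keep only the fields the check needs.

```javascript
// Sum the per-category counts stored under properties.categories.
// `categoryTotal` is a hypothetical helper, not part of Supercluster.
function categoryTotal(cluster) {
  const categories = cluster.properties.categories || {};
  return Object.values(categories).reduce((sum, n) => sum + n, 0);
}

// Trimmed copies of the two clusters from the output above.
const clusters = [
  {id: 2004, properties: {categories: {A: 1090, B: 248, C: 660}, point_count: 1410}},
  {id: 2068, properties: {categories: {B: 128, A: 136, C: 324}, point_count: 588}},
];

// Compare each category total with the cluster's point_count.
const report = clusters.map((c) => ({
  id: c.id,
  total: categoryTotal(c),
  pointCount: c.properties.point_count,
  consistent: categoryTotal(c) === c.properties.point_count,
}));

console.log(report);
```

A cluster whose `consistent` flag is false has counted some points more than once (or missed some), which is the symptom reported here.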

mourner commented 4 years ago

I haven't looked in detail (since there's no reproducible test case), but I'm guessing this has to do with the categories object being passed by reference somewhere and modified in multiple clusters at once. To check this, can you try removing the nesting and see if it works with A/B/C as top-level properties?
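To illustrate the kind of aliasing this guess describes, here is a minimal sketch that does not use Supercluster at all. It assumes that accumulated properties get shallow-copied before being reduced further; `shallowCopy` is a hypothetical stand-in for that step, not code taken from the library.

```javascript
// Hypothetical stand-in for a shallow copy of accumulated properties
// (an assumption about the cause, not Supercluster's actual code).
const shallowCopy = (props) => Object.assign({}, props);

// The reduce function from the original post: it mutates acc.categories in place.
const reduce = (acc, cur) => {
  for (const category in cur.categories) {
    acc.categories[category] = (acc.categories[category] || 0) + cur.categories[category];
  }
};

// Properties of a cluster that already aggregates two points of category A.
const lowerLevel = {categories: {A: 2}};

// Build a parent cluster: start from a shallow copy, then fold in one B point.
const parent = shallowCopy(lowerLevel);
reduce(parent, {categories: {B: 1}});

// The shallow copy duplicated only the outer object, so both share one
// nested `categories` object and the B count leaks into lowerLevel.
console.log(parent.categories === lowerLevel.categories);
console.log(lowerLevel.categories);
```

If something like this happens across several zoom levels, the same nested object can be incremented repeatedly, which would explain category totals larger than point_count.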

fabiovalse commented 4 years ago

You were right. I updated my code to:

const start = async () => {
  // Load points
  const points = await fetch('./points.json')
    .then(response => response.json());

  // Create super cluster index
  const clusterIndex = new Supercluster({
    radius: 140,
    maxZoom: 17,
    map: (item) => ({
      [item.category]: 1,
    }),
    reduce: (acc, cur) => {
      for (const category in cur) {
        acc[category] = (acc[category] || 0) + cur[category];
      }
    },
  });

  clusterIndex.load(points);

  // Get clusters given a bounding box and a zoom level
  const clusters = clusterIndex.getClusters([-7.416159899999997, -49.88668752867185, 51.910011975, -4.991106132448181], 4);
  console.log(clusters);
}

start();

and the sum of the categories is now correct. Thank you!
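A plausible reason the flattened version behaves, sketched under the assumption that the accumulated properties are shallow-copied before being reduced further: with top-level numeric properties, a shallow copy duplicates the counts themselves rather than a reference to a nested object. `shallowCopy` below is a hypothetical stand-in, not Supercluster's code.

```javascript
// Hypothetical stand-in for a shallow copy of accumulated properties
// (an assumption, not Supercluster's actual code).
const shallowCopy = (props) => Object.assign({}, props);

// The flattened reduce from the updated code above.
const reduce = (acc, cur) => {
  for (const category in cur) {
    acc[category] = (acc[category] || 0) + cur[category];
  }
};

// Properties of a cluster that already aggregates two points of category A.
const lowerLevel = {A: 2};

// Build a parent cluster from a shallow copy, then fold in one B point.
const parent = shallowCopy(lowerLevel);
reduce(parent, {B: 1});

console.log(parent);     // holds both the A and the B count
console.log(lowerLevel); // still holds only the A count
```

Here the counts are primitives, so changing `parent` cannot affect `lowerLevel`.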

gandhiamarnadh commented 4 years ago

@mourner sorry to revive this old issue.

To use this with clusterProperties in GL JS, is it possible to write this map/reduce with expressions?
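One possible shape for this, offered as a sketch rather than a confirmed answer from the maintainers: the GL JS style specification describes each `clusterProperties` entry as `[operator, mapExpression]`, so a per-category counter can be a `case` expression that yields 1 or 0, summed with `"+"`. The helper name `categoryClusterProperties`, the source id `points`, and the data URL are assumptions for illustration. The result is flat A/B/C properties on each cluster, similar to the flattened Supercluster version above, not a nested object.

```javascript
// Build a clusterProperties object with one counter per category.
// Each entry has the form [operator, mapExpression]: the map expression
// yields 1 for a point in that category and 0 otherwise, and "+" sums them.
// `categoryClusterProperties` is a hypothetical helper name.
function categoryClusterProperties(categories) {
  const props = {};
  for (const category of categories) {
    props[category] = ['+', ['case', ['==', ['get', 'category'], category], 1, 0]];
  }
  return props;
}

// Hypothetical GeoJSON source definition; the data URL is illustrative only.
const source = {
  type: 'geojson',
  data: './points.json',
  cluster: true,
  clusterProperties: categoryClusterProperties(['A', 'B', 'C']),
};

console.log(JSON.stringify(source.clusterProperties, null, 2));

// With an existing GL JS map instance (not created in this sketch):
// map.addSource('points', source);
```

Cluster features from such a source should then carry `A`, `B` and `C` alongside `point_count`, and can be read in layer expressions with `['get', 'A']`.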