gbif / maps

GBIF mapping service built on HBase and ElasticSearch, supporting Mapbox Vector Tiles and PNGs
Apache License 2.0

capabilities response looks like it has wrong counts #84

Open MortenHofft opened 9 months ago

MortenHofft commented 9 months ago

This dataset https://www.gbif.org/dataset/3ad913f9-2f86-48c7-b570-3ec9f71acf88 has 1786 records, all of which have coordinates without geospatial issues: https://api.gbif.org/v1/occurrence/search?dataset_key=3ad913f9-2f86-48c7-b570-3ec9f71acf88&has_coordinate=true&has_geospatial_issue=false

But the capabilities endpoint tells me there is 1 record in total: https://api.gbif.org/v2/map/occurrence/density/capabilities.json?datasetKey=3ad913f9-2f86-48c7-b570-3ec9f71acf88

Capabilities was last updated an hour ago. The dataset was last crawled 2 days ago with state: NOT_MODIFIED
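The comparison described above can be sketched as a small check. This is only an illustration: the helper names (`search_url`, `capabilities_url`, `totals_differ`) are made up for this example, and it assumes the occurrence search response exposes its record count in a `count` field and the capabilities response exposes `total`. Fetching the two URLs is left to the caller.

```python
from urllib.parse import urlencode

API = "https://api.gbif.org"


def search_url(dataset_key: str) -> str:
    """Occurrence search for records with coordinates and no geospatial issue."""
    query = urlencode({
        "dataset_key": dataset_key,
        "has_coordinate": "true",
        "has_geospatial_issue": "false",
    })
    return f"{API}/v1/occurrence/search?{query}"


def capabilities_url(dataset_key: str) -> str:
    """Map capabilities for the same dataset."""
    query = urlencode({"datasetKey": dataset_key})
    return f"{API}/v2/map/occurrence/density/capabilities.json?{query}"


def totals_differ(search_response: dict, capabilities: dict) -> bool:
    """True when the search count and the map total disagree.

    Assumes `count` (search) and `total` (capabilities) are the relevant fields.
    """
    return search_response["count"] != capabilities["total"]
```

With the numbers reported in this issue (1786 from search, 1 from capabilities), `totals_differ` would return True.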

ahakanzn commented 2 months ago

It seems like there is something wrong with that dataset in particular. We should take a look at it. https://www.gbif.org/dataset/3ad913f9-2f86-48c7-b570-3ec9f71acf88

muttcg commented 2 months ago

It happens because all records have the same lat, lon and date, and the records belong to 3 sequence reads:

SELECT decimallatitude, decimallongitude, eventdate, materialsampleid
FROM occurrence
WHERE datasetkey = '3ad913f9-2f86-48c7-b570-3ec9f71acf88'
GROUP BY decimallatitude, decimallongitude, eventdate, materialsampleid;
decimallatitude  decimallongitude  eventdate   materialsampleid
37.566           126.9784          2018-05-31  https://www.ebi.ac.uk/metagenomics/samples/ERS2540919
37.566           126.9784          2018-05-31  https://www.ebi.ac.uk/metagenomics/samples/ERS2540920
37.566           126.9784          2018-05-31  https://www.ebi.ac.uk/metagenomics/samples/ERS2540921

@MortenHofft Do you expect to have capabilities.total == dataset.size?
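To make the two candidate numbers concrete, here is a toy aggregation. It is not the maps code, just a sketch of how 1786 records sharing one coordinate and year collapse into a single map cell. Counting cells gives 1, while summing the records per cell gives 1786. The record layout (`lat`, `lon`, `year`) is an assumption for illustration.

```python
from collections import Counter


def aggregate(records):
    """Count records per (lat, lon, year) cell."""
    return Counter((r["lat"], r["lon"], r["year"]) for r in records)


# 1786 records at the same location and year, as in the dataset above.
records = [{"lat": 37.566, "lon": 126.9784, "year": 2018}] * 1786

cells = aggregate(records)
distinct_points = len(cells)          # number of cells on the map
total_records = sum(cells.values())   # number of records behind those cells
```

Here `distinct_points` is 1 and `total_records` is 1786, which are exactly the two values being compared in this thread.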

MattBlissett commented 2 months ago

The total in the tile is incorrect, I think it should be 1786:

curl -Ss 'https://api.gbif.org/v2/map/occurrence/density/0/0/0.mvt?datasetKey=3ad913f9-2f86-48c7-b570-3ec9f71acf88' | vtd
layers {
  name: "occurrence"
  features {
    tags: 0
    tags: 0
    tags: 1
    tags: 1
    type: POINT
    geometry: 9
    geometry: 872
    geometry: 396
  }
  keys: "2018"
  keys: "total"
  values {
    sint_value: 1
  }
  values {
    sint_value: 1
  }
  extent: 512
  version: 1
}

So probably something is wrong with the map build.
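For anyone reading the dump above: in a Mapbox Vector Tile, a feature's `tags` are alternating indexes into the layer's `keys` and `values`, and `geometry` is a command integer followed by zigzag-encoded parameters. A minimal decoder for a single-point feature, assuming that encoding (function names are made up for this sketch):

```python
def zigzag_decode(n: int) -> int:
    """Decode a zigzag-encoded geometry parameter."""
    return (n >> 1) ^ -(n & 1)


def decode_point(geometry):
    """Decode a single MoveTo command into tile coordinates (x, y)."""
    command = geometry[0]
    command_id, count = command & 0x7, command >> 3
    if command_id != 1 or count != 1:
        raise ValueError("expected a single MoveTo command")
    return zigzag_decode(geometry[1]), zigzag_decode(geometry[2])


def decode_tags(tags, keys, values):
    """Pair up (key index, value index) entries into a dict."""
    return {keys[tags[i]]: values[tags[i + 1]] for i in range(0, len(tags), 2)}


point = decode_point([9, 872, 396])
properties = decode_tags([0, 0, 1, 1], ["2018", "total"], [1, 1])
```

For the tile above this gives the point (436, 198) within the 512 extent and the properties {"2018": 1, "total": 1}, so the feature carries a count of 1 for the year 2018 and a total of 1.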

MortenHofft commented 2 months ago

capabilities.total == dataset.size

No, the total is the number of points on the map, i.e. the records with coordinates that will show, whatever that filter is. I believe it is records with has_geospatial_issue=false. Essentially it tells me whether there is any point in showing the map.

muttcg commented 2 months ago

The new UAT seems correct, but I haven’t found any significant differences in the code yet: https://api.gbif-uat.org/v2/map/occurrence/density/capabilities.json?datasetKey=3ad913f9-2f86-48c7-b570-3ec9f71acf88

{
  "minLat": 37,
  "maxLat": 38,
  "minLng": 126,
  "maxLng": 127,
  "minYear": 2018,
  "maxYear": 2018,
  "total": 1786,
  "generated": "2024-08-28T03:00Z"
}
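A client could use a response like this one to decide whether a map is worth showing and where to centre it. A rough sketch follows; the helper names are hypothetical, and treating a missing `total` as 0 is an assumption, not documented behaviour.

```python
import json


def has_points(capabilities: dict) -> bool:
    """True when the map would show at least one point."""
    return capabilities.get("total", 0) > 0


def bounds(capabilities: dict):
    """Bounding box as (minLng, minLat, maxLng, maxLat)."""
    return (
        capabilities["minLng"],
        capabilities["minLat"],
        capabilities["maxLng"],
        capabilities["maxLat"],
    )


# The UAT response shown above.
uat_response = json.loads("""
{
  "minLat": 37, "maxLat": 38,
  "minLng": 126, "maxLng": 127,
  "minYear": 2018, "maxYear": 2018,
  "total": 1786,
  "generated": "2024-08-28T03:00Z"
}
""")

show_map = has_points(uat_response)
box = bounds(uat_response)
```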