paleobiodb / navigator

Graphical user interface for exploring space, time, and taxa in the PBDB
https://paleobiodb.org/navigator
Creative Commons Zero v1.0 Universal

Navigator number of collections and occurrences appear to be wrong. #98

Closed markuhen closed 6 years ago

markuhen commented 6 years ago

When using the PBDB Explorer and filtering on the Cenomanian stage, the system displays an occurrence count of 144395. The diversity endpoint, by contrast, returns 20216 for the number of occurrences.

API Link: https://paleobiodb.org/data1.2/occs/diversity.json?interval=mesozoic&base_name=life

Screenshots: (three images attached to the original issue)

markuhen commented 6 years ago

Looking at a download of Cenomanian data, the number that Navigator reports as total collections is very close to the number of occurrences the downloader reports. The number of occurrences reported by Navigator seems WAY too high.

markuhen commented 6 years ago

This might be a recurrence of #57.

jpjenk commented 6 years ago

Investigating a potential error in how the data service computes the summary statistics. Navigator also appears to be making two identical calls to data1.2/colls.
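One way to confirm the duplicate calls is to copy the request URLs out of the browser's network panel and count them after normalizing the query-string order. The sketch below is only an illustration: `normalize` and `duplicate_requests` are hypothetical helper names, not part of Navigator or the data service.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def normalize(url):
    """Reduce a URL to (path, sorted query pairs) so parameter order is ignored."""
    parts = urlsplit(url)
    return (parts.path, tuple(sorted(parse_qsl(parts.query))))


def duplicate_requests(urls):
    """Return the normalized requests that occur more than once, with their counts."""
    counts = Counter(normalize(u) for u in urls)
    return {key: n for key, n in counts.items() if n > 1}
```

Passing in the list of URLs logged during a single filter action should return an empty dict if each request is issued only once.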

jczaplew commented 6 years ago

Using the API route that Navigator uses to filter by time, I was able to replicate this issue.

If you sum the occurrences returned by this API route, you get 147,355: https://paleobiodb.org/data1.2/colls/summary.json?lngmin=-180&lngmax=180&latmin=-90&latmax=90&show=time&level=3&interval_id=117
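For anyone who wants to repeat the check, here is a minimal sketch of the summation. It assumes the response is a JSON object with a `records` list and that each summary cluster carries its occurrence count in a field named `noc`; both are assumptions about the response shape, and the function names are made up for this example. The total will also differ from the figure above as the database changes.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://paleobiodb.org/data1.2/colls/summary.json"


def build_summary_url(interval_id, level=3):
    """Build the same global-bounding-box summary query quoted above."""
    params = {
        "lngmin": -180,
        "lngmax": 180,
        "latmin": -90,
        "latmax": 90,
        "show": "time",
        "level": level,
        "interval_id": interval_id,
    }
    return BASE_URL + "?" + urlencode(params)


def sum_occurrences(payload, field="noc"):
    """Sum the per-cluster occurrence counts (field name is an assumption)."""
    return sum(int(rec.get(field, 0)) for rec in payload.get("records", []))


def fetch_occurrence_total(interval_id):
    """Fetch the summary clusters for one interval and total their occurrences."""
    with urlopen(build_summary_url(interval_id)) as resp:
        return sum_occurrences(json.load(resp))
```

Calling `fetch_occurrence_total(117)` issues the request shown above; comparing its result with the occurrence count from a download of the same interval is the comparison that exposed the mismatch.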

vjpsyverson commented 6 years ago

So is this a Navigator issue, or is the API returning occurrence counts incorrectly?

mmcclenn commented 6 years ago

Fixed. Rewrote interval filter logic in generateMainFilters().