Open jotegui opened 8 years ago
I do not know of a way to get counts efficiently AND accurately with GAE. However, for the case in question of small record sets, I believe the estimated count is a good enough estimate and could be used to make a determination.
You are right, @tucotuco, I was not familiar with Google's `search` api and I guess I was expecting a bit too much, like a `count` method or so... So, it seems the only way of counting records is to actually retrieve them and return the length of the array. *sigh*
Actually, given this difficulty and the current structure, I have been thinking on omitting this whole issue, and here is why:
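- There is little (if any) potential use for a method such as `count` from the users' perspective.
- Record counts are actually only useful for direct calls to the download api, since portal downloads come after a search event, where record count is already calculated. And direct downloads via the `portal-web` have already been implemented.
- If we enable a new parameter in the `search` API (like `format`), where users can decide whether to get records in `JSON` or `TXT` format, they will be able to download via that method. But that makes the distinction between both methods a bit blurry...
- We can use an approach such as GBIF's: put a hard limit on the number of records retrievable via direct call to the search API, and suggest to use the download API for larger searches...

Again, just thinking out loud here...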
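
For illustration, the hard-limit idea could look something like the minimal sketch below. The names `MAX_SEARCH_RECORDS` and `effective_limit`, and the value 1000, are hypothetical and chosen only for this example; they are not taken from the VertNet or GBIF code.

```python
# Hypothetical cap on records returned by a direct search call (illustrative value).
MAX_SEARCH_RECORDS = 1000


def effective_limit(requested, hard_limit=MAX_SEARCH_RECORDS):
    """Return (limit, message): the limit to apply and an optional hint for the caller."""
    if requested is None:
        # No explicit limit requested: fall back to the cap.
        return hard_limit, None
    if requested < 0:
        raise ValueError("limit must be non-negative")
    if requested > hard_limit:
        # Clamp the request and point the caller at the download API.
        message = (
            "Requested %d records; search returns at most %d. "
            "Use the download API for larger result sets." % (requested, hard_limit)
        )
        return hard_limit, message
    return requested, None
```

A search handler could call `effective_limit` on the incoming limit parameter and include the returned message in its response whenever the request was clamped.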
I agree with all of these observations.
On Tue, May 24, 2016 at 8:01 AM, Javier Otegui notifications@github.com wrote:
You are right, @tucotuco https://github.com/tucotuco , I was not familiar with Google's search api and I guess I was expecting a bit too much, like a count method or so... So, it seems the only way of counting records is to actually retrieve them and return the length of the array. sigh
Actually, given this difficulty and the current structure, I have been thinking on omitting this whole issue, and here is why:
- There is little (if any) potential use for a method such as count from the users' perspective.
- Record counts are actually only useful for direct calls to the download api, since portal downloads come after a search event, where record count is already calculated. And direct downloads via the portal-web have already been implemented.
- If we enable a new parameter in the search API (like format), where users can decide whether to get records in JSON or TXT format, they will be able to download via that method. But that makes the distinction between both methods a bit blurry...
- We can use an approach such as GBIF's: put a hard limit on the number of records retrievable via direct call to the search API, and suggest to use the download API for larger searches...
Again, just thinking out loud here...
Currently, the way counts are calculated implies retrieving the full list of records and then returning just the length of the array. This is highly inefficient (e.g. it took more than 2h to get the volume of records mentioning `mvz`).
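
To make the cost concrete, here is a small sketch contrasting two ways of counting when the index offers no count of its own. `fetch_page` is a hypothetical stand-in for a paged search call (it is not the GAE Search API): it takes a cursor and returns a page of records plus the next cursor, or `None` when there are no more pages. Counting page by page avoids holding every record in memory, but it still visits every record, so it would not by itself remove the latency problem described above.

```python
def count_by_materializing(fetch_page):
    """Collect every record, then take the length (the approach described above)."""
    records = []
    cursor = None
    while True:
        page, cursor = fetch_page(cursor)
        records.extend(page)
        if cursor is None:
            break
    return len(records)


def count_by_paging(fetch_page):
    """Keep only a running total; each page is discarded after it is counted."""
    total = 0
    cursor = None
    while True:
        page, cursor = fetch_page(cursor)
        total += len(page)
        if cursor is None:
            break
    return total


def make_fake_fetch(n_records, page_size):
    """Build an in-memory fetch_page over dummy records (for illustration only)."""
    data = list(range(n_records))

    def fetch_page(cursor):
        start = cursor or 0
        page = data[start:start + page_size]
        next_cursor = start + page_size
        # Signal the end of the result set with a None cursor.
        return page, (next_cursor if next_cursor < n_records else None)

    return fetch_page
```

Both functions return the same number for the same `fetch_page`; the difference is only in how much is kept in memory while counting.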