Open naveenchlsn opened 4 years ago
Hi Naveen, I'm not aware of a way to detect webhook failures without diagnosing the logs.
Hi @tidwall, thanks for your response. We turned on verbose logging and can see the following issue:
Description Tile38 is dropping hooks. The count of ingested locations does not equal the count of hooks received. We expect at least one callback per ingestion, since we subscribe to inside/outside events as well.
To Reproduce
SETHOOK <key> <grpcEndpoint> EX 10800 WITHIN <key> FENCE OBJECT <geometry>
SET <key> <someId> EX 10800 POINT <lat> <lng>
We create one hook for each key (our internal geofenceId) and track one object entering or leaving the location. The grpcEndpoint is the same for all keys.
Expected behavior A callback with an inside/outside event for each location ingested for a key.
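To put a number on the gap, the ingested locations can be compared with the callbacks actually received. This is only a minimal sketch: it assumes both sides can be recorded as `(key, object_id)` pairs, and `missing_callbacks` is a hypothetical helper of ours, not something Tile38 provides.

```python
from collections import Counter

def missing_callbacks(ingested, received):
    """Return how many callbacks are missing per (key, object_id) pair.

    ingested: iterable of (key, object_id) pairs, one per SET issued.
    received: iterable of (key, object_id) pairs, one per callback received.
    Counter subtraction keeps only the pairs with a positive shortfall.
    """
    return Counter(ingested) - Counter(received)

if __name__ == "__main__":
    # Hypothetical sample data: two SETs on fence1, one on fence2,
    # but only one callback arrived.
    ingested = [("fence1", "a"), ("fence1", "a"), ("fence2", "b")]
    received = [("fence1", "a")]
    for pair, shortfall in missing_callbacks(ingested, received).items():
        print(pair, shortfall)
```

An empty result means every ingestion was matched by at least as many callbacks.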
Logs
"log":"2020/08/26 12:04:57 [DEBU] queued hook: 850122\n",
"log":"2020/08/26 12:54:14 [DEBU] purged hook
Operating System :
Additional context Key cardinality: we issue SETHOOK at about 10 QPS, each with a unique key, and SET at about 500 QPS.
It looks like hooks are being queued and then purged after some time.
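Until a built-in metric exists, one stopgap is to count the `queued hook` and `purged hook` debug lines and export the counts to whatever alerting system is in use. The sketch below assumes the log lines look like the two samples in the Logs section; because the `purged hook` sample is truncated, the pattern matches only the prefix. `count_hook_events` is a hypothetical helper, not part of Tile38.

```python
import re
from collections import Counter

# Matches the debug lines shown in the Logs section, e.g.
# "[DEBU] queued hook: 850122". Only the "queued hook" / "purged hook"
# prefix is assumed; the rest of the line is ignored.
HOOK_EVENT = re.compile(r"\[DEBU\] (queued|purged) hook")

def count_hook_events(lines):
    """Count queued and purged hook log lines in an iterable of strings."""
    counts = Counter({"queued": 0, "purged": 0})
    for line in lines:
        match = HOOK_EVENT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

if __name__ == "__main__":
    sample = [
        '"log":"2020/08/26 12:04:57 [DEBU] queued hook: 850122\\n",',
        '"log":"2020/08/26 12:54:14 [DEBU] purged hook',
    ]
    counts = count_hook_events(sample)
    print(counts["queued"], counts["purged"])
```

Whether a purged hook always corresponds to a dropped callback is something we have not confirmed, so the purged count should be treated as a signal to investigate rather than an exact drop count.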
Hi,
We want to know whether Tile38 emits any metrics that identify webhooks that fail to trigger.
We are using Tile38's geofence feature in production at high throughput. We have noticed that the number of webhooks fired does not match the expected number of hook calls received. CPU and memory usage look fine, but we could not find any metrics on which to set up alerts if Tile38 drops webhooks.
We have looked at the STATS and SERVER commands, but they don't seem to provide the data we need for the geofence use case.
Please let us know if there are any other commands we can use to get these metrics. I am aware we can use logs, as mentioned in this issue, but would like to know if there is a way to emit metrics so that alerts can be set up in production.
Thanks, Naveen