Closed sjparkinson closed 1 year ago
Source type is heroku
from the HEC token configuration.
1065 <190>1 2022-09-06T11:42:21.414765+00:00 host app web.6 - {"event":"TOPIC_TRACKER_ENDPOINT_ERROR","route":"/article/:articleId","url":"/article/b8998b8e-2bbb-4f7a-8645-8d8a2e51a29c?numberOfArticles=8&source=app","method":"GET","error":{"message":"network timeout at: https://api.getcrux.io/article_score?user_id=...&article_id=b8998b8e-2bbb-4f7a-8645-8d8a2e51a29c&topic_id=c91b1fad-1097-468b-be82-9a8ff717d54c&num_recs=8&is_rec=false&attempt=0&source=app","stack":"FetchError: network timeout at: https://api.getcrux.io/article_score?user_id=...&article_id=b8998b8e-2bbb-4f7a-8645-8d8a2e51a29c&topic_id=c91b1fad-1097-468b-be82-9a8ff717d54c&num_recs=8&is_rec=false&attempt=0&source=app\n at Timeout.<anonymous> (/app/node_modules/@financial-times/n-express/node_modules/isomorphic-fetch/node_modules/node-fetch/lib/index.js:1484:13)\n at listOnTimeout (node:internal/timers:559:17)\n at processTimers (node:internal/timers:502:7)"},"level":"error","message":""}
SEDCMD
rules appliedSource type should be heroku:app
after TRANSFORMS-source_type
is applied (after the SEDCMD
rules are applied).
{"event":"TOPIC_TRACKER_ENDPOINT_ERROR","route":"/article/:articleId","url":"/article/b8998b8e-2bbb-4f7a-8645-8d8a2e51a29c?numberOfArticles=8&source=app","method":"GET","error":{"message":"network timeout at: https://api.getcrux.io/article_score?user_id=...&article_id=b8998b8e-2bbb-4f7a-8645-8d8a2e51a29c&topic_id=c91b1fad-1097-468b-be82-9a8ff717d54c&num_recs=8&is_rec=false&attempt=0&source=app","stack":"FetchError: network timeout at: https://api.getcrux.io/article_score?user_id=...&article_id=b8998b8e-2bbb-4f7a-8645-8d8a2e51a29c&topic_id=c91b1fad-1097-468b-be82-9a8ff717d54c&num_recs=8&is_rec=false&attempt=0&source=app\n at Timeout.<anonymous> (/app/node_modules/@financial-times/n-express/node_modules/isomorphic-fetch/node_modules/node-fetch/lib/index.js:1484:13)\n at listOnTimeout (node:internal/timers:559:17)\n at processTimers (node:internal/timers:502:7)"},"level":"error","message":""}
Source type is heroku
from the HEC token configuration.
278 <158>1 2022-09-06T11:42:21.397534+00:00 host heroku router - at=info method=HEAD path="/__gtg" host=specialist-preflight-us.herokuapp.com request_id=7f748199-e753-4b87-bd2f-a38f181d3575 fwd="10.0.0.0" dyno=web.1 connect=0ms service=10ms status=200 bytes=271 protocol=https
SEDCMD
rules appliedSource type should be heroku:router
after TRANSFORMS-source_type
is applied (after the SEDCMD
rules are applied).
2022-09-06T11:42:21.397534+00:00 heroku router - at=info method=HEAD path="/__gtg" host=specialist-preflight-us.herokuapp.com request_id=7f748199-e753-4b87-bd2f-a38f181d3575 fwd="10.0.0.0" dyno=web.1 connect=0ms service=10ms status=200 bytes=271 protocol=https
Dug into this further with Splunk on a call but didn't make much progress.
A good recent time to review is 15th September 2022 at 12–1pm, for the next-ads-api
system.
We're going to continue to monitor this and review if there will be any logs in the _internal
index to help debug why SEDCMD
on the source type is not getting applied.
The HEC monitoring console dashboard also suggests there are no issues with the data coming in from Heroku to the HTTP Event Collector endpoint.
Closing this as we've not seen issues since the 15th September 2022, and might have been resolved as part of our instance's routine maintenance.
Curiously we saw a burst of unprocessed events on 29th September again, but none since.
That day after observing the issue I raised the following changes, which might have fixed it, but not clear which. I'd bet on it being something similar to https://github.com/Financial-Times/splunk-heroku/issues/56.
Marking as closed completed, but will continue to monitor.
We are observing some events retain the
heroku
source type because theSEDCMD
rules are not getting applied, and subsequently the source type transforms are not applied as the regex won't match the non-seded events.https://github.com/Financial-Times/splunk-heroku/blob/0824fbb1ed02f75226a8efde77c4cb6bc1008ed6/app/default/props.conf#L10-L14
We've used regex101 to confirm that the regexes are correct and should be applied to these events.
My suspicion is that it's a event volume thing, something to do with the HEC endpoint perhaps.
It might also be something to do with where the app's configuration is installed on the cluster, it must be on the indexers (we don't control this from my understanding but Splunk will have a look at our instance).