Closed splunkericl closed 1 year ago
I am picking up this work and hoping to get some guidance. I am thinking of two approaches for the part to emit observability data for the heartbeat:
@atoulme @dmitryax do you have any advices on the approaches?
The target audience of the contrib components is the end users. Let's keep it that way end expose only user-configurable options, not programmatic. So let's add the observability data right in the exporter.
The exporter already has health_check_enabled
config option. It's currently triggered only during the start time, but I believe it can be enhanced to make such periodic calls and emit additional metrics based on that. If the option is disabled, which should be by default, no additional heartbeat observability metrics are emitted.
cc @wojtekzyla who introduced that option
The health_check_enabled
performs a HTTP check against /collector/event/health
, and has a different function as it captures the health of the remote counterpart. Here, we want to send the status of the collector to the remote counterpart on a regular basis.
Agree on starting from the user config first.
I would recommend we look at two additional config entries under a heartbeat key with the following defaults:
heartbeat:
enabled: false
interval: 30s
We generate the log message to send based off component.BuildInfo
which is passed in when creating the exporter.
Here, we want to send the status of the collector to the remote counterpart on a regular basis.
Is that the case? IFAIU, the goal is still to validate the destination, according to @splunkericl, no?
EP adds the ability to heartbeat the destination periodically to check the validity of the destination
I'd like to reuse the configuration field for something serving the same purpose. It's better to avoid adding new fields if they are not strictly necessary to reduce the mental load on the end user
Yeah the destination still needs to be validated. So if I understand correctly, we can:
healthcheckInterval
option in the config
validateTokenOnHealthcheck
:
You're using interchangeably healthcheck and heartbeat. This is creating confusion. These are not the same thing.
if it is 0 or not set, the health check will happen once during start up No, this would be a change of the behavior of the exporter. The exporter already has a health check configuration option on start.
There are some similarities between healthcheck
and heartbeat
. But if we want to define them clearly:
healthcheck
is the indicator whether the destination stack is up through a health endpointheartbeat
is whether exporter can send dummy data using provided token to destination stack periodically, which also requires destination stack to be up.So if it makes sense, we can put the heartbeat
functionality into the healthcheck
to reduce the number of functionalities end user needs to understand.
These are distinct concepts, and we'd want to toggle them on and off separately in my opinion. If anything, we will confuse users if we conflate those.
It doesn't look like heartbeat data is dummy data to me. It looks like diagnostic data sent to upstream so that operations can check that the collector is reporting in properly, even when no data is being sent.
With that, my definition of heartbeat
is:
heartbeat
sends periodical diagnostic updates indicating the health, version and information about the environment of the collector to the endpoint so it may be collected for diagnostics.That is fair. I think this definition makes sense and we should separate the configuration between healthcheck
and heartbeat
then.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers
. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Closing as done
Is your feature request related to a problem? Please describe. In Edge Processor(EP), it is possible the EP of a customer would have very low traffic for some periods of time. EP adds the ability to heartbeat the destination periodically to check the validity of the destination. For example, the S3 exporter in EP periodically calls listObjects to validate the credentials are still valid.
One of the requirement to incorporate SplunkHECExporter into Edge Processor(EP) is to adds this heart-beating ability to SplunkHECExporter and adds observability data on top of it.
Describe the solution you'd like Add an configuration option to allow heartbeating in
SplunkHECExporter
. If enabled the exporter would spawn a go routine that periodically sends dummy info to the destination. The dummy info can be similar to how a splunk forwarder is sending:_internal
index so it would work out of the box.Describe alternatives you've considered We can leverage the heartbeat information by checking how many events were sent successfully. However, this doesn't address the users that won't have data sent for a while.