elastic / elastic-agent

Elastic Agent - single, unified way to add monitoring for logs, metrics, and other types of data to a host.
Other
17 stars 144 forks source link

[Synthetics] Elastic Agent v2 does not support `synthetics/browser` inputs #1807

Closed dominiqueclarke closed 1 year ago

dominiqueclarke commented 1 year ago

Elastic Agent v2 does not support synthetics/browser input. The below error is reported from logs when running agent with a Synthetics Integration policy with a synthetics/browser input

synthetics/browser-default","state":"Failed","message":"input not supported","inputs":[{"id":"synthetics/browser-default-synthetics/browser-synthetics-4d650cba-d6e4-4c69-8a2c-e0563e79811a-905dca20-6f3c-11ed-a396-1dc11abfa4ee-default","state":"Failed","message":"input not supported"}],"output":{"id":"synthetics/browser-default","state":"Failed","message":"input not supported"}},"ecs.version":"1.6.0"}

It appears that agent doesn't support this input in the heartbeat spec.

emilioalvap commented 1 year ago

Expanding info, it seems the input config is being mutated before being received by the beat. Eg, icmp integration:

- id: >-
      synthetics/icmp-synthetics-ccfdfd04-7c1c-4c17-93ef-f0d22ab0a1d5-905dca20-6f3c-11ed-a396-1dc11abfa4ee-default
    name: 8.8.8.8-test-default
    revision: 1
    type: synthetics/icmp
    use_output: default
    meta:
      package:
        name: synthetics
        version: 0.11.2
    data_stream:
      namespace: default
    package_policy_id: >-
      ccfdfd04-7c1c-4c17-93ef-f0d22ab0a1d5-905dca20-6f3c-11ed-a396-1dc11abfa4ee-default
    streams:
      - id: ccfdfd04-7c1c-4c17-93ef-f0d22ab0a1d5
        name: 8.8.8.8
        type: icmp
        enabled: true
        data_stream:
          dataset: icmp
          type: synthetics
        __ui: null
        origin: ui
        hosts: 8.8.8.8
        schedule: '@every 3m'
        wait: 1s
        timeout: 16s
        processors:
          - add_observer_metadata:
              geo:
                name: test
          - add_fields:
              target: ''
              fields:
                monitor.fleet_managed: true
                config_id: ccfdfd04-7c1c-4c17-93ef-f0d22ab0a1d5

Is reporting a permission error when trying to index to metrics-icmp-synthetics, which is not the expected data stream:

action [indices:admin/auto_create] is unauthorized for API key id [----------------] of user 
[elastic/fleet-server] on indices [metrics-icmp-default], this action is granted by the index 
privileges [auto_configure,create_index,manage,all]

And browser integration is reporting an additional error, even when the monitor seems to be running:

{"log.level":"error","@timestamp":"2022-11-28T20:33:53.865Z","log.logger":"centralmgmt","log.origin":
{"file.name":"cfgfile/list.go","file.line":109},"message":"Error creating runner from config: could not parse cfg
 for datastream error unpacking monitor plugin config: string value is not set 
accessing type'","service.name":"heartbeat","ecs.version":"1.6.0"}
...
action [indices:admin/auto_create] is unauthorized for API key id [2Avyv4QBqAV1KuJ8Mo5i] of user 
[elastic/fleet-server] on indices [metrics-browser-default], this action is granted by the index 
privileges [auto_configure,create_index,manage,all]\"}, dropping event!"

@cmacknz, @fearful-symmetry can you venture what the problem could be here? We haven't really modified the interface for V2 ( and didn't expect having to) and it's quite weird to see metrics-icmp-synthetics as the target stream.

cmacknz commented 1 year ago

It seems the input config is being mutated before being received by the beat

Just to clarify, there are two places this could be happening.

  1. In the Beat control protocol client when it receives configurations.
  2. In the agent itself when it sends configurations.

Were you able to observe the problem specifically being introduced at either of these two points to help us narrow down where to look? Or is that still to be confirmed?

cmacknz commented 1 year ago

@michalpristas can you investigate why we are using the metrics-icmp-default datastream when we should be configuring synthetics-icmp-default? https://github.com/elastic/elastic-agent/issues/1807#issuecomment-1329748291

emilioalvap commented 1 year ago

Were you able to observe the problem specifically being introduced at either of these two points to help us narrow down where to look? Or is that still to be confirmed?

Didn't have enough time to dive into it yesterday, will try to figure it out today, I'll keep you posted

cmacknz commented 1 year ago

Made an attempt to reproduce this, managed to create an HTTP monitor via the integration without issue. It's not clear to me how to create an ICMP monitor. Is this supported though the agent integration or do we need to test this using standalone agents using a ICMP input like the one you posted in https://github.com/elastic/elastic-agent/issues/1807#issuecomment-1329748291?

fearful-symmetry commented 1 year ago

So I can solve a part of this right off the bat: looking at the elastic-agent repo now, there's no synthetics/browser input type in specs/heartbeat.yml, which is where the input not supported error is coming from.

The Error creating runner from config is a little more interesting. It's coming from heartbeat/monitors/stdfields/stdfields.go and I think it's from the validator sub-component in our config unmarshalling. This suggests the config in question is missing either the type or schedule fields.

cmacknz commented 1 year ago

We have another bug that we suspect is caused by the input type missing in the configuration: https://github.com/elastic/elastic-agent/issues/1800#issuecomment-1330017600

Not sure where it is getting dropped from yet.

cmacknz commented 1 year ago

The missing synthetics input has been solved already https://github.com/elastic/elastic-agent/pull/1812

michalpristas commented 1 year ago

havent looked into detail so far but having

func heartbeatCfg(rawIn *proto.UnitExpectedConfig, agentInfo *client.AgentInfo) ([]*reload.ConfigWithMeta, error) {
    //grab and properly format the input streams
    inputStreams, err := management.CreateInputsFromStreams(rawIn, "metrics", agentInfo)

hardcoded metrics what is then used

streamSource = injectIndexStream(raw, inputType, stream, streamSource)

but this should respect existing data_stream.type of config. maybe we just need to properly prepopulate. i will debug this tomorrow

fearful-symmetry commented 1 year ago

@emilioalvap okay, I think I can see the issue here, looking at the config @emilioalvap posted.

This error seems to be smoking gun: on indices [metrics-icmp-default]

The issue is that we're reading the stream data from the input level data_stream, which will result in the index metrics-icmp-default. If we read from the data_stream at the stream level, we'd get synthetics-icmp-default

fearful-symmetry commented 1 year ago

Trying to parse 10 different variables that all contain the word stream or input has shortwired my brain. @cmacknz / @emilioalvap can someone confirm that synthetics-icmp-default is the correct index name? If so, the fix is a one-liner.

fearful-symmetry commented 1 year ago

However, this does not fix the Error creating runner from config message, which I'm still baffled by.

cmacknz commented 1 year ago

synthetics-icmp-default is correct, we want the stream level data_stream configuration.

Error creating runner from config is a separate problem I think, we see the same thing with Filebeat where it is likely that the input type is missing or wrong: https://github.com/elastic/elastic-agent/issues/1800#issuecomment-1329955301

michalpristas commented 1 year ago

input type was missing in stream itself but was present at parent input definition ( for configs i was dealing with) i am also puzzled by naming, we have many streams inputs and units

cmacknz commented 1 year ago

However, this does not fix the Error creating runner from config message, which I'm still baffled by.

I think this (and https://github.com/elastic/elastic-agent/issues/1800) might be because we are not correctly looking at the top level input type for many of the Beats.

If we look at a sample configuration of a winlog input with several streams it has a top level type: winlog entry that describes the input type the agent should start.

Screen Shot 2022-12-01 at 2 09 21 PM

In the agent spec file, the winlog input is listed in the Filebeat spec file so the agent will start an instance of Filebeat to run it: https://github.com/elastic/elastic-agent/blob/982ae7dd4f3d1d0e5cf317fc3b1f2c9b4cb138c5/specs/filebeat.spec.yml#L189

On the Beat side in Filebeat we then hit this function which doesn't seem to look at the top level input type field at all, thus omitting it and causing Filebeat to configure an instance of the log input by default since it was omitted, but with the winlog configuration syntax.

func filebeatCfg(rawIn *proto.UnitExpectedConfig, agentInfo *client.AgentInfo) ([]*reload.ConfigWithMeta, error) {
    modules, err := management.CreateInputsFromStreams(rawIn, "logs", agentInfo)
    if err != nil {
        return nil, fmt.Errorf("error creating input list from raw expected config: %w", err)
    }

    // format for the reloadable list needed bythe cm.Reload() method
    configList, err := management.CreateReloadConfigFromInputs(modules)
    if err != nil {
        return nil, fmt.Errorf("error creating config for reloader: %w", err)
    }
    return configList, nil
}

This is in constrast to Metricbeat and Auditbeat which do look at the Type field to decide which module to start:

https://github.com/elastic/beats/blob/2a09bdd429d6fcb2d5f2a36b5ed1817f8e60214e/x-pack/metricbeat/cmd/agent.go#L17-L37:

    // Extract the module name from the type, usually in the form system/metric
    module := strings.Split(rawIn.Type, "/")[0] // <--- Correctly looking at the input type

    for iter := range modules {
        modules[iter]["module"] = module
    }

    // format for the reloadable list needed bythe cm.Reload() method
    configList, err := management.CreateReloadConfigFromInputs(modules)
cmacknz commented 1 year ago

I think packetbeat and heartbeat have the same or similar problems, but in the case of packetbeat it might be defaulting to the correct output when it is omitted. I think if we were mostly testing Filebeat with the log and filestream inputs it would work, with the caveat that instances of the filestream input are actually running the log input.

That may explain how we have only noticed this when starting to run other Filebeat inputs from integrations.

fearful-symmetry commented 1 year ago

Okay, spent a few hours working on a fix, there's something else going on here @cmacknz

Using the ICMP config posted above, the config worked fine, and I had to manually comment out the stream-level type item to reproduce the error posted earlier. This seems to suggest that heartbeat is parsing the config just fine, and there's something special about the browser integration. I can't get fleet configurator to generate one for me, as there seems to be an issue where it can't detect that it's running under agent, and keeps asking me to manually install hearbeat. @emilioalvap can you post the full browser fleet config?

The winlog config that @cmacknz posted appears to be in a similar situation; the winlog input has no type field at the stream level.

I can fix these individual issues, but this seems to raise a broader issue of why the presence of that stream-level type field is so inconsistent.

fearful-symmetry commented 1 year ago

It's not helping that the type fields aren't even consistent.

fearful-symmetry commented 1 year ago

Alright, a fix is here: https://github.com/elastic/beats/pull/33921

cmacknz commented 1 year ago

Closing, this should be resolved in BC6.

shahzad31 commented 1 year ago

@cmacknz @fearful-symmetry i just tested it on 8.7.0-SNAPSHOT, via elastic-package, i don't think it's fixed, adding a browser policy seems to make agent unhealthy, though i can't see any logs

shahzad31 commented 1 year ago

Ok false alarm, agent went unhealthy but it works

emilioalvap commented 1 year ago

@cmacknz @fearful-symmetry browser monitor data is segmented into data streams with different retention periods. These are reflected on the integration policy and assigned during runtime:

streams:
    - id: ab70b5e3-4932-41c1-855f-7df4ce33f143
      data_stream:
        dataset: browser
        type: synthetics
      ...
    - id: >-
        synthetics/browser-browser.network-ab70b5e3-4932-41c1-855f-7df4ce33f143-0973e900-7a35-11ed-9acb-b3da42cd0b97-default
      data_stream:
        dataset: browser.network
        type: synthetics
      ...
    - id: >-
        synthetics/browser-browser.screenshot-ab70b5e3-4932-41c1-855f-7df4ce33f143-0973e900-7a35-11ed-9acb-b3da42cd0b97-default
      data_stream:
        dataset: browser.screenshot
        type: synthetics

This used to work previously to V2, but it seems to be overwritten to a unique data stream now (synthetics-browser-default). Any thoughts as to why this is happening?

cmacknz commented 1 year ago

@emilioalvap can you open a new issue for this and attach the entire agent policy or ideally the archive generated by elastic-agent diagnostics collect while this is happening?

emilioalvap commented 1 year ago

Created a new issue with the relevant info: https://github.com/elastic/elastic-agent/issues/1931