elastic / apm-server

https://www.elastic.co/guide/en/apm/guide/current/index.html
Other
1.22k stars 523 forks source link

Surface config parsing errors #14560

Open carsonip opened 6 days ago

carsonip commented 6 days ago

APM Server version (apm-server version): Confirmed on main

Description of the problem including expected versus actual behavior:

When a config with a wrong schema is passed to apm-server,


- via EA managed, config is passed to reloader, and error is apparently not surfaced, but beater/server will just silently not start.

**Steps to reproduce**:

Please include a *minimal* but *complete* recreation of the problem,
including server configuration, agent(s) used, etc.  The easier you make it
for us to reproduce it, the more likely that somebody will take the time to
look at it.

 1. Instead of setting apm-server.rum.response_headers to a list, set it to a string
 2. Observe the behavior under standalone and EA-managed
 3.

**Provide logs (if relevant)**:
carsonip commented 6 days ago

I haven't had the time to confirm the actual code path because one cannot simply run debugger on EA managed apm-server, but I suspect that the culprit is that the config parsing error is ignored by libbeat if it isn't a multierror.MultiError in https://github.com/elastic/beats/blob/38dfc52434ed2478b643d9b30f1a21c0e4c39603/x-pack/libbeat/management/managerV2.go#L836

inge4pres commented 6 days ago

When running in managed mode there's an elastic-agent-lib component that validates config for us, by parsing the YAML into a config.C struct. AFAIR a C.Validate() method can be implemented in our own types to specify rules applied when parsing the config. I am not 100% sure but there may be validate struct tags that can be added where the config type is defined

https://github.com/inge4pres/apm-server/blob/de7e8075d330c68d826a52613e13bcdf08d34620/internal/beater/config/config.go#L39

to include/exclude certain values.

simitt commented 2 days ago

@carsonip assigned you to this task as you already opened a draft PR. IMO this doesn't have highest urgency or impact, so leaving the timing for the fix up to you.