Closed lreuven closed 4 years ago
2. Expose the config metadata of each agent via .yaml file
Agents should generate or hard-code a JSON definition of which options are supported.
The schema would be like this:
[
{
"key": "the configuration key, without the elastic_apm prefix. For example `active`.",
"type": "String|URL|Boolean|Double|Integer|List|Enum|TimeDuration|ByteValue",
"enum": ["a list of values allowed for this option, for example", "TRACE", "DEBUG", "INFO", "WARN", "ERROR"],
"category": "A string describing which category this option belongs to, allows the UI to group options. For example Core, Reporter, Stacktrace, Logging, HTTP, Messaging",
"default": "The default value of this option as a string",
"tags": ["An optional array of tags for this option. For example:", "performance", "security"],
"since": "The version of the agent when this option was introduced. For example 1.2.3",
"description": "A text describing the semantics of this option",
"validation": {
"min": "the min value for options such as transaction_sampling_rate or profiling_sampling_interval (inclusive). May be numerical or a string, for example for time durations. Examples: `0`, `\"1ms\"`",
"max": "the max value (inclusive)",
"negativeMatch": false, // when set to true, the option must not be within the range (optional, default false)
"regex": "a regex pattern the option has to validate against"
}
},
...
]
When the agents send up this information on startup the UI can be rendered generically, based on the options each agent supports. As the UI doesn't hard-code the options, it can display new ones without having to update Kibana.
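As a rough illustration of how a generic consumer could work with such definitions, here is a minimal Python sketch. The sample options, their defaults, the "Uncategorized" fallback, and the function names are all assumptions made for the example. It only handles numeric min/max bounds (not string bounds such as "1ms") and assumes the regex must match the whole value.

```python
import re
from collections import defaultdict

# Hypothetical option definitions that follow the proposed schema.
# Keys, defaults and categories are illustrative, not taken from any agent.
OPTIONS = [
    {"key": "log_level", "type": "Enum", "category": "Logging", "default": "INFO",
     "enum": ["TRACE", "DEBUG", "INFO", "WARN", "ERROR"]},
    {"key": "transaction_sample_rate", "type": "Double", "category": "Core",
     "default": "1.0", "validation": {"min": 0, "max": 1}},
]


def group_by_category(options):
    """Group option keys by category so a UI could render them in sections."""
    grouped = defaultdict(list)
    for option in options:
        # "Uncategorized" is an assumed fallback, not part of the proposal.
        grouped[option.get("category", "Uncategorized")].append(option["key"])
    return dict(grouped)


def is_valid(option, value):
    """Check a string value against the option's enum and validation rules."""
    if "enum" in option and value not in option["enum"]:
        return False
    rules = option.get("validation", {})
    # Assumption: the regex has to match the whole value.
    if "regex" in rules and re.fullmatch(rules["regex"], value) is None:
        return False
    if "min" in rules or "max" in rules:
        try:
            number = float(value)
        except ValueError:
            return False
        lower = rules.get("min", float("-inf"))
        upper = rules.get("max", float("inf"))
        in_range = lower <= number <= upper  # both bounds are inclusive
        # negativeMatch=true means the value must NOT be within the range.
        return in_range != rules.get("negativeMatch", False)
    return True
```

For example, `is_valid(OPTIONS[0], "DEBUG")` is accepted while `is_valid(OPTIONS[1], "1.5")` is rejected, without the caller knowing anything about the individual options.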
Open questions
List<URL>, URL[], or List<WildcardMatcher>? Due to type erasure, that would currently be quite tricky in the Java agent. If there are no type arguments, the UI could fall back to List<String> so we could add it later as well.
Generated options from the Java agent
@elastic/apm-agent-devs WDYT? Anything missing from here? @dgieselaar is this definition of config options something you can work with? I guess all options would be indexed as individual documents and extended with the usual metadata (like ephemeral id and agent name).
To get a list of applicable options, you can do this:
key
WIP Java agent PR: https://github.com/elastic/apm-agent-java/pull/1046
Great stuff! Just a couple of questions/comments
Is since sent as a nice-to-have or do you see any use for it on the UI side already? Regarding since, the Python agent, and I assume most other agents, didn't really keep track of which version introduced which config option. git blame to the rescue, but it's tedious work. That's why I was wondering if we'd have an immediate benefit from it. Maybe it's also enough to set since to the version this feature will be released with for all existing config options, and then take it from there?
apm-server might need to add support for this (didn't look in detail), is there any target release?
One thing for sure we need is to know the subset of settings that RUM is going to apply.
apm-server might need to add support for this
Yes, I guess we need a separate endpoint for that. Agents would send their supported options on startup.
is there any target release?
It's scheduled for 7.7
One thing for sure we need is to know the subset of settings that RUM is going to apply.
IIRC, RUM is excluded for now.
@jalvz For RUM we don't have any plans to add more config options, so it should still be limited to what we support today (i.e. transactionSampleRate)
It's scheduled for 7.7
The 7.7 target was additional predefined settings. I think we should separate the agent provided configuration settings from that expansion and find the right target for that, likely post 7.7.
@felixbarny I was actually thinking the agent would just send up one document, with options as an array. I can imagine one document rather than many documents helps simplify things (for instance, we can do upserts on some kind of serialized id, and users can manage their data more easily). But I might be missing a good reason to send up one document per option. Here's what I was thinking:
We can then use a terms aggregation on agent.configuration.options.key, with a top_hits sub-aggregation, sorted by @timestamp:
GET apm-agent-configuration-options/_search
{
"size": 0,
"aggs": {
"options": {
"terms": {
"field": "agent.configuration.options.key",
"size": 100
},
"aggs": {
"sample_documents": {
"top_hits": {
"size": 1,
"sort": {
"@timestamp": "desc"
}
}
}
}
}
}
}
which would return:
Optionally we add service.name, service.version and agent.name terms queries to limit the result set. We can also use a more specific sorting algorithm for the top_hits aggregation, e.g. score documents that match more filters higher (similar to what we do when fetching agent configurations).
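A minimal Python sketch of how the optional filters could be combined with the aggregation above. The function name and parameters are hypothetical, and it uses single-value `term` clauses inside a `bool` filter as one plausible way to limit the result set. It only builds the request body and does not talk to Elasticsearch.

```python
def build_options_query(service_name=None, service_version=None, agent_name=None, size=100):
    """Build the search body for the latest definition of each option key."""
    candidates = (
        ("service.name", service_name),
        ("service.version", service_version),
        ("agent.name", agent_name),
    )
    # Only add a filter clause for the values the caller actually provided.
    filters = [{"term": {field: value}} for field, value in candidates if value is not None]

    body = {
        "size": 0,
        "aggs": {
            "options": {
                "terms": {"field": "agent.configuration.options.key", "size": size},
                "aggs": {
                    "sample_documents": {
                        # Keep only the most recent document per option key.
                        "top_hits": {"size": 1, "sort": {"@timestamp": "desc"}}
                    }
                },
            }
        },
    }
    if filters:
        body["query"] = {"bool": {"filter": filters}}
    return body
```

Calling `build_options_query(agent_name="java")` yields the same aggregation restricted to documents whose agent.name is java. Calling it without arguments reproduces the unfiltered request.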
cc @elastic/apm-ui for anyone who has additional ideas/thoughts.
Perhaps we should specify the types for each setting according to their mapping in ES?
API_REQUEST_SIZE is bytes. Should be string?
API_REQUEST_TIME is duration. Should be string?
Btw. CAPTURE_BODY default value is "off", not false afaik.
Since this PR is scheduled for 7.7 I assume it refers to hardcoding the settings in the UI. Making it possible for agents to specify settings on the fly is 7.8+ and perhaps belongs in another issue.
Both API_ values are strings, correct. Here's the Ruby agent docs: https://www.elastic.co/guide/en/apm/agent/ruby/current/configuration.html#config-api-request-size
The agents are aligned on the format.
Capture_body docs: https://www.elastic.co/guide/en/apm/agent/ruby/current/configuration.html#config-capture-body (the table is missing, weirdly?)
Given that most options can be set via ENV vars, they can be given as strings and will be converted (speaking only for the Ruby agent, but I expect the same from other agents?)
Because of this, I think you can type them however it would make the most sense to get the UI to work right.
Because of this, I think you can type them however it would make the most sense to get the UI to work right.
Would you prefer receiving API_REQUEST_TIME as 3600000 or "1h"? Similarly with API_REQUEST_SIZE: 3072 vs "3kb"?
I think the safe option is to always include the unit, but it's fine to always be the same scale 3600000ms or whatever. The agents could potentially react differently to plain numbers and the string format is what we have aligned on.
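To make the "always include the unit" idea concrete, here is a small Python sketch. The helper name `ensure_unit` and its default-unit parameter are hypothetical. It appends a fixed unit to bare numbers and leaves values that already carry a unit untouched.

```python
import re


def ensure_unit(value, default_unit):
    """Return the value as a string that always carries a unit.

    Bare integers (as int or str) get `default_unit` appended.
    Anything else is passed through unchanged.
    """
    text = str(value).strip()
    if re.fullmatch(r"-?\d+", text):
        return f"{text}{default_unit}"
    return text
```

With this, `ensure_unit(3600000, "ms")` gives `"3600000ms"`, while a value such as `"3kb"` is passed through as-is.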
Perhaps we should specify the types for each setting according to their mapping in ES?
The problem is that this doesn't specify whether something is a list or a scalar. If the UI knows something is a list, it might render the input differently. Also, as discussed, users should be able to enter time durations or byte values like 10s, 1mb and there should ideally be validation that fails when entering 1h (hours are not supported, only ms, s and m, the regex is ^(-)?(\d+)(ms|s|m)$), 1 ms (space before the unit is disallowed) or 1mib, for example (regex is ^(\d+)(b|kb|mb|gb)$). We could send up the regex in the validation rules but I think the UI should eventually know about these data types. When doing that, the validation messages could be made nicer than just stating it doesn't validate against a given regex and we could make special inputs, for example, a number input combined with a dropdown listing all the available units.
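A short Python sketch of what type-aware validation with friendlier messages might look like. The two patterns are the ones quoted above. The function names and the message wording are assumptions for illustration only.

```python
import re

# Patterns as quoted in the discussion above.
DURATION_PATTERN = re.compile(r"^(-)?(\d+)(ms|s|m)$")
BYTE_VALUE_PATTERN = re.compile(r"^(\d+)(b|kb|mb|gb)$")


def _validate(value, pattern, units, kind):
    """Return None when the value is valid, otherwise a readable message."""
    if pattern.fullmatch(value):
        return None
    if any(ch.isspace() for ch in value):
        return f"{kind} must not contain whitespace, e.g. write 1{units[0]} not 1 {units[0]}"
    return f"{kind} must be a whole number followed by one of: {', '.join(units)}"


def validate_duration(value):
    return _validate(value, DURATION_PATTERN, ["ms", "s", "m"], "Duration")


def validate_byte_value(value):
    return _validate(value, BYTE_VALUE_PATTERN, ["b", "kb", "mb", "gb"], "Byte value")
```

`validate_duration("10s")` and `validate_byte_value("1mb")` return `None`, whereas `"1h"`, `"1 ms"` and `"1mib"` each produce a message naming the supported units instead of exposing the raw regex.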
As for when which part of that should be targeted for which version, I leave that up to @sqren @graphaelli and @nehaduggal.
I think the safe option is to always include the unit
Not specifying the unit is possible with some agents but only for backwards-compatibility reasons. So always specifying the unit is the way to go to avoid ambiguities.
I was actually thinking the agent would just send up one document, with options as an array.
I seem to recall something around problems with aggregations on arrays. But I might be wrong or it may not be a problem in this case.
As for when which part of that should be targeted for which version, I leave that up to @sqren @graphaelli and @nehaduggal.
The plan is to add all of the above-mentioned settings for 7.7 (still hardcoded in the UI). Wrt the dynamic approach, this may be a while out (most likely not 7.8).
If the settings will be hardcoded in 7.7, we probably should remove or at least somehow mark the options that are specific to the Java agent.
we probably should remove or at least somehow mark the options that are specific to the Java agent.
The Java settings will only be displayed where applicable. Meaning: only if the user is creating a config for a Java service or has selected the "All" option (in this case all settings will be displayed).
btw. I agree: would be very nice if the above table noted which of the settings are Java specific
Is it possible to write directly into the configuration index or does the APM Server white-list certain options or expect them to be in a non-string data type?
Allowing that^ (in case it's currently not possible) might be enough for 7.7 (wdyt @nehaduggal ?). Then we can direct all eng resources towards making the UI dynamic for 7.8/7.9 without the "throwaway" work to statically support some more java specific options.
@felixbarny
Is it possible to write directly into the configuration index or does the APM Server white-list certain options or expect them to be in a non-string data type?
Yes. Previously there was a whitelist but now I've opened up the API on the Kibana-side so any string-based key/value pair is allowed:
{
"settings": {
"my_custom_java_setting": "true"
}
}
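Since only string-based key/value pairs are accepted, a client would need to stringify values before sending them. A minimal Python sketch of that step, where the helper name is hypothetical and only the settings object shown above is modelled (the rest of the request is not covered here):

```python
def to_settings_payload(settings):
    """Convert arbitrary setting values to the string-only form shown above."""
    def as_string(value):
        # Python renders booleans as "True"/"False", so map them explicitly.
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)

    return {"settings": {key: as_string(value) for key, value in settings.items()}}
```

For instance, `to_settings_payload({"my_custom_java_setting": True})` produces the same object as the example above.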
@jalvz Will this require changes on the APM Server side for the agents to be able to consume custom (non-whitelisted) options?
From apm-server perspective it should be fine
@lreuven I'm currently working on adding the new options to the UI, and have a few questions/favours to ask:
Is LOG_LEVEL of type string (free form) or enum?
Is the List type just comma-separated strings?
Thanks!
Yes. Previously there was a whitelist but now I've opened up the API on the Kibana-side so any string-based key/value pair is allowed:
Cool! What does the endpoint of that API look like? Is that a Kibana API or is it something to be written directly in the ES index? That's probably disallowed because it's a system index, right?
Cool! What does the endpoint of that API look like? Is that a Kibana API or is it something to be written directly in the ES index? That's probably disallowed because it's a system index, right?
Yes, it's a Kibana API and documented here. The data is written to a system index, yes. So superusers can access it directly but normal users will have to use the API.
Is the expected behavior for the ACTIVE config option that the agent stops itself when ACTIVE is changed from true to false in the central config?
IIRC, in https://github.com/elastic/apm/issues/92#issuecomment-519752096 we agreed to deprecate the active config and introduce enabled and recording as its replacement.
@lreuven Having the ENVIRONMENT setting available via remote configuration might be somewhat of a "chicken and egg" problem, because in the remote configuration protocol the backend uses the ENVIRONMENT assigned to the agent to decide which configuration to return to the agent.
Having the ENVIRONMENT setting available via remote configuration might be somewhat of a "chicken and egg" problem, because in the remote configuration protocol the backend uses the ENVIRONMENT assigned to the agent to decide which configuration to return to the agent. @SergeyKleyman
Agree, I assumed this was a mistake but thanks for bringing it up. From the UI (and APM Server) perspective, service.name and environment are conditions that are used to target a particular agent. They cannot be changed via remote configuration.
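To illustrate why these two fields act as targeting conditions rather than settings, here is a rough Python sketch of selecting a configuration for an agent. The document shape, the function name, and the "more specific match wins" scoring are assumptions for the example, not a description of the actual Kibana or APM Server logic.

```python
def pick_configuration(configs, service_name, environment):
    """Return the most specific configuration that targets the given agent.

    A config with no name/environment condition acts as a wildcard.
    Each matching condition adds one point; the first highest-scoring
    config wins.
    """
    best, best_score = None, -1
    for config in configs:
        target = config.get("service", {})
        score, matches = 0, True
        for field, actual in (("name", service_name), ("environment", environment)):
            expected = target.get(field)
            if expected is None:
                continue  # condition not set: applies to any value
            if expected != actual:
                matches = False
                break
            score += 1
        if matches and score > best_score:
            best, best_score = config, score
    return best
```

Because the lookup itself is keyed on service name and environment, a returned configuration that tried to change either of them would invalidate the very condition used to select it.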
I've talked to agent and Kibana devs and we still have a mismatch between what's in Kibana remote config for the agents and what the agents support.
Go
api_request_size
api_request_time
log_level
server_timeout
Java
trace_methods_duration_threshold
RUM ✅
Node.js
active
api_request_size
api_request_time
capture_headers
log_level
server_timeout
stack_trace_limit
In other words, only these should be included:
capture_body
transaction_max_spans
transaction_sample_rate
Python All set after this PR is merged (scheduled for 7.7): https://github.com/elastic/apm-agent-python/pull/778
.NET All set after a few config options are made dynamic (scheduled for 7.7) https://github.com/elastic/apm-agent-dotnet/issues/794
Ruby All set after this PR is merged (scheduled for 7.7): https://github.com/elastic/apm-agent-ruby/pull/741
@elastic/apm-agent-devs please prioritize manually testing central config via Kibana. As we are already post feature freeze, please do your tests by end of next week (April 3rd).
@felixbarny I've created a follow up issue for removing those options for the specific agents https://github.com/elastic/kibana/issues/61821
Python support is merged. @beniwohli will do the manual testing this week.
Tested with the Go agent, recording works. I realised I'm missing dynamic reloading support for a couple of other config attributes (span_frames_min_duration and stack_trace_limit), but I'll add them.
Also there's an issue with the unit selection for duration-type config: https://github.com/elastic/kibana/issues/62110
I've done manual testing with the Ruby agent for all options except for recording. We still have to implement recording/enabled, tracked in this issue: https://github.com/elastic/apm-agent-ruby/issues/623
@estolfo recording/enabled are not required for 7.7 so I think you can check off Ruby above.
The recording flag is required for 7.7. Enabled can land later as it's not a dynamic config and thus not available for central config.
Well, I think we're mostly there. I remember seeing a note about only Java being ready with recording for 7.7 but now I'm not sure where that note is... maybe I made it up? Anyway, this is almost done in Python as well.
Closing this out as it has outlived its purpose. Let's take any outstanding tasks to new issues.
Description of the issue
Until version 7.6, we exposed 3 agent config options: CAPTURE_BODY, IGNORE_URLS & TRANSACTION_MAX_SPANS (see here). From a UI implementation perspective, those options were static in the UI, and adding each new config option requires UI development work.
We would like to move to the next phase and expose most of the config options from the APM UI. Here is the list of configuration options which we plan to support (a unified cross-agent list).
There are a few phases to this issue:
1. Align config options in agents - we still have a few gaps between the agents, as can be seen here.
2. Expose the config metadata of each agent via .yaml file
3. Send the file to an endpoint
4. Consume & build the UI based on the file.