Open jsvd opened 7 years ago
About /_node/pipeline and /_node/stats/pipeline, the APIs should return the first registered "user facing" pipeline. By "user facing" I mean that we may, in the future, have internal pipelines that ship with Logstash and shouldn't be presented as "the default pipeline".
Also, we need to include an "id" field in the pipeline info/stats documents.
Wrt to /_node/stats/, the ideal scenario would be to replace the pipeline key with a pipelines array/object, but we must keep bwc. thoughts?
About /_node/pipeline and /_node/stats/pipeline, the APIs should return the first registered "user facing" pipeline.
I am +1 on this. It would preserve BWC. We should also extend this API to take a pipeline id. For example,
GET /_node/pipeline returns the first pipeline (main, for example).
Add pipeline ID in response.
"pipeline": {
"id": "main",
"workers": 4,
"batch_size": 125,
"batch_delay": 5,
"config_reload_automatic": false,
"config_reload_interval": 3
}
GET /_node/pipeline/:pipeline_id
Similar to above, but filtered by pipeline ID.
Similar to above for GET /_node/stats/pipeline.
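A minimal sketch of the lookup described above: no id means "first registered pipeline", an explicit id filters to that pipeline, and the id is included in the response. The `PIPELINES` hash and the `pipeline_info` helper are hypothetical stand-ins for illustration, not Logstash's actual API code.

```ruby
# Hypothetical in-memory registry keyed by pipeline id (insertion order = registration order).
PIPELINES = {
  "main"     => { "workers" => 4, "batch_size" => 125, "batch_delay" => 5 },
  "internal" => { "workers" => 1, "batch_size" => 125, "batch_delay" => 5 }
}

# Returns the info document for the requested pipeline, or for the first
# registered one when no id is given. Returns nil for an unknown id.
def pipeline_info(pipelines, pipeline_id = nil)
  id = pipeline_id || pipelines.keys.first
  info = pipelines[id]
  return nil if info.nil?

  # Put "id" first so the response carries the pipeline ID, as proposed.
  { "pipeline" => { "id" => id }.merge(info) }
end
```

Here `pipeline_info(PIPELINES)` would serve GET /_node/pipeline and `pipeline_info(PIPELINES, "internal")` would serve GET /_node/pipeline/:pipeline_id; an unknown id yields nil, which the API layer could map to a 404.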
Wrt to /_node/stats/, the ideal scenario would be to replace the pipeline key with a pipelines array/object, but we must keep bwc. thoughts?
@jsvd this metrics API was marked experimental for exactly this reason. At the time of 5.0, we knew about multipipelines, but we didn't know concretely how it would affect existing pipelines.
There is provision to break BWC here, but there are also tools such as https://github.com/consulthys/logstashbeat that rely on this structure.
Another option is to use GET /_node/stats/pipelines (note the plural pipelines here) and deprecate the singular one (to be dropped in 6.0). This could return a pipelines object which would be an array.
Ok, so:
GET /_node/pipeline - info on first registered pipeline (usually main)
GET /_node/pipeline/:pipeline_id - info on pipeline by this id
GET /_node/stats/pipeline - stats on first registered pipeline (usually main)
GET /_node/stats/pipeline/:pipeline_id - stats on pipeline by this id
Mark /_node/pipeline and /_node/stats/pipeline as deprecated (remove in 6.x)
Also add:
GET /_node/pipelines - info on all registered pipelines
GET /_node/pipelines/:pipeline_id - info on pipeline with this id
GET /_node/stats/pipelines - stats on all registered pipelines
GET /_node/stats/pipelines/:pipeline_id - stats on pipeline by this id
What is missing are the changes to the /_node/stats document. WRT the top-level document of node stats, should the pipelines/pipeline key (example here) list all pipelines and their stats, or a summary of all?
If it's a summary we'll have to drop the plugins key, and maybe add a last_reloaded_pipeline_id?
I am +1 on your proposed changes and on dropping the singular endpoint in 6.0.
I am in favor of having a summary, with an option to get the full details.
As a tool author, I would prefer to make only one call to the API to retrieve as much information as I can, but with plugins and multiple pipelines this output could get quite noisy.
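To make the "summary plus an option for full details" idea concrete, here is a rough sketch. The method name, the `full:` flag, and the field names (`events_out`, `reloads`, `details`) are all assumptions for illustration; nothing in this thread fixes the actual document shape.

```ruby
# Builds a node-level "pipelines" document from per-pipeline stats.
# By default only aggregate numbers are returned; with full: true the
# per-pipeline documents are attached as well, so one call gets everything.
def node_pipelines_stats(stats_by_id, full: false)
  doc = {
    "pipelines" => {
      "count"      => stats_by_id.size,
      "events_out" => stats_by_id.values.sum { |s| s.fetch("events_out", 0) },
      "reloads"    => stats_by_id.values.sum { |s| s.fetch("reloads", 0) }
    }
  }
  # Hypothetical "details" key; this is where the noisy plugin stats would live.
  doc["pipelines"]["details"] = stats_by_id if full
  doc
end
```

A tool author who wants everything in one request would ask for the full form; the default stays small even with many pipelines and plugins.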
Implementation-wise, in terms of what constitutes the "first registered pipeline", there are three options:
a) rely on Hash's enumerable to get a {}.first
b) create a new setting, metric or global value that gets set on the first call to Agent#register_pipeline
c) Explicitly sort the existing pipelines by some criteria and select the first
Option a) is certainly easier but it's flaky; I don't believe we should rely on the default ordering of keys in a Hash.
Option b) suggests a metric, since the new variable that holds the name of the first registered pipeline must be accessible in the agent (to be set) and in the api code (to be read).
As for option c), on the Agent side we could introduce a created_at timestamp and sort by that, but we need to include this value on the metric side, so the api commands can reach it. Another alternative is to order by name of the pipeline.
Since this touches the Agent/Pipeline/Metric/Api barriers that @ph is re-evaluating, any thoughts?
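For reference, options b) and c) could look roughly like this. `PipelineRegistry` and its methods are hypothetical names for the sake of the sketch; the real Agent#register_pipeline and the metric store are not shown, and how the value would be exposed to the api code is left open.

```ruby
# Hypothetical registry illustrating two ways to pick the "first" pipeline.
class PipelineRegistry
  # Option b): the id captured on the first call to register.
  attr_reader :first_registered_id

  def initialize
    @pipelines = {}
    @first_registered_id = nil
    @lock = Mutex.new
  end

  # Records the pipeline; only the very first registration sets the "first" id.
  def register(id, pipeline)
    @lock.synchronize do
      @first_registered_id ||= id
      @pipelines[id] = pipeline
    end
  end

  # Option c), name-based variant: sort explicitly instead of trusting storage order.
  def first_by_name
    @lock.synchronize { @pipelines.keys.min }
  end
end
```

The two strategies can disagree: registering "main" and then "alpha" gives "main" for option b) and "alpha" for the name-ordered variant of option c), so whichever is chosen should be documented.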
New version:
Mark /_node/pipeline and /_node/stats/pipeline as deprecated (remove in 6.x)
With multiple pipeline off:
GET /_node/pipeline - info on pipeline with id pipeline.id (default: main)
GET /_node/stats/pipeline - stats on pipeline with id pipeline.id (default: main)
With multiple pipeline on:
GET /_node/pipeline - redirects to /_node/pipelines
GET /_node/stats/pipeline - redirects to /_node/stats/pipelines
Per pipeline API:
GET /_node/pipeline/:pipeline_id - redirects to /_node/pipelines/:pipeline_id
GET /_node/stats/pipeline/:pipeline_id - redirects to /_node/stats/pipelines/:pipeline_id
New APIs:
GET /_node/pipelines - ???
GET /_node/pipelines/:pipeline_id - info on pipeline with this id
GET /_node/stats/pipelines - overall stats across all pipelines (total number of events, reloads, etc.)
GET /_node/stats/pipelines/:pipeline_id - stats on pipeline by this id
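The singular-to-plural redirects listed above amount to a small path rewrite. A sketch, assuming a hypothetical `plural_redirect` helper that the API layer would call before routing (how the redirect is actually issued, and whether it applies when multiple pipelines are off, is not covered here):

```ruby
# Maps a deprecated singular path to its plural replacement.
# Returns nil when the path is not one of the deprecated endpoints.
def plural_redirect(path)
  case path
  when %r{\A/_node/pipeline(/[^/]+)?\z}
    # $1 is the optional "/:pipeline_id" suffix (nil when absent).
    "/_node/pipelines#{$1}"
  when %r{\A/_node/stats/pipeline(/[^/]+)?\z}
    "/_node/stats/pipelines#{$1}"
  end
end
```

For example, `plural_redirect("/_node/stats/pipeline/main")` gives "/_node/stats/pipelines/main", while a path that is already plural returns nil and is served as-is.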
Multiple pipeline API support has landed in master, but we now need to add a deprecation path for these changes to the 5.x branch, so I'm leaving this issue open to track the 5.x bwc layer.
Currently, some of the apis assume only 1 pipeline exists. With the upcoming multiple pipeline feature, this needs to be addressed, with the caveat of keeping backwards compatibility.
Single Pipeline Assumptions:
1. Pipeline Info API
GET /_node/pipeline
2. Pipeline Stats API
GET /_node/stats/pipeline
3. Node Stats API
The node stats includes data from the pipeline stats api:
GET /_node/stats/pipeline