elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Ingest Manager] Add support for multiple Kibana hosts. #72731

Closed: ph closed this issue 4 years ago

ph commented 4 years ago

Summary of the problem: The Elastic Agent should be able to connect to more than one Fleet (Kibana) host.

User Stories

The Elastic Agent already defines a place for this in its configuration:


fleet:
  agent:
    id: fleet-agent-id
  access_api_key: VuaCfGcBCdbkQm-e5aOx:ui2lp2axTNmsyakw9tvNnw
  kibana:
    protocol: https
    hosts: [localhost:5601]
    timeout: 30s

elasticmachine commented 4 years ago

Pinging @elastic/ingest-management (Team:Ingest Management)

ph commented 4 years ago

How would path and protocol match all the hosts if we have more than one? That file seems to get created and populated from the host I pass into the enroll command.

@neptunian I think we should simplify the case here and enforce some rules.

Case 1: Only IPs are different. Result: VALID

KibanaA => https://192.168.1.2/ok
KibanaB => https://192.168.1.3/ok
KibanaC => https://192.168.1.4/ok

kibana:
    protocol: https
    hosts: [192.168.1.2, 192.168.1.3, 192.168.1.4]
    path: ok
    timeout: 30s

Case 2: Ports can be different. Result: VALID

KibanaA => https://192.168.1.2:5551/ok
KibanaB => https://192.168.1.3:5552/ok
KibanaC => https://192.168.1.4:5553/ok

kibana:
    protocol: https
    hosts: [192.168.1.2:5551, 192.168.1.3:5552, 192.168.1.4:5553]
    path: ok
    timeout: 30s

Case 3: Protocols are different. Result: INVALID

KibanaA => https://192.168.1.2:5551/ok
KibanaB => http://192.168.1.3:5552/ok (TLS IS NOT ENABLED)
KibanaC => https://192.168.1.4:5553/ok

This case should fail: one of the nodes doesn't use the same protocol as the others.


Only the HOSTS can be different; everything else should be constant.

I've double-checked the Beats implementation and it works the same way: we round-robin across the hosts.
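
A minimal sketch of that validation in TypeScript (a hypothetical helper, not the actual Fleet code), assuming full URLs are parsed and everything except host and port must match:

// Sketch: ensure only the host (and port) varies across Kibana URLs.
// Throws on Case 3 (mixed protocols) and on mismatched paths.
function validateKibanaUrls(urls: string[]): string[] {
  const parsed = urls.map((u) => new URL(u));
  const [first, ...rest] = parsed;
  for (const p of rest) {
    if (p.protocol !== first.protocol) {
      throw new Error(`mixed protocols: ${first.protocol} vs ${p.protocol}`);
    }
    if (p.pathname !== first.pathname) {
      throw new Error(`mixed paths: ${first.pathname} vs ${p.pathname}`);
    }
  }
  // Return host:port pairs; ports are allowed to differ (Case 2).
  return parsed.map((p) => p.host);
}

// Case 3 above would throw:
// validateKibanaUrls(['https://192.168.1.2:5551/ok', 'http://192.168.1.3:5552/ok']);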

neptunian commented 4 years ago

Are we wanting to enforce these rules in the UI? And what if they've already entered them into the kibana config yaml file? Should we validate that as well for these rules?

A separate question. The elasticsearch output is a combo box, allowing multiple urls:

(screenshot: the Elasticsearch output field, a combo box accepting multiple URLs)

But it seems the kibana config xpack.ingestManager.fleet.elasticsearch.host does not support an array of urls. Any reason for this?

ruflin commented 4 years ago

We must enforce this rule at the API level. But I would also be fine with not having the rules at first and only allowing a list, to move this forward.

For the Elasticsearch URL, I had a conversation with @nchaulet in the past about it being confusing that it can be configured in two places. We agree: there should be only one place for this config, and the other one should be removed. My personal preference is to have it in the UI, as that allows making changes without requiring a restart of Kibana. This becomes especially useful when more Kibana instances are added or, in the case of Elasticsearch, if we support multiple Elasticsearch outputs in the future. If the user has to restart Kibana every time such a change is made, I don't think that is what they expect.

@ph Let's agree on the path forward for the above in a separate issue and make a call for 7.10.

neptunian commented 4 years ago

@ruflin

For the Elasticsearch URL, I had a conversation with @nchaulet in the past about it being confusing that it can be configured in two places. We agree: there should be only one place for this config, and the other one should be removed.

This is the same case for the Kibana URLs. They can be configured in the kibana yaml config file or in the UI, but the yaml file only works the first time.

This becomes especially useful when more Kibana instances are added or, in the case of Elasticsearch, if we support multiple Elasticsearch outputs in the future.

Do we not currently support multiple elasticsearch outputs? Why do we allow them to enter in multiple urls otherwise?

If the user has to restart Kibana every time such a change is made, I don't think that is what they expect.

If you change the yaml config URLs and restart Kibana, they are ignored. It only works the first time (before we add a default URL during setup), which is intended, as @ph informed me; it's meant to function as a bootstrap.

neptunian commented 4 years ago

@ph can you update the description to replace the general word configuration with something more descriptive? Like "agent policy" ?

nchaulet commented 4 years ago

Do we not currently support multiple elasticsearch outputs? Why do we allow them to enter in multiple urls otherwise?

We only support one Elasticsearch cluster output, but you can send data to different nodes.

ph commented 4 years ago

@ruflin is the following what you had in mind:

neptunian commented 4 years ago

@ph For user story "Fleet should give the list of host to connect to the Elastic Agent in the configuration", @nchaulet communicated to me that this currently only happens through the CLI when the user enrolls an agent. To clarify, is this user story for the CLI to support a list of urls? Or are we wanting to communicate the list of kibana urls a different way?

ruflin commented 4 years ago

@ph Your summary is correct. I think it is odd that a config is only used for bootstrapping.

One thing that bugs me about removing the config completely is the use case where someone deploys Kibana with, let's say, Puppet and wants to do all the setup via the config file, which would then not be possible. At the same time, to fully set up and prepare Ingest Management, API calls are needed anyway.

neptunian commented 4 years ago

@ph @ruflin In order to trigger the CONFIG_CHANGE action I would need to update the agent configuration SO and bump the revision. I would need to add a new attribute kibana_url to the agent config to update. My concern is that every time the settings are saved I would need to go through and update all the agent configurations. Is there a concern in the scenario where there are many agent configs? I don't think we currently have a use case where we update all agent configurations because of a single global settings change. Since the kibana_url is always the same for every agent config, perhaps the agent configuration is not the best place for it in the longer term, and a new action is needed for when settings are updated?

jen-huang commented 4 years ago

Jumping in here because I'm nosy ☺️

In order to trigger the CONFIG_CHANGE action I would need to update the agent configuration SO and bump the revision. I would need to add a new attribute kibana_url to the agent config to update. My concern is that every time the settings are saved I would need to go through and update all the agent configurations.

I don't think we need to add a new kibana_url attribute to the agent policy object. The full agent YAML is generated using the agent policy SO, but we also add fields to it at the time of generation (e.g. the outputs block): LOC

What do you think of:

1. Add a similar kibana block here at the time of agent YAML generation (see the sketch below).
2. When Kibana host settings are changed, bump the agent policy revision without actually changing any fields, thus triggering the CONFIG_CHANGE action. I think this can be done by calling agentPolicyService.update() with an empty object for the agentPolicy param.
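
A sketch of step 1, assuming a hypothetical getSettings() accessor and field names (this is not the actual Fleet code):

import type { SavedObjectsClientContract } from 'src/core/server';

// Hypothetical settings accessor; the real settings service may differ.
declare function getSettings(
  soClient: SavedObjectsClientContract
): Promise<{ kibana_urls: string[] }>;

// Add a fleet.kibana block alongside the other fields (outputs, inputs)
// that are already attached when the full agent policy is generated.
async function withFleetBlock(
  soClient: SavedObjectsClientContract,
  generatedPolicy: Record<string, unknown>
) {
  const settings = await getSettings(soClient);
  return {
    ...generatedPolicy,
    fleet: {
      kibana: {
        protocol: 'https',
        hosts: settings.kibana_urls, // e.g. ['localhost:5601']
      },
    },
  };
}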

The caveat in our current agent policy service implementation is that we can only update/bump revision of a policy one at a time, which is probably not performant. This can be done better by implementing bulkUpdate() and _bulkUpdate() that leverage soClient.bulkUpdate, or overloading the existing update/_update() methods to accept an array of IDs.

AFAIK soClient.bulkUpdate can't perform updates by query :( So you will first need to retrieve a list of all agent policy IDs. I am currently running into similar problems with work on implementing bulk actions.
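
A sketch of the bulk revision bump using soClient.find plus soClient.bulkUpdate (pagination and error handling omitted for brevity; the SO type name is the one mentioned later in this thread):

import type { SavedObjectsClientContract } from 'src/core/server';

const POLICY_SO_TYPE = 'ingest-agent-policies';

// Sketch: read every agent policy's current revision, then bump them all
// in a single bulkUpdate call. Assumes under 1000 policies; page otherwise.
async function bumpAllPolicyRevisions(soClient: SavedObjectsClientContract) {
  const { saved_objects: policies } = await soClient.find<{ revision: number }>({
    type: POLICY_SO_TYPE,
    perPage: 1000,
  });
  await soClient.bulkUpdate(
    policies.map((p) => ({
      type: POLICY_SO_TYPE,
      id: p.id,
      attributes: { revision: p.attributes.revision + 1 },
    }))
  );
}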

neptunian commented 4 years ago

Thanks @jen-huang! If we're okay with bumping revisions without actually changing fields, this sounds good to me. It is too bad, though, that we need to read the current revision of each saved object policy in order to bump it, so I would need to fetch all the policy saved objects and locally update each policy's revision before applying the bulkUpdate. It seems a bit hacky. There is also the possibility of errors when updating all of these agent policies from the settings endpoint / saving settings. Is having a new action that observes the Settings too much work?

ph commented 4 years ago

I would try as much as possible to limit the number of actions we have to implement and keep the configuration as a single atomic "concept".

See, in the Elastic Agent configuration we already have a place for that information; it's important that you keep it that way. We are actually passing that information down to Endpoint so it can connect to Kibana and query its manifest.

https://github.com/elastic/beats/blob/d34d1b654b17a3e87069f8f8354c0d3390a44bd9/x-pack/elastic-agent/pkg/agent/program/testdata/single_config-endpoint-security.yml#L1-L10

@blakerouse I know that the above host key supports an array of hosts; should that key be plural?

cc @nchaulet for awareness: since you are working on the performance side of Kibana, it would be good to have your input here.

nchaulet commented 4 years ago

I am wondering if we can have support for script updates in the saved object API; it would make this a lot more performant. Otherwise we need to fetch each config before updating it.

something like

POST test/_update/1
{
  "script" : {
    "source": "ctx._source.revision += 1",
    "lang": "painless"
  }
}
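
The raw equivalent for many policies, issued with the Elasticsearch client directly against the .kibana index, would look roughly like the sketch below. This bypasses the saved objects client (see jen-huang's caution further down), and the _source layout is an assumption:

import { Client } from '@elastic/elasticsearch';

// Sketch: bump every agent policy revision in one update-by-query by
// scripting against the raw saved object documents, where attributes are
// assumed to be nested under the type name in _source.
async function bumpRevisionsByQuery(es: Client) {
  await es.updateByQuery({
    index: '.kibana',
    body: {
      query: { term: { type: 'ingest-agent-policies' } },
      script: {
        source: "ctx._source['ingest-agent-policies'].revision += 1",
        lang: 'painless',
      },
    },
  });
}
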
neptunian commented 4 years ago

@ph The full agent policy sent to the Agent would just have the new fleet attribute:

(screenshots: the check-in response and the generated agent YAML, showing the new fleet block)
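
Reconstructed from the config block quoted at the top of this issue (the screenshots themselves are not preserved), the new attribute would look roughly like:

// Rough shape of the fleet attribute in the check-in response; field
// values are taken from the example config at the top of this issue.
const fleet = {
  kibana: {
    protocol: 'https',
    hosts: ['localhost:5601'],
    timeout: '30s',
  },
};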

This is step 1 of what Jen mentioned in her comment, which is easy enough. The issue is getting the CONFIG_CHANGE action to trigger, which only happens when the agent policy SO is revised.

I was thinking it would be good if there were a way to avoid bumping every agent policy config when the kibana_url in settings is updated, but since that is conceptually part of the config, it sounds like this might be our best option.

@nchaulet Looking at the current update code, should we be bumping a policy's revision that has an inactive status?

ph commented 4 years ago

@neptunian Looks good. I was confused by the kibana_url reference; I will check, but I think we will need to improve support on the agent side for parsing the URLs.

neptunian commented 4 years ago

@ph okay, updated my comment re: configuration as a single concept

jen-huang commented 4 years ago

@nchaulet We may run into unintended consequences down the line if we edit the saved object document directly, that's why Platform always pushes to use the scoped saved object client when working with SOs. @neptunian it might be a good idea to have a chat with Platform about your performance concerns and get their recommendation.

There is also a bulkGet method on soClient, so you don't have to retrieve each policy one by one.

ruflin commented 4 years ago

As the URL is part of the config, I think bumping the revision is a must and expected; it is a new config.

What happens if someone changes the Kibana hosts to a host not available? Will the agent disconnect or refuse the new connection?

neptunian commented 4 years ago

@ruflin I think the thing I found confusing is that the URL is not part of the ingest-agent-policies saved object, which is where the revision is bumped, but it is part of the "full agent policy" created and sent to the agent, which represents the actual agent config.

blakerouse commented 4 years ago

@ruflin I believe at the moment Agent is not handling the url coming down in the configuration. We still need to define that behavior. I agree that Agent should do a sanity check to ensure that it can talk over the new API before switching and making that change permanent.

ruflin commented 4 years ago

As the URL of Kibana can already be changed in the Settings today, I assume most of the above discussion applies regardless of whether it now becomes an array or not? Should we perhaps split this into 2 parts:

neptunian commented 4 years ago

Yes, it's possible to split it if we want the update action to go into 7.9, with the update-action PR needing to happen first, before the kibana_url type is changed into an array.

If we want both the kibana URL and the elasticsearch URL (https://github.com/elastic/kibana/issues/76136) to update the agent config, this would require more changes than my current PR, as the Elasticsearch URL is not part of the settings saved object or API but part of the output API. It seems like they should happen in the same PR. I assume changes will need to be made on the agent side (@blakerouse) to handle the modified agent policy now containing these URLs.

blakerouse commented 4 years ago

@neptunian Agent will already handle an updated output. I do not believe it will handle an updated fleet block in 7.9. I think that is something we would need to fix in 7.10.

ph commented 4 years ago

Yes, @blakerouse, I do not expect it in 7.9 but in 7.10. I think we have an issue open for that on the agent side; I will make sure we raise it.

maltewhiite commented 2 years ago

@ph Your summary is correct. I think it is odd that a config is only used for bootstrapping.

One thing that bugs me about removing the config completely is the use case where someone deploys Kibana with, let's say, Puppet and wants to do all the setup via the config file, which would then not be possible. At the same time, to fully set up and prepare Ingest Management, API calls are needed anyway.

We are using Puppet for everything. Does your https://forge.puppet.com/modules/elastic/kibana module support Fleet yet?