Closed ChrsMark closed 2 years ago
Pinging @elastic/fleet (Team:Fleet)
- Fleet retrieves all the available packages/integrations from the Registry.
I hope we don't mean every package? This is not going to scale well as we get to 1k+ packages. Fleet can't be downloading all of the packages to produce this; I think we either need:
Another question is where in the UI should this be shown? Only in the standalone agent configuration UI?
- Fleet retrieves all the available packages/integrations from the Registry.
I hope we don't mean every package? This is not going to scale well as we get to 1k+ packages. Fleet can't be downloading all of the packages to produce this; I think we either need:
How much would this information be in terms of bytes? Note that we only care about the package spec and not the assets at this point. Could a caching mechanism in Kibana help here?
- The user to select which integrations they may want to use; and/or
- A default, constrained set of popular packages that we want to support out of the box, with the ability to add additional packages
I think that would work but would not comply with the goal of the feature, which is to provide minimal configuration steps. Imagine that these templates could be completely hidden from the user, since they only act as a low-level implementation detail. However, if including all of these proves not to be performant, we need to revisit and reconsider this.
Another question is where in the UI should this be shown? Only in the standalone agent configuration UI?
Yes, to my mind this should be shown only in the standalone agent configuration UI.
@mlunadia @gizas any thoughts on the above comments/concerns?
How much would this information be in terms of bytes? Note that we only care about the package spec and not the assets at this point. Could a caching mechanism in Kibana help here?
We have the package registry API that can provide some info, e.g. https://epr.elastic.co/search?experimental=true, but this doesn't include detailed information about variables like default values, types, etc. In order to resolve that level of detail, we need to either query the list linked above and then query the API for each individual package, e.g. https://epr.elastic.co/package/1password/1.4.0/, or download every package.
Fleet does some in-memory caching of downloaded packages to save on repeated downloads of packages, but that doesn't help us in the "cold start" case where we need to download every single package to determine if and how Fleet should be generating Kubernetes templates for them. I agree with @joshdover's points above that we need some way to limit the list of packages we're querying here, either by user input or by a hardcoded allow list.
Packages can easily be several megabytes on average, and if a package ships with prebuilt assets like ML jobs it can be quite a bit larger.
Out of curiosity I drafted the following Python script to measure what we are discussing:
Running this script from my local machine, I managed to download all the packages in less than 2 minutes, and the total storage used seems to be 117M.
$ time ./packagesTest/get_packages.py
....
3.10s user 1.89s system 4% cpu 1:43.66 total
$ ls -l packagesTest | wc -l
154
$ du -sh packagesTest
117M packagesTest
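The script itself wasn't pasted into the thread; a minimal sketch of what such a measurement script could look like, assuming the public EPR /search endpoint and a "download" field in its response entries (the field name is an assumption, not a verified contract):

```python
#!/usr/bin/env python3
# Hypothetical reconstruction of the measurement script discussed above:
# list every package via the EPR search API, then download each archive
# into ./packagesTest. The "download" response field is an assumption
# about the EPR /search payload shape.
import json
import os
import urllib.request

EPR = "https://epr.elastic.co"

def download_url(entry: dict) -> str:
    """Build the absolute archive URL from one /search result entry."""
    return EPR + entry["download"]

def fetch_all(dest: str = "packagesTest") -> None:
    os.makedirs(dest, exist_ok=True)
    with urllib.request.urlopen(f"{EPR}/search?experimental=true") as resp:
        packages = json.load(resp)
    for pkg in packages:
        url = download_url(pkg)
        urllib.request.urlretrieve(url, os.path.join(dest, os.path.basename(url)))

if __name__ == "__main__":
    fetch_all()
```

Timing it with `time ./get_packages.py` and sizing the directory with `du -sh` would give numbers like the ones quoted above.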
Are those numbers expected, @kpollich? I wonder if those numbers are actually risky in terms of performance, since this action should take place only once at Kibana's "first" load time and then the constructed ConfigMap can be cached. Would a background job along with the caching help here?
If these indicators are concerning then I would take a step back and reconsider the approach/solution. On the specific comment
I agree with @joshdover's points above that we need some way to limit the list of packages we're querying here, either by user input or by a hardcoded allow list.
How do you think of this selection? Would that mean that by default we only select some packages, but we also provide the option for users to select and download all of them? Wouldn't that lead to the risk of downloading everything again if users choose "select all"?
One thing that I would like to make clear here is the purpose of this feature. Hints-based autodiscovery serves the cases where users want full automation and no "restarts", with as minimal configuration as possible. So having users select/deselect the packages to be included makes us diverge from that purpose. In addition, if for any reason users want to add something more, they will need to go back to Fleet UI, regenerate the templates, and finally restart the Agent. This is more of a hybrid approach and not fully automated based on hints.
Having said this, I think that if Kibana and Fleet UI cannot solve this issue efficiently we need to reconsider.
Some quick alternatives here: I would even consider forgetting about supporting this on Fleet UI and implementing the template+ConfigMap construction in elastic-package. Then we would have something like elastic-package createk8sTemplates which would provide us the wanted ConfigMap. With something like this we could even have a nightly job to upload this ConfigMap to https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone if there is any diff. Then the only thing users need to do is download https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone and deploy, which is exactly what they do today. Keep in mind that even today the standalone policy we provide is quite static and not frequently updated at https://github.com/elastic/elastic-agent/blob/main/deploy/kubernetes/elastic-agent-standalone/elastic-agent-standalone-daemonset-configmap.yaml#L27, but this is somewhat expected when it comes to the standalone experience.
cc: @gizas
I would even consider forgetting about supporting this on Fleet UI and implementing the template+ConfigMap construction in elastic-package. Then we would have something like elastic-package createk8sTemplates which would provide us the wanted ConfigMap.
This is a good idea. Each time there is a new update in one of the packages (in the vars of the data streams?), a new ConfigMap will be constructed, and a PR can be opened against the Kibana project as well to update https://github.com/elastic/kibana/blob/main/x-pack/plugins/fleet/server/services/elastic_agent_manifest.ts#L8, which is currently used for the standalone agent. This is not expected to happen very often.
To my mind, updating https://github.com/elastic/kibana/blob/main/x-pack/plugins/fleet/server/services/elastic_agent_manifest.ts#L8 is another story that is irrelevant to the templates' construction and should be handled on top. For example, even today if we change https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone then Kibana's part will be outdated. Based on this, maybe updating Kibana's copy should happen in any case when changes are detected at https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone.
Trying to follow all of the above, and having synced with Christos on the small details, just some clarifications:
Ok, I see that this thread expanded quickly, so let me clarify a few things, as Ecosystem owns elastic-package and package-registry.
Would it be possible, instead of downloading all the artifacts for the packages, to only download the packages' spec directly from https://github.com/elastic/package-storage/tree/production/packages?
package-storage as a repository will be deprecated soon. By soon, I mean the end of July/August. We will switch to https://package-storage.elastic.co/, which is based on buckets. I strongly recommend not considering package-storage v1 (Git) as a component.
We have the package registry API that can provide some info, e.g. https://epr.elastic.co/search?experimental=true, but this doesn't include detailed information about variables like default values, types, etc. In order to resolve that level of detail, we need to either query the list linked above and then query the API for each individual package, e.g. https://epr.elastic.co/package/1password/1.4.0/, or download every package.
We don't plan to extend EPR to perform any extra logic apart from serving package indices and redirecting to package-storage to download .zip or static artifacts.
I would add the idea of creating the k8s template on the EPR side. As long as EPR downloads packages, why can't we create the templates there and serve them with another API request?
EPR is intended to be a static component with a simple search facility. We don't aim to put extra processing logic there.
So you don't leave us many possibilities there :)
I guess the only 2 final candidates are:
Also @mtojek, how about this part:
2. and implement the template+ConfigMap construction in elastic-package. Then we would have something like elastic-package createk8sTemplates which would provide us the wanted ConfigMap
Can we plan for it? Do you see any issues?
- and implement the template+ConfigMap construction in elastic-package. Then we would have something like elastic-package createk8sTemplates which would provide us the wanted ConfigMap
It is something I'd like to understand better, as elastic-package's actions refer to the development lifecycle (build, lint, format, test, stack, etc.). I don't see how the createk8sTemplates action fits there, but maybe we can evaluate/rephrase it.
- and implement the template+ConfigMap construction in elastic-package. Then we would have something like elastic-package createk8sTemplates which would provide us the wanted ConfigMap
It is something I'd like to understand better, as elastic-package's actions refer to the development lifecycle (build, lint, format, test, stack, etc.). I don't see how the createk8sTemplates action fits there, but maybe we can evaluate/rephrase it.
The goal here is simple. We want to produce a static Kubernetes ConfigMap with the templates from the latest versions of packages (for now). In order to make it available to our users we can store it in the upstream repository at https://github.com/elastic/elastic-agent/blob/main/deploy/kubernetes/elastic-agent-standalone/elastic-agent-standalone-daemonset-configmap.yaml. Our official docs at the moment redirect our users to download our proposed manifests from there (see docs). So a developer from the cloudnative team that maintains these manifests would mainly need a tool to automate the construction of this ConfigMap. This is where elastic-package comes into play.
In order to automate this process even more, after we have the tooling implemented we can add it to a nightly automation run which would re-run the elastic-package createk8sTemplates command, check for diffs against the upstream at https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone, and open a PR if there is a diff. In this way users following our docs will only have to curl -L -O https://raw.githubusercontent.com/elastic/elastic-agent/8.3/deploy/kubernetes/elastic-agent-standalone-kubernetes.yaml the same way they do today, and the hints feature will be available to them in a transparent way.
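As a rough illustration, the nightly automation could boil down to a diff gate like the sketch below. Note the createk8sTemplates subcommand is the proposed (not yet existing) tooling, and the exit-code convention for triggering a PR is my assumption:

```python
# Sketch of a nightly diff gate: regenerate the ConfigMap and compare it with
# the copy committed upstream; a non-zero exit signals CI to open a PR.
# "elastic-package createk8sTemplates" is the proposed, hypothetical command;
# the upstream path mirrors the elastic-agent repo layout.
import subprocess
import sys
from pathlib import Path

UPSTREAM = "deploy/kubernetes/elastic-agent-standalone/elastic-agent-standalone-daemonset-configmap.yaml"

def needs_update(generated: str, upstream_path: str) -> bool:
    """True when the freshly generated ConfigMap differs from the committed one."""
    p = Path(upstream_path)
    upstream = p.read_text() if p.exists() else ""
    return generated != upstream

if __name__ == "__main__":
    generated = subprocess.run(
        ["elastic-package", "createk8sTemplates"],
        capture_output=True, text=True, check=True,
    ).stdout
    sys.exit(1 if needs_update(generated, UPSTREAM) else 0)
```

A scheduled CI job would run this and open the PR on a non-zero exit, keeping the upstream manifest in sync without any user-facing change.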
To make the proposal more complete, Kibana's side should be synced with https://raw.githubusercontent.com/elastic/elastic-agent/8.3/deploy/kubernetes/elastic-agent-standalone-kubernetes.yaml from time to time in order to keep the "hardcoded" manifest (https://github.com/elastic/kibana/blob/main/x-pack/plugins/fleet/server/services/elastic_agent_manifest.ts#L8) up to date. But this need exists even today, since the hardcoded manifest is not updated when something changes at https://raw.githubusercontent.com/elastic/elastic-agent/8.3/deploy/kubernetes/elastic-agent-standalone-kubernetes.yaml. Again, the creation and existence of the new templates' ConfigMap is an implementation detail and hidden from the end user.
I wonder if those numbers are actually risky in terms of performance, since this action should take place only once on Kibana's "first" load time and then the constructed ConfigMap can be cached. Would a background job along with the caching help here?
Fleet's setup process on boot blocks Kibana's healthy status, so adding 2+ minutes of degraded time to Kibana on boot is a nonstarter here. A background job seems like a better fit if we place responsibility for the generation of these ConfigMap objects on Kibana.
To create them on a daily basis and keep them somewhere where they can be picked up by Kibana on start
This is a better solution in my mind. If the ConfigMap object is truly a static list of every single package that supports Kubernetes autodiscovery hints, it doesn't seem necessary for Kibana to generate that list "on-demand". I expect the rate of change for these hints to be fairly slow, so handling them through a CI job seems a lot better to me.
How do you think of this selection? Would that mean that by default we only select some packages, but we also provide the option for users to select and download all of them? Wouldn't that lead to the risk of downloading everything again if users choose "select all"?
I guess I just don't fully understand the use case here. To me, it seems like there'd be an overwhelming amount of config I might not need in this ConfigMap object. For example, if we include k8s hints in 10-15 packages, the ConfigMap is going to include definition blocks for each of them. To me, this seems like a lot of noise and room for confusion - but then again I am a total novice with Kubernetes, so my understanding of this use case is limited.
You are correct though. If we allow selection here we still run the risk of downloading all packages in order to resolve the default for each variable.
The solution of a static ConfigMap maintained by CI feels the safest to me.
Folks, I'm afraid that you're forgetting about the scaling factor. We need to think about the situation where we have 1M packages. Do we want to keep updating config maps at that scale?
Also, how do you plan to support those config maps if the format depends on the Elastic stack version?
I suggest going back to square one and rethinking the procedure. Generating templates on a nightly basis and introducing coupling between packages and Fleet doesn't sound like a safe choice. What if we start accepting community packages? We won't be able to store information about community packages in Kibana.
@kpollich @gizas FYI, we had a chat with @mtojek to make things more clear. What we will be evaluating is implementing the template construction compatible with Kubernetes hints in a CI component, similarly to what we have for buckets' indexing etc.
In that case the ConfigMap with the templates will be constructed asynchronously by a job and will be available to our users through https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-standalone. With that we have the same UX that we have today, as described at https://www.elastic.co/guide/en/fleet/current/running-on-kubernetes-standalone.html, and Kibana/Fleet are not affected.
We will only support having the "templates" based on the latest packages, since the logic in standalone is decoupled from packages' updates/versions etc., and we just need input policies that work with the defined Agent. This would mean that for 8.3 we will ship a manifest available at https://raw.githubusercontent.com/elastic/elastic-agent/8.3/deploy/kubernetes/elastic-agent-standalone-kubernetes.yaml which includes templates compatible with the 8.3 version. This is a convention that we will handle at construction time.
Regarding scaling: since we will only be including the "latest" compatible packages, at the moment we are talking about ~150 packages and hence ~150 input templates. As we scale we can consider selection options, but we don't foresee any crucial blocker here.
Having said this, since we agree on taking the safest approach, we can consider this issue "stalled" for now and close it soon once the CI approach is moving forward :).
Thank you! As all teams are unblocked and there are no issues with performance, sure, we can go with the above. [For all to be synced] The proposal is: CI construction of templates, placed under https://raw.githubusercontent.com/elastic/elastic-agent/8.3/deploy/kubernetes/elastic-agent-standalone-kubernetes.yaml
We will need this issue to track any work that might be needed in Fleet UI to update manifests etc. after the templates are done.
@gizas https://github.com/elastic/elastic-agent/issues/613 seems completed. Do we still need this one for any reason?
As discussed at https://github.com/elastic/elastic-agent/issues/613#issuecomment-1165373934, Fleet UI can be enhanced to provide specific input templates which are capable of being enabled and populated by hints-based autodiscovery implemented in the kubernetes provider.
Fleet UI should be capable of producing an inputs.d ConfigMap. The flow for creating this new ConfigMap is like this:
a. For every setting we template the value with the corresponding hint, e.g. ${kubernetes.hints.redis.info.host}. The fallback of this should be the default value, so the final value of the setting is like ${kubernetes.hints.redis.info.host|'127.0.0.1:6379'}.
b. For every data_stream in the config block we add the proper condition so that it is enabled only by the hint mechanism: condition: ${kubernetes.hints.redis.key.enabled} == true
The purpose of this ConfigMap will be to be mounted into elastic-agent-standalone/elastic-agent-standalone-daemonset-configmap.yaml manually, as well as to be included in the full manifest that Fleet UI constructs, implemented by https://github.com/elastic/kibana/pull/114439. This is related to https://github.com/elastic/elastic-agent/issues/662.
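The two steps of the flow above can be sketched as a small transformation. The structure is heavily simplified and the function names are mine, but the hint/fallback and condition syntax follow the examples given in the flow:

```python
# Sketch of steps (a) and (b): substitute each variable with a hints lookup
# that falls back to the package default, and gate every data stream behind
# an "enabled" hint condition. Input/output shapes are simplified.

def hint_var(package: str, data_stream: str, var: str, default: str) -> str:
    # step (a): e.g. ${kubernetes.hints.redis.info.host|'127.0.0.1:6379'}
    return "${kubernetes.hints.%s.%s.%s|'%s'}" % (package, data_stream, var, default)

def templated_stream(package: str, data_stream: str, defaults: dict) -> dict:
    stream = {k: hint_var(package, data_stream, k, v) for k, v in defaults.items()}
    # step (b): the stream is only enabled when the hint says so
    stream["condition"] = "${kubernetes.hints.%s.%s.enabled} == true" % (package, data_stream)
    return stream
```

For example, templated_stream("redis", "info", {"host": "127.0.0.1:6379"}) yields a stream whose host is resolved from the hint with a fallback to the default, plus the enabling condition.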