Closed kpollich closed 3 months ago
Pinging @elastic/fleet (Team:Fleet)
Should we support multiple installations of the same integration? e.g. will there ever be two CSPM policies, or can we expect that if a user tries to install a second instance of CSPM they won't be able to do this?
Yes, users today have multiple CSPM integrations installed (on multiple agent policies). I don't think we don't want to limit the number of integrations installed to begin with.
Thanks, @olegsu - could we try to clarify the expected workflow then in the UI? Based on our conversations, I am understanding it as something like
Add CSPM
(or another agentless integration)agentless-1234
) is created under the hood, and the integration policy is attached to this agent policySome questions I have for the expected state after this flow is completed:
agentless-4567
) to hold that instance of CSPM? @kpollich you're on point.
Does the user "see" this new policy we created, or is it hidden from them?
That's actually a product question @tinnytintin10
Can the user add additional integrations to this policy in some way?
I think not.
If the user clicks "Add CSPM" again, would they create another new policy (e.g. agentless-4567) to hold that instance of CSPM?
I think yes.
On serverless, we have a few users that already use agentless, which uses the method of a preconfigured agent policy behind the scenes. Since we have a higher goal to support agentless sooner on ESS, without using the preconfigured method. It might save us time to migrate serverless to the new method later in the process. And first complete ESS.
So on fleet, we should be able to distinguish between these two implementation details.
This is still an open question, and it will be answered more effectively once we have a more detailed action plan and time estimations.
@kpollich I am wondering if the create a Kibana Agentless service in the Fleet plugin to create Agentless-ElasticAgents plan item for ESS is mostly a duplicate of this issue.
Assuming that we are doing what @kfirpeled mentioned and concentrating on ESS, the only difference for ESS would be that we are sending requests to the Agentless-API to create the new agentless-agent.
I don't think this is a duplicate of that issue. This issue seeks to capture the flow in which an integration policy is created for an integration with deployment_modes.agentless.enabled: true
. When this integration policy is created, we should also create a new agent policy record as the parent of the integration policy.
Whether this logic is encapsulated in the agentless service you're proposing is perhaps a relevant implementation detail, but actually calling that logic from the UI remains the unit of work captured by this issue.
See proposed workflow in this comment above: https://github.com/elastic/kibana/issues/183863#issuecomment-2122440189
The way this works today, the UI makes several sequential API calls to the /api/fleet
API namespace. When writing this issue, I was assuming this would continue to be the case, and we'd have some conditional logic with different calls/parameters based on agentless/non-agentless integration requirements.
@kpollich please correct me if I am not summarizing our conversation correctly.
@kpollich and I huddled to discuss if there could be a duplication of effort for this and what we needed to do for the Agentless API.
We agreed that 1 through 8 of the sequence diagram below would likely have overlapping concerns, while 9 is unique for the Agentless API.
We would likely need to move most or all of the Agentless Kibana logic from the CSPM plugin to the Fleet plugin to allow other integrations to use it.
If this issue's solution is to create a new Agentless-agent for each agentless supported integration, the work effort to integrate the Agentless API (ESS) would be considerably smaller.
We were unsure if there were CSPM plugin extension hooks for 10 through 16 and I am currently looking into that.
@seanrathier @kpollich The toggle will be for Agentless ESS correct? For Serverless, we have a pre-configured agentless policy once we migrate Agentless API to Serverless, then we can show the toggle.
@Omolola-Akinleye
The toggle will be for Agentless ESS correct? For Serverless, we have a pre-configured agentless policy once we migrate Agentless API to Serverless, we can show the toggle.
For Serverless, the toggle will still show, however, Agentless is selected by default and will only use the one pre-built Agentless policy. We need to give the user the option to use an Agent-based policy
For ESS, everything is the same as above, except when Agentless is selected it always creates a new agentless policy. You cannot select an existing Agentless policy like in Serverless.
@seanrathier Ahh okay thanks for clarifying. My concerns introducing toggle in Serverless
It is worth mentioning, that Agentless on Serverless will not be available before migrating it to use Agentless API and making all the flows similar to ESS.
What I miss in the ticket's DOD is in edit mode, that it cannot be switched/toggled. So agentless remains agentless and agent based remains agent-based.
@kpollich , tbh that value is part of the agent policy values. and not the integration. So if that is where this toggle will be visible, the edit mode should not concern us. Is that what you had in mind?
What I miss in the ticket's DOD is in edit mode, that it cannot be switched/toggled. So agentless remains agentless and agent based remains agent-based.
@kpollich , tbh that value is part of the agent policy values. and not the integration. So if that is where this toggle will be visible, the edit mode should not concern us. Is that what you had in mind?
If agent-based is selected, a new agent policy + integration policy will always be created. Should this be possible at all in the edit mode for an existing integration policy? e.g. if a user clicks "edit" for their Okta policy, which is running in agent-based mode, should they be able to toggle it to agentless mode and have a new agent policy created for them under the hood?
@kpollich - this looks great, thank you and your team for pushing this effort!
Given that agentless is going to be in beta for the first few releases we should make sure to:
We will be doing the same for the CSPM custom UI and have proposed a design for it - https://github.com/elastic/kibana/issues/188601. Can you please make sure to cover this here as well?
Both transitions are feasible.
We haven't captured that work yet on the first stages of our implementation, we were thinking of blocking it for now so we could focus on the creation flow.
It does add some complexity to edit an integration and switch to agentless. Under the hood we need to create a new agentless agent policy when doing so. As for the existing integration, should it be removed from the agent-based policy? should it be duplicated? I'm not quite sure..
we were thinking of blocking it for now so we could focus on the creation flow.
Let's move forward with this approach. Editing an agentless integration shouldn't be possible, and a new integration policy will need to be created. I'll capture this in the requirements above.
Updated the description + issue title to better reflect what we're doing here. cc @juliaElastic since you'll be picking this up next sprint as well. Let me know if you have any concerns about the larger list of requirements.
@kpollich A few questions here:
deployment_modes.agentless.enabled: true
set other than CSPM currently? I didn't see any other in elastic/integrations
repo.support_agentless: true
? From the comments it seems so, but in the description it seems to be allowed.Ensure that even if
supports_agentless: true
is set, the policy editor default to the current agent-based option and the toggle is set to agent-based
@juliaElastic I'll answer some of these questions as I worked on the agentless area previously.
Do we have any existing packages that have
deployment_modes.agentless.enabled: true
set other than CSPM currently? I didn't see any other in elastic/integrations repo.
AFAIK we don't have any. I generated a couple of packages locally with this flag to simulate this behavior. I added these packages it in this PR, but let me know if you cannot download them or if they don't work for you, I should still have them. The testing steps are in the PR description.
Do I understand correctly that we want to add this dropdown on the generic Fleet add integration UI where the package supports agentless?
Yes that's the idea, there was a lot of discussion on this but basically we want to be able to show a "reduced" version of that UI for a generic agentless integration, as the current form is custom-built for CSPM and is closely tied to its inputs.
Do we want to prevent adding more integration policies to an agent policy created with
support_agentless: true
?
I think that the assumption has been since the beginning that an "agentless" policy will only allow a single integration policy. I don't know if we want to put in place any explicit guards to prevent additional integrations policies from installing, but the relation should be 1:1
Also, related to this item in the description:
Always create a new agent policy when saving an integration policy in agentless mode. This newly created policy should have supports_agentless: true set.
I believe that this already happens, but we need to verify it.
I'll leave the other answers to @kpollich as I'm not sure what was decided about those.
Thanks, @criamico - your answers above look good.
Does this issue meant to change the UI in stateful mode only, leaving the serverless behaviour unchanged? Do we want to show the agentless selector inly in cloud or on-prem also?
The UI should be changed in both stateful and serverless. Users should be able to decide whether they want agentless or agent based deployment regardless of their environment.
Do I understand correctly that for now we don't want to allow switching from agent-based to agentless mode in edit integration page?
Yes we should prevent toggling an existing policy between deployment modes.
When the agentless option is selected, do we create the agent policy with the usual Fleet API or do we intend to use the Agentless API?
We'll just use the usual Fleet API - this doesn't need to interact with the agentless API at all for now. @seanrathier if I understand correctly we might change this later such that saving a new agentless policy would create and deploy an agentless agent via the agentless API (a mouthful!)
For now, so long as we are creating an integration policy with the proper variables set that belongs to a new agent policy with supports_agentless: true
, Fleet's job is complete here.
Do we want to prevent adding more integration policies to an agent policy created with support_agentless: true? From the comments it seems so, but in the description it seems to be allowed.
Yes, Cristina is correct this should not be possible. The goal is to have a single integration policy on each agent policy with supports_agentless: true
. This will provide the clearest scaling model for the agentless controller to create larger agents or enroll more agents based on that integration's ingest needs.
@kpollich I had a few more questions.
While implementing this, I see there is a lot of behavior now added for CSPM integration: setting agentless mode as default, calling agentless API to create agentless agent when creating agentless policy, only allow agentless selector if env is serverless/cloud with agentless api set. Do we want to keep the logic as-is for CSPM, and only implement the generic behavior for new non-CSPM packages supporting agentless? If all packages should work the same, I would have to remove a lot of logic recently added, not sure if this is what we want.
Also currently agentless policies are created as managed policies, which helps to prevent adding more integration policies. However managed policies don't allow deleting integration policies, so I think we have to either remove the managed property from the created agentless agent policies or let these integration policies be deleted even if the agent policy is managed.
How should we handle integrations with multiple policy templates like AWS? Should editing the "top-level" integration policy that renders all policy templates permit setting the deployment mode on a per-policy-template level?
I'm not sure how this could work, the agentless/agent-based selector determines whether the created agent policy will have support_agentless
flag or not. I think the safest option would be to only allow the selector if all policy templates support agentless in the current integration to be added.
Editing an agentless integration policy should not be possible. The "edit" action should be disabled with a tooltip directing the user to instead create a new integration policy and delete the existing one.
I started implementing this, and realised if we disable the edit action, users won't be able to view the current config of the integration policy (only in Agent policy / View policy as yaml). Alternatively we could allow going to the Edit integration page, and only disable the submit button.
While implementing this, I see there is a lot of behavior now added for CSPM integration: setting agentless mode as default, calling agentless API to create agentless agent when creating agentless policy, only allow agentless selector if env is serverless/cloud with agentless api set.
@juliaElastic @kpollich, I'm chiming in here for your knowledge. Most of the work we have completed and planned for the Agentless API (ESS) has been in the Fleet plugin, we want to avoid "customizations" for CSPM and make agentless available to all integrations.
We have not been making any changes in the CSPM plugin aside from the UI changes to Setup Technology Selector
, so that should be moved to the Fleet plugin whenever possible, and that might improve the performance of the page if we can avoid multiple React state renders.
Does that mean that the agentless API logic should be called for all integrations, not just CSPM?
And that the behavior described in this issue should be the same for all packages, in both serverless and cloud? (supposing agentless is not supported in on-prem deployments).
In serverless there was a hardcoded agent policy with id agentless
, are we getting rid of that too?
I can try to move out the setup technology selector out of the CSPM extension view to render with the generic logic, I added locally to the form like this, without the Advanced options
accordion, it looked out of place to be hidden under the accordion.
Does that mean that the agentless API logic should be called for all integrations, not just CSPM?
@juliaElastic, AFAIK, eventually all integrations that have the supports_agentless
property and the user chooses Agentless
option in the dropdown it should make the request to the Agentless API to create an agentless-agent.
In serverless there was a hardcoded agent policy with id agentless, are we getting rid of that too?
We have a task in our plan to migrate Serverless to use the Agentless API solution which will create a new Agentless-agent, instead of using the hard-coded agentless
policy.
Follow up on https://github.com/elastic/kibana/issues/180375
In https://github.com/elastic/kibana/issues/180375, Fleet added support for policy variables being shown/hidden based on the
deployment_modes
value in an integrations and ensured that policies saved for agentless integration policies are always saved with the hardcoded agent policy with IDagentless
.To finalize Fleet's generic support for integrations that opt into the agentless deployment mode, we need to do the following:
deployment_modes.agentless.enabled: true
is set on a given policy templatesupports_agentless: true
set.supports_agentless: true
is set, the policy editor default to the current agent-based option and the toggle is set toagent-based
agentless
option is clearly marked as beta, and display a beta banner in the policy editor form itself to reiterate this when agentless mode is enabledOpen questions