Pinging @elastic/fleet (Feature:Fleet)
@jen-huang @nimarezainia Do we already have the design for this work?
@jlind23 Yes we do, link to the designs can be found in the product definition doc in parent issue of this one.
@jen-huang is the tech definition ready to be worked on in our next sprint?
@jlind23 I'm still going to work on it this week.
@jen-huang As you changed this issue title to "implement", I believe the status should be changed to "ready" accordingly? Shall I also remove your assignment?
I'm currently looking at the schema validation for the new kafka type and it would be much cleaner if we moved from
/api/fleet/outputs
{
type: 'kafka'
...
}
to
/api/fleet/outputs/kafka
{
...
}
Of course we would keep the old endpoint for a few releases and mark it as deprecated. @kpollich suggested redirecting the code to the right path based on the `type` property in the request body.
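For illustration, a minimal sketch of that kind of type-based dispatch, assuming hypothetical per-type schemas built with `@kbn/config-schema` (the schema names and fields here are illustrative, not the actual Fleet code):

```ts
import { schema } from '@kbn/config-schema';

// Hypothetical per-type schemas; the real Fleet schemas live in the outputs API handlers.
const ElasticsearchOutputSchema = schema.object({
  type: schema.literal('elasticsearch'),
  hosts: schema.arrayOf(schema.uri()),
});

const KafkaOutputSchema = schema.object({
  type: schema.literal('kafka'),
  hosts: schema.arrayOf(schema.string()),
  client_id: schema.maybe(schema.string()),
});

// Keep the single POST /api/fleet/outputs endpoint and pick the schema to
// validate against based on the `type` property in the request body.
export const OutputSchema = schema.oneOf([ElasticsearchOutputSchema, KafkaOutputSchema]);
```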
As discussed, will move this to Sprint 12 and continue the work there.
We will likely need to feature flag this as the Agent work will not be ready in the same release. However, we still want to enable customers to test SNAPSHOT builds of the agent once it is ready. I think we should use the "Advanced Settings" Kibana infra to do the feature flagging instead of kibana.yml settings to easily enable a customer to turn on this feature without having to reconfigure and restart Kibana.
@criamico this will be included in our next sprint. @joshdover had a great idea about first delivering the API experience to unblock users and in a second PR work on the UI part. Both should land in separate releases if needed. What do you think?
cc @juliaElastic
the API approach is fine as a first step. However, what the users will need is the full UI capabilities. Also, we would need to get some sample API calls that show the user how to configure Kafka in this case, and think about how that would show up in the Fleet UI with other outputs present.
In other words, if the user uses the API to create the output and configure it, what would the other users see in the Fleet UI?
@nimarezainia The API first approach has the benefit of unblocking Elastic Agent E2E tests with Kafka output; it does not necessarily imply that we should ship it to our users without any UI.
In other words, if the user uses the API to create the output and configure it, what would the other users see in the Fleet UI?
This is a good question. We'll still have to make a few UI adjustments to make sure this new output type doesn't break our existing UIs.
I would suggest just showing the output row for the Kafka outputs in the Settings tab, but disabling the edit button with a tooltip: "Use the Fleet API to edit this output"
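Roughly along these lines (a sketch using EUI components; the exact markup in the Settings table would of course differ):

```tsx
import React from 'react';
import { EuiToolTip, EuiButtonIcon } from '@elastic/eui';

// Sketch: render the edit action for a Kafka output as disabled, with an explanatory tooltip.
export const KafkaOutputEditAction: React.FC = () => (
  <EuiToolTip content="Use the Fleet API to edit this output">
    <EuiButtonIcon iconType="pencil" aria-label="Edit output" isDisabled />
  </EuiToolTip>
);
```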
Pinging @elastic/security-defend-workflows (Team:Defend Workflows)
Schema proposed changes:
- Added `none` option to `compression`, following the docs
- `client_id` changed from required to optional since we provide a default value, following the docs
- `password` is a required field if `username` was provided, following the docs
- `sasl.mechanism` is set to `PLAIN` by default if the `username` and `password` fields are set, following the docs
- `partition` changed from required to optional since we provide a default value, following the docs
- `broker` section filled with optional `timeout` and `broker_timeout`
hosts: string[]
version?: string // defaults to 1.0.0 by beats/agent if not set
key?: string
compression?: 'snappy' | 'lz4' | 'gzip' | 'none' // defaults to gzip
compression_level?: integer // only for gzip compression, defaults to 4
client_id?: string // default Elastic Agent
// authentication can be done using:
// username/password, ssl, or kerberos
auth_type: 'user_pass' | 'ssl' | 'kerberos'
// auth: username/password
username?: string // must be present if auth_type === 'user_pass'
password?: string // must be present if username was provided
sasl.mechanism?: 'PLAIN' | 'SCRAM-SHA-256' | 'SCRAM-SHA-512' // defaults to `PLAIN` if username and password
// auth: ssl
ssl.certificate_authorities?: string
ssl.certificate?: string
ssl.key?: string
// auth: kerberos - should be marked as beta
// TBD: to check if we should do this as part of phase 1 if it is in beta
// partitioning settings
partition?: 'random' | 'round_robin' | 'hash' // defaults to 'hash'
random?.group_events?: integer // defaults to 1
round_robin?.group_events?: integer // defaults to 1
hash?.hash?: string
hash.random?: boolean // TBD: check the type of this field
// topics array
topics:
topic: string
when.type?: 'equals' | 'contains' | 'regexp' | 'range' | 'network' | 'has_fields' | 'or' | 'and' | 'not'
when.condition?: string
// headers array
headers?[]:
key: string
value: string
// broker
timeout?: integer // defaults to 30
broker_timeout?: integer // defaults to 10
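To make the shape above concrete, here is a hypothetical request body for `POST /api/fleet/outputs` (all values are made up; the exact required fields are those defined by the schema above):

```ts
// Hypothetical payload for creating a Kafka output via the Fleet API.
const createKafkaOutputRequest = {
  name: 'kafka-output-example',
  type: 'kafka',
  hosts: ['kafka-1.example.com:9092', 'kafka-2.example.com:9092'],
  auth_type: 'user_pass',
  username: 'agent-producer',
  password: 'changeme',
  compression: 'gzip',
  compression_level: 4,
  partition: 'hash',
  topics: [{ topic: 'elastic-agent-events' }],
  headers: [{ key: 'environment', value: 'staging' }],
  timeout: 30,
  broker_timeout: 10,
};
```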
@kevinlog from a pure planning perspective, is it fair to say that the API should probably be available in the next release or so?
Adding here some more details about the definitions of done for the API work. They mirror the Logstash integration tests in https://github.com/elastic/kibana/blob/main/x-pack/test/fleet_api_integration/apis/outputs/crud.ts
- Agent policies that contain the `fleet-server` integration don't get switched to the new output; attempting to do so returns: "Kafka output cannot be used with Fleet Server integration in Fleet Server Policy. Please create a new ElasticSearch output."

Note that in this context `default` refers to the `is_default` flag.
@szwarckonrad @jlind23 Apologies for the late reply on this one: https://github.com/elastic/kibana/issues/143324#issuecomment-1584110688
I spoke with @szwarckonrad offline - we think the API should be available for testing in the main branch in the first half of the 8.10 cycle. He has a draft PR up here: https://github.com/elastic/kibana/pull/159110
For tracking purposes I'll link the draft PR to this issue. Thanks @kevinlog for the update.
@kevinlog Hey, the QA Source team asked to do a demo with them when the UI feature is ready.
@amolnater-qasource Reading the comments, the API work is planned for 8.10; I suppose the UI work will come after that.
Thank you for the confirmation @juliaElastic
We will keep track of the updates on this feature. Further, as the feature is not going into 8.9, we will put the test content work on hold for now.
Thank you!
Thanks @juliaElastic @amolnater-qasource - we can do a demo when the UI is ready. @szwarckonrad is currently working on the API with the UI to follow.
@cmacknz Can you clarify whether a kafka output can have a shipper section, or if that's not currently possible? @szwarckonrad is finalizing his PR and can still make changes to that part.
The Kafka output shouldn't have to explicitly account for the shipper. We are working through simplifying how the shipper configuration works:
https://github.com/elastic/elastic-agent/pull/2728#issuecomment-1583178038
We will use shipper.enabled: true as the syntax for enabling the shipper, but put all other configuration under the root output configuration. We will keep the shipper.* syntax as an escape hatch in case we find an unforeseen configuration conflict and need to move configuration under the shipper path.
The only thing that will be under the shipper configuration object going forward is the `enabled` flag. Shipper development is temporarily paused so we haven't created issues to update the Fleet side of this yet.
@jlind23 @criamico @juliaElastic cc @kevinlog
I just wanted to reach out regarding the UI part of the task. While working on it, I came across a question about the Topics and Headers sections. They are both custom multi-row components, and even though the Hosts section uses the MultiRowInput component, it's designed to handle a single field. I was thinking about the best approach for implementing these new custom multi-row inputs, and I believe we have a couple of options to consider.
Firstly, we could consider refactoring the existing MultiRowInput to accommodate different types of children components. This way, it would be more versatile and meet the requirements for the problem at hand.
Alternatively, we could create two new custom components that are specifically tailored to fulfill all the requirements for the Topics and Headers sections.
What do you think would be the most suitable path forward?
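To illustrate the first option, here is a rough sketch of what a more generic multi-row component could look like (plain React elements for brevity; the real component would use EUI and follow the existing MultiRowInput conventions — names below are made up):

```tsx
import React from 'react';

interface GenericMultiRowInputProps<T> {
  values: T[];
  onChange: (values: T[]) => void;
  renderRow: (value: T, onRowChange: (updated: T) => void) => React.ReactNode;
  createEmptyRow: () => T;
}

// Sketch: the row contents are delegated to the caller, so the same component could
// back single-field Hosts rows as well as key/value Headers or Topics rows.
export function GenericMultiRowInput<T>({
  values,
  onChange,
  renderRow,
  createEmptyRow,
}: GenericMultiRowInputProps<T>) {
  return (
    <div>
      {values.map((value, index) => (
        <div key={index}>
          {renderRow(value, (updated) =>
            onChange(values.map((v, i) => (i === index ? updated : v)))
          )}
          <button type="button" onClick={() => onChange(values.filter((_, i) => i !== index))}>
            Delete row
          </button>
        </div>
      ))}
      <button type="button" onClick={() => onChange([...values, createEmptyRow()])}>
        Add row
      </button>
    </div>
  );
}
```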
@szwarckonrad sorry we missed your comment. @criamico could you please provide us guidance here? cc @juliaElastic
@jlind23 I have a draft PR up with complete UI, I went with creating two separate components for topics and headers. I believe we can move the discussion about them there ;) https://github.com/elastic/kibana/pull/160112
The API part of this work is completed, but it would be good to have some guidance about the way to handle the `password` field. I'm wondering if there is any security concern here, as I think this is the first time that we handle any password in Fleet. @joshdover @jlind23 @juliaElastic @szwarckonrad
@criamico don't we already have a password in some integrations, just like here?
The API part of this work is completed, but it would be good to have some guidance about the way to handle the password field. I'm wondering if there is any security concern here, as I think this is the first time that we handle any password in Fleet. @joshdover @jlind23 @juliaElastic @szwarckonrad
That's a good catch; we probably want to use an encrypted saved object for the `password` field, like we do for the Logstash client certificates, for example.
Thanks @jlind23 and @nchaulet, I didn't know we already had working examples of this. I think this is the way to go for the Kafka password then.
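For reference, a rough sketch of how an attribute can be marked for encryption with Kibana's encryptedSavedObjects plugin, assuming the Fleet outputs saved object type (the exact registration and type name in Fleet may differ):

```ts
// `encryptedSavedObjects` is the setup contract of Kibana's encryptedSavedObjects plugin,
// received through the Fleet plugin's setup dependencies.
export function registerOutputEncryption(encryptedSavedObjects: {
  registerType: (opts: { type: string; attributesToEncrypt: Set<string> }) => void;
}) {
  encryptedSavedObjects.registerType({
    type: 'ingest-outputs', // assumed saved object type name for Fleet outputs
    attributesToEncrypt: new Set(['password']),
  });
}
```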
UI PR is open. Open questions:
- Should the `kafka` option in the dropdown go behind a feature flag?
Adding it behind a feature flag and under a beta status would make more sense to me. @nimarezainia @juliaElastic thoughts?
Adding it behind a feature flag and under a beta status would make more sense to me.
+1, I think that this is what was requested at the beginning of this work
Why do we need the feature flag exactly? I think that will just make it harder for users to find the feature. Agree on beta for now though 👍
Then let's go for beta without feature flag 👍🏼
Where can I find the schema for the kafka output configuration that Fleet writes into the Agent policy?
My interest is in knowing the complete set of kafka settings that can be expected when receiving a policy (like what fields are required, optional, their data types, etc... basically looking for an openapi spec for the output section of the agent policy).
Thanks @juliaElastic. So is the Fleet API request format exactly the same as what is written into the Agent policy `outputs` section? For example, if an API request comes in with `ca_trusted_fingerprint: 79f956a0175`, would the Agent policy contain this, or is there some translation?
outputs:
default:
type: kafka
ca_trusted_fingerprint: 79f956a0175
There is some translation, e.g. `ca_trusted_fingerprint` gets a prefix of `ssl.`: https://github.com/szwarckonrad/kibana/blob/main/x-pack/plugins/fleet/server/services/agent_policies/full_agent_policy.ts#L214
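A simplified sketch of that kind of translation (the helper name and types are made up; see the linked full_agent_policy.ts for the actual logic):

```ts
interface KafkaOutputSO {
  type: 'kafka';
  hosts: string[];
  ca_trusted_fingerprint?: string;
  [key: string]: unknown;
}

// Sketch: pass most fields through as-is, but nest the trusted fingerprint under `ssl`
// when building the `outputs` section of the full agent policy.
export function toFullAgentPolicyOutput(output: KafkaOutputSO): Record<string, unknown> {
  const { ca_trusted_fingerprint, ...rest } = output;
  return {
    ...rest,
    ...(ca_trusted_fingerprint ? { ssl: { ca_trusted_fingerprint } } : {}),
  };
}
```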
Perhaps a silly question, but is there any integration with the other end of the Kafka pipeline? We have https://docs.elastic.co/integrations/kafka_log, and if I gathered the tickets correctly we also have an opinionated default name for topics, so it would be great if some option was given to deploy an agent that consumes the topics the integrations are sending to and uses the default data streams / pipelines, to ease rollout.
I have a question about topics and processors in general. It is unlikely that Endpoint will be able to support the full gamut of available processors, considering we do not have access to libbeat and would have to write all the parsing code from scratch in C++. Is there a minimum set of required processors that could maybe help us tone down the scope of what we will need to provide? cc: @nfritts @ferullo
ref: https://www.elastic.co/guide/en/beats/filebeat/current/defining-processors.html
@brian-mckinney I'm unsure how this ties into this issue specifically, but processors are generally there for edge processing: typically it's to drop traffic / reduce payload, or to collect additional (local context) information (the add_*_metadata processors + dns). With Kafka in the middle:
Source -> shipper(agent/endpoint?) -> Kafka -> forwarder(agent/filebeat) -> Elasticsearch
you still have the option to use all processors at the forwarder, though the add_*_metadata processors aren't useful there since that information needs to be recorded on the shipper. Regardless, it should improve things over the currently available options. And of course there are ingest pipelines for everything that doesn't require edge processing.
I think it's a separate issue, but things like decode_* can be skipped, as can parse_aws_vpc_flowlog, which seems like a poorly named decode variant.
I have a question about topics and processors in general. It is unlikely that Endpoint will be able to support the full gamut of available processors, considering we do not have access to libbeat and would have to write all the parsing code from scratch in C++.
I fully agree that Endpoint should avoid re-implementing Beats processors. From my understanding, the Kafka output only makes use of the conditional processors for topic selection, which does pare down the list somewhat. But I wonder if we could ship the first version of this without support for dynamic topic selection at all and only support the static `topic` field? @nimarezainia
If/when we do want to support dynamic topic selection, I think we could omit some conditions, like network or contains (covered by regexp).
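For context, dynamic topic selection in the proposed Fleet schema would look roughly like this (values are purely illustrative):

```ts
// Illustrative only: one conditional route plus a static fallback topic,
// using the `topics` shape from the proposed schema above.
const topics = [
  { topic: 'security-critical', when: { type: 'contains', condition: 'message: CRITICAL' } },
  { topic: 'elastic-agent-events' },
];
```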
Perhaps a silly question, but is there any integration with the other end of the kafka? we have docs.elastic.co/integrations/kafka_log and if I gathered the tickets correctly we also have an opinionated default name for topics so it would be great if some option was given to deploy an agent that consumes the topics for which integrations are sending and uses the default datastreams / pipelines, to ease rollout
This is a good suggestion - but not considered at this point. It would make sense for our default and examples in docs to match up with what the Kafka input package expects.
I fully agree that Endpoint should avoid re-implementing Beats processors. From my understanding, the Kafka output only makes use of the conditional processors for topic selection, which does pare down the list somewhat. But I wonder if we could ship the first version of this without support for topic selection at all and only support for the static topic field? @nimarezainia
If/when we do want to support dynamic topic selection, I think we could omit some conditions, like network or contains (covered by regexp).
Thanks @joshdover. I'm very interested in the outcome of this discussion. Scoping out dynamic topic selection would definitely reduce the amount of effort and testing complexity (on Endpoint at least) for the first version.
@joshdover & @brian-mckinney dynamic topic selection is an attractive aspect of this solution. I have had a few customers engaging on that. However given where we are and the fact that this will be a beta to begin with, I think it's fair to address this as a followup. I will communicate this to our Beta candidates when the time comes.
Could someone please clarify what Authentication methods will be available in the first phase? (appreciate it thx)
@szwarckonrad Could you clarify which options we ended up implementing the UI for this first phase?
@nimarezainia I think we're also limited by what Endpoint ends up supporting, which is still in progress. @brian-mckinney should be able to help clarify this.
@joshdover Following the mockups I went with UI for username/password and SSL
Kafka output UI
Similar to Logstash output, we need to add the option for users to specify Kafka as an output option for their data. In 8.8, this UI will be hidden behind an experimental flag as the shipper portion is not ready until 8.9.
Tasks
- Hide the UI behind an experimental flag so that only a deployment that enables it (via `kibana.yml`) supports it
- Map the UI fields to the `elastic-agent.yml` fields (most are the same, but there are a few differences due to needing information for the UI)

API
The output API should support a new output type: `kafka`. See Kafka Output type.
This output type should have the following properties:
```
hosts[]: uri
version?: string // defaults to 1.0.0 by beats/agent if not set
key?: string
compression?: 'snappy' | 'lz4' | 'gzip' // defaults to gzip
compression_level?: integer // only for gzip compression, defaults to 4
client_id: string
// authentication can be done using:
// username/password, ssl, or kerberos
auth_type: 'user_pass' | 'ssl' | 'kerberos'
// auth: username/password
username?: string
password?: string
sasl.mechanism?: 'PLAIN' | 'SCRAM-SHA-256' | 'SCRAM-SHA-512'
// auth: ssl
ssl.certificate_authorities?: string
ssl.certificate?: string
ssl.key?: string
// auth: kerberos - should be marked as beta
// TBD: to check if we should do this as part of phase 1 if it is in beta
// partitioning settings
partition: 'random' | 'round_robin' | 'hash' // defaults to 'hash'
random.group_events?: integer
round_robin.group_events?: integer
hash.hash?: string
hash.random?: boolean // TBD: check the type of this field
// topics array
topics:
  topic: string
  when.type?: 'equals' | 'contains' | 'regexp' | 'range' | 'network' | 'has_fields' | 'or' | 'and' | 'not'
  when.condition?: string
// headers array
headers?:
  key: string
  value: string
// broker
```
UI tasks
- Hosts: "Specify the URLs that your agents will use to connect to Kafka. For more information, see the Fleet User Guide", with an "Add row" button below
- `topics[]` array (text input box)
- Client ID: defaults to `Elastic Agent`
- Compression: defaults to `gzip` with compression level 4; options are `none`, `snappy`, `lz4`, `gzip`; if `gzip`, also show a field for the compression level
- "Define how long a Kafka server waits for data in the same cluster"
- "Define how long an Agent would wait for a response from Kafka Broker"
- "Define the number of messages buffered in output pipeline"
- Reliability level required from the broker: defaults to "Wait for local commit"; options are "Wait for local commit", "Wait for all replicas to commit", "Do not wait"
- Key: "If configured, the event key can be extracted from the event using a format string"
Designs
Open questions