Open nabokihms opened 1 year ago
I don't know if we could embed VRL in the encoder like that today, but it would be interesting to get that working.
Using VRL is not necessary if it is a problem; the feature can be implemented another way.
More optional fields:
encoding:
  codec: cef
  cef:
    header: Vector # Optional, Vector by default
    version: v0.29.0 # Optional, Vector version by default
    name: <key to get event name> # Optional, cef.name by default
    description: <key to get event description> # Optional, cef.description by default
    severity: <key to get event severity> # Optional, cef.severity by default
    fields: [<array of fields to encode, as for the CSV codec>] # Optional
Example:
{
"message": "{\"@timestamp\": \"1683583245\",\"id\": \"ConsumerFetcherManager-1382721708341\",\"level\": \"info\",\"module\": \"kafka.consumer.ConsumerFetcherManager\",\"msg\": \"Stopping all fetchers\"}"
}
transforms:
  cef_fields:
    type: remap
    source: |
      . = parse_json!(.message)
      .timestamp = ."@timestamp"
      ."cef.name" = "Security event"
      ."cef.description" = .msg
      ."cef.severity" = "1"
      if .level == "warning" {
        ."cef.severity" = "4"
      }
      if .level == "error" {
        ."cef.severity" = "9"
      }
sinks:
  siem:
    type: socket
    inputs: ["cef_fields"]
    encoding:
      codec: "cef"
      cef:
        fields:
          - id
          - module
          - timestamp
That will produce this line:
CEF:0|Vectordotdev|Vector|v0.29.0|Security event|Stopping all fetchers|1|id=ConsumerFetcherManager-1382721708341 module=kafka.consumer.ConsumerFetcherManager timestamp=1683583245
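For illustration, here is a minimal Python sketch of how an encoder could assemble a line in the layout shown above. It is not Vector's implementation: the function names (`escape_header`, `escape_extension`, `encode_cef`) and the hard-coded vendor, product, and version values are assumptions taken from the proposal. The escaping follows the usual CEF convention: backslashes and pipes are escaped in header fields, and backslashes, equals signs, and newlines are escaped in extension values.

```python
# Hypothetical sketch of the proposed encoding; names and defaults are assumptions.

def escape_header(value):
    """Escape backslashes and pipes, since '|' delimits CEF header fields."""
    return value.replace("\\", "\\\\").replace("|", "\\|")


def escape_extension(value):
    """Escape backslashes, '=' and line breaks inside extension values."""
    return (
        value.replace("\\", "\\\\")
        .replace("=", "\\=")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )


def encode_cef(event, fields, version="v0.29.0"):
    """Build one CEF line from an event dict and a list of extension keys."""
    header = [
        "CEF:0",
        "Vectordotdev",  # vendor, as in the proposed output line
        "Vector",        # product ("header" option in the proposal)
        version,
        str(event.get("cef.name", "")),
        str(event.get("cef.description", "")),
        str(event.get("cef.severity", "")),
    ]
    # Only keys listed in `fields` and present in the event become extensions.
    extension = " ".join(
        f"{key}={escape_extension(str(event[key]))}"
        for key in fields
        if key in event
    )
    return "|".join(escape_header(part) for part in header) + "|" + extension


# Event as it would look after the remap transform in the example.
event = {
    "id": "ConsumerFetcherManager-1382721708341",
    "module": "kafka.consumer.ConsumerFetcherManager",
    "timestamp": "1683583245",
    "cef.name": "Security event",
    "cef.description": "Stopping all fetchers",
    "cef.severity": "1",
}
line = encode_cef(event, ["id", "module", "timestamp"])
print(line)
```

The escaping helpers are the part that is easy to get wrong when composing the string by hand, for example when a message contains `|` or `=`.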
My question is whether there is a chance for such a feature to be accepted or not. Do you know if it requires an RFC?
I know we've discussed wanting to enable the use of VRL in more places, such as sink config/encoding config - but I'm not sure where we are on that today. I suspect we'd want an RFC for how that would work, which could be separate from an initial implementation of this codec.
@jszwedko what do you think, do you know if we'd still like to add VRL support to components outside of remap? I know recently timestamp formatting has come up where encoding.timestamp_format is more limited than what's possible in VRL - and using a VRL function there would be handy.
We discussed this today and think CEF is a reasonable addition to Vector's codec system alongside GELF, syslog, CSV, JSON, etc. We also discussed that it would be nice to have a general pattern for sharing code between VRL and codecs, since we have a few encoders/decoders supported in both places, such as Syslog.
A vrl codec would also be interesting (that is being tracked by https://github.com/vectordotdev/vector/issues/13634 and would probably require an RFC).
@jszwedko @spencergilbert what are my next steps to move this forward? Do I need to start with an RFC or dive straight into implementation?
@nabokihms if you wanted to implement the VRL-based codec, we'd want an RFC. If you were just implementing a CEF codec in the same style as GELF, Syslog, etc., going straight into the code is fine.
@nabokihms I suggest you go with the "just a CEF codec" approach for this issue. It would be much easier to implement and would still be useful.
Use Cases
SIEM systems are everywhere, and many of them require events in CEF (Common Event Format). Vector can efficiently collect data, and we use it as a central event processor, but it currently cannot send events to older SIEM systems that accept only CEF.
Attempted Solutions
VRL is an option, but it is not convenient enough: users need to understand the CEF format well enough to compose the right string by hand.
Proposal
Add a new codec to encode messages into CEF format.
References
No response
Version
v0.29