Open v8v5v opened 1 year ago
I feel as though we've generally been fairly strict in parsing formats, but it seems like ignoring the whitespace here would be a more pleasant UX. I'll pass this over to the appropriate team for consideration.
Thanks for the report!
@nabokihms @ktff just curious if either of you have thoughts here since you seem at least somewhat familiar with CEF 😄
That seems like a good idea. From what I remember, only whitespace before the first key can cause a failure.
We can refer to this doc (Extension field section) https://www.microfocus.com/documentation/arcsight/arcsight-smartconnectors-8.3/cef-implementation-standard/#CEF/Chapter%201%20What%20is%20CEF.htm?TocPath=_____2
If there are multiple spaces before a key, all spaces but the last space are treated as trailing spaces in the prior value in the key. If you need trailing spaces, use multiple spaces, otherwise, use one space between the end of a value and the start of the following key.
Trailing spaces are not preserved for the final key-value pair in the extension. It is highly recommended to not utilize leading or trailing spaces in CEF events unless absolutely necessary. If that is the case, ensure the ordering of key-value pairs in the extension is such that any value with trailing spaces is not the final value. For more information on best practices for creating CEF events, see the CEF Mapping Guidelines document.
Extension values must follow the escape character guidelines defined for encoding symbols in CEF. See, Character Encoding.
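To make the quoted spacing rules concrete, here is a rough Python sketch of how an extension string could be split under them. It is not how Vector's parser is implemented; the `split_extension` name and the key pattern (a run of word characters followed by `=`) are assumptions for illustration, and the sketch does not unescape values.

```python
import re

# Assumed key shape: word characters at the start of the extension or right
# after a single space, followed by '='. An escaped '\=' inside a value does
# not match because the backslash is not a word character.
_KEY = re.compile(r"(?:^| )(\w+)=")


def split_extension(ext: str) -> dict[str, str]:
    """Split a CEF extension into key/value pairs following the quoted rules."""
    matches = list(_KEY.finditer(ext))
    pairs: dict[str, str] = {}
    for i, m in enumerate(matches):
        is_last = i == len(matches) - 1
        # A value runs up to the single space that introduces the next key,
        # so any extra spaces before that one stay in this value.
        value_end = len(ext) if is_last else matches[i + 1].start()
        value = ext[m.end():value_end]
        if is_last:
            # Trailing spaces are not preserved for the final pair.
            value = value.rstrip(" ")
        pairs[m.group(1)] = value
    return pairs


print(split_extension("msg=hello   src=10.0.0.1  "))
```

With three spaces before `src`, two of them stay in the `msg` value, and the trailing spaces after the final value are dropped.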
As I understand it, we can safely trim all space characters before the first key of the extension field.
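As a stopgap until the parser handles this, a preprocessing step along these lines could strip the spaces before the first key. This is only a sketch in Python, not Vector code: the `trim_before_first_key` helper is hypothetical, and it assumes the standard seven pipe-delimited header fields, with `\|` as the only escape it needs to respect.

```python
import re

# Split on pipes that are not preceded by a backslash. Simplification: an
# escaped backslash directly before a pipe ("\\|") is not handled here.
_UNESCAPED_PIPE = re.compile(r"(?<!\\)\|")


def trim_before_first_key(line: str) -> str:
    """Remove spaces between the last header pipe and the first extension key."""
    # Seven header fields, then everything else is the extension.
    parts = _UNESCAPED_PIPE.split(line, maxsplit=7)
    if len(parts) < 8:
        return line  # no extension field, leave the line untouched
    header, extension = parts[:7], parts[7]
    return "|".join(header + [extension.lstrip(" ")])


raw = "CEF:0|Incapsula|SIEMintegration|1|1|Illegal Resource Access|3| fileid=3412341160002518171"
print(trim_before_first_key(raw))
```

Only leading spaces in the extension are touched, so spacing between later key-value pairs keeps the meaning the spec gives it.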
Problem
I was trying to parse an Incapsula CEF log, which looks like:
(more examples can be found here)
The CEF parser could not parse this log message, probably because the last | sign is followed by a space. After I manually fixed the log by removing the space between the last | and the following fileid, it was parsed without any issue.
It would be great to modify the CEF parser so that it can also parse non-standard logs like this, which do occur in real life.
Configuration
Version
vector 0.29.1 (x86_64-unknown-linux-gnu 74ae15e 2023-04-20 14:50:42.739094536)
Debug Output
Example Data
CEF:0|Incapsula|SIEMintegration|1|1|Illegal Resource Access|3| fileid=3412341160002518171
Additional Context
No response
References
No response