Open shaeqahmed opened 2 years ago
Hi @shaeqahmed, I believe the forward slashes inside the connect_date value need to be whitelisted in the grok pattern.
This seemed to work for me (in case you were blocked by this):
$ parse_groks!("user=john connect_date=11/08/2017 id=123 action=click", [s'%{data::keyvalue("=", "/:")}'])
{ "action": "click", "connect_date": "11/08/2017", "id": 123, "user": "john" }
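To illustrate why the extra character list matters, here is a toy Python model of a keyvalue-style filter. It is a sketch under stated assumptions, not Vector's or Datadog's implementation: `parse_keyvalue` is a hypothetical name, and the default value character class (`\w.\-_@`) is taken from my reading of Datadog's keyvalue documentation. Unlike the output above, it does no type casting, so `id` stays a string. The model skips any pair whose value contains a disallowed character, which is the behavior this thread attributes to Datadog.

```python
import re

# Assumption: default characters allowed in an unquoted value.
DEFAULT_VALUE_CHARS = r"\w.\-_@"


def parse_keyvalue(text, separator="=", extra_chars=""):
    """Toy keyvalue filter: keep pairs whose value uses only allowed characters."""
    value_class = "[" + DEFAULT_VALUE_CHARS + re.escape(extra_chars) + "]+"
    pair = re.compile(
        r"(?:^|\s)(\w+)"          # key, at the start or after whitespace
        + re.escape(separator)    # literal separator, e.g. "="
        + "(" + value_class + ")"  # value built from allowed characters only
        + r"(?=\s|$)"             # value must end at whitespace or end of input
    )
    return {m.group(1): m.group(2) for m in pair.finditer(text)}


line = "user=john connect_date=11/08/2017 id=123 action=click"
print(parse_keyvalue(line))                    # connect_date is skipped
print(parse_keyvalue(line, extra_chars="/:"))  # connect_date is kept
```

In this model the slash stops the value match for connect_date only, and the other three pairs are still returned.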
However, your point about vector dropping all fields where datadog is only dropping the connect_date field is still valid.
It seems there are other cases where the vector behavior does not match that of datadog:
The Datadog example of setting a custom separator does not work as-is in Vector:
parse_groks!("user: john connect_date: 11/08/2017 id: 123 action: click", [s'%{data::keyvalue(": ")}'])
But this modification to it does:
parse_groks!("user: john connect_date: 11/08/2017 id: 123 action: click", [s'%{data::keyvalue(": ", "/")}'])
Discovered another issue with parse_groks:
> vector --version
vector 0.24.1 (x86_64-apple-darwin 8935681 2022-09-12)
> vector vrl
...
$ parse_groks!("4127 Register", ["%{NUMBER:.zeek.sip.sequence.number}"])
function call error for "parse_groks" at (0:70): unable to parse grok: value does not match any rule
$ parse_grok!("4127 Register", "%{NUMBER:.zeek.sip.sequence.number}")
{ ".zeek.sip.sequence.number": "4127" }
I believe parse_grok was migrated internally to call parse_groks with a single [pattern], so it seems like a bug that one works and the other does not. Also, isn't the parse_groks function supposed to default to nested values, so that this returns something like:
{ ".zeek.": { "sip": {"sequence": { "number": "4127" } } } }
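For reference, "nested values" here would mean expanding the dotted capture name into nested objects instead of keeping it as one flat key. A minimal Python sketch of that expansion follows, assuming the leading dot is only a path prefix and not part of a key name. `nest_field` is a hypothetical helper, not a Vector function.

```python
def nest_field(path, value):
    """Expand a dotted field path such as ".a.b.c" into nested dicts."""
    # Drop empty segments so a leading "." does not create an empty key.
    keys = [key for key in path.split(".") if key]
    nested = value
    # Wrap the value from the innermost key outwards.
    for key in reversed(keys):
        nested = {key: nested}
    return nested


print(nest_field(".zeek.sip.sequence.number", "4127"))
```

Under that assumption the top-level key would be `zeek` rather than `.zeek.`.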
@neuronull can you please take a look? Thanks!
Hi @shaeqahmed! Thanks for flagging this. Would you mind opening a new issue with these details so we can get that triaged?
I believe this original issue you filed still has standalone value for tracking this problem:
However, your point about vector dropping all fields where datadog is only dropping the connect_date field is still valid.
Gotcha, thanks. I'll open a separate ticket.
Hi, here are more examples of inconsistency:
vector -V
vector 0.41.1 (x86_64-unknown-linux-gnu 745babd 2024-09-11 14:55:36.802851761)
$ parse_grok!("/abc-arst-11323-arstars/err.txt", "/(?<test01>.[^/]*)/?")
{ "test01": "abc-arst-11323-arstars" }
$ parse_groks!("/abc-arst-11323-arstars/err.txt", patterns:[ "/(?<test01>.[^/]*)/?" ] )
function call error for "parse_groks" at (0:85): unable to parse grok: value does not match any rule
$ parse_groks!( "/123123/", patterns: [ "/%{NUMBER:nn:integer}/"] )
{ "nn": 123123 }
$ parse_grok!( "/123123/", "/%{NUMBER:nn:integer}/")
{ "nn:integer": "123123" }
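The last two results are consistent with the capture `%{NUMBER:nn:integer}` being read as three parts (matcher, field name, filter) by parse_groks but as only two parts (matcher, field name) by parse_grok. The Python sketch below only illustrates that difference. `split_capture` is a hypothetical helper, and this is a guess at the cause based on the outputs above, not Vector's actual parsing code.

```python
def split_capture(capture, max_fields):
    """Split the inside of a %{...} capture on ":" into at most max_fields parts."""
    inner = capture[len("%{"):-len("}")]
    return inner.split(":", max_fields - 1)


# Two-part reading: everything after the first colon is the field name.
print(split_capture("%{NUMBER:nn:integer}", 2))
# Three-part reading: the last part is a filter applied to the value.
print(split_capture("%{NUMBER:nn:integer}", 3))
```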
Problem
Vector's parse_grok/parse_groks functionality seems to behave similarly to Datadog's, but for a basic example copied from the docs, the result seems unexpected.
Log
Vector output (using VRL cli)
Datadog output
Expected output
Pretty sure this is a bug, as removing the connect_date field fixes the issue with the keyvalue parser, and dropping all the fields seems odd. I am new to Vector.
Configuration
No response
Version
vector 0.22.2 (x86_64-apple-darwin 0024c92 2022-06-15)
Debug Output
No response
Example Data
user=john connect_date=11/08/2017 id=123 action=click
Additional Context
No response
References
No response