Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

select_jsonpath() can't escape . #5815

Open mustafaqasim opened 5 years ago

mustafaqasim commented 5 years ago

I'm trying to use only Pipelines to process logs from Zeek (Bro IDS) and have hit a dead end. The JSON log has keys that contain a dot "." in their names. The JSON log I get in $message.message is as follows:

    {
      "ts": "2019-03-27T23:21:39.126902Z",
      "uid": "CKzRBx1OsfFfclbL1i",
      "id.orig_h": "192.168.70.140",
      "id.orig_p": 54274,
      "id.resp_h": "216.58.200.100",
      "id.resp_p": 443,
      "proto": "tcp",
      "conn_state": "OTH",
      "local_orig": true,
      "local_resp": false,
      "missed_bytes": 0,
      "history": "C",
      "orig_pkts": 0,
      "orig_ip_bytes": 0,
      "resp_pkts": 0,
      "resp_ip_bytes": 0,
      "orig_l2_addr": "00:0c:29:0d:b2:c3",
      "resp_l2_addr": "00:50:56:f5:03:c8"
    }

I'm parsing that JSON blob with parse_json() and then picking each JSON value individually and placing it into a new field. Lookups for keys that contain a . in their name, such as id.orig_h, fail: the path $.id.orig_h seems to be read as a nested object rather than as a single key.

    rule "parse conn.log"
    when
        $message.nsm_log == "zeek_conn"
    then
        let tmp=parse_json(to_string($message.message));
        set_fields(select_jsonpath(tmp, {conn_uid:"$.uid"}));
        set_fields(select_jsonpath(tmp, {protocol:"$.proto"}));
        set_fields(select_jsonpath(tmp, {src_ip:"$.id.orig_h"}));
        set_fields(select_jsonpath(tmp, {src_port:"$.id.orig_p"}));
        set_fields(select_jsonpath(tmp, {dest_ip:"$.id.resp_h"}));
        set_fields(select_jsonpath(tmp, {dest_port:"$.id.resp_p"}));
    end
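One plausible reading of the failure, sketched in Python rather than in the pipeline language: if dot notation splits the path on every ".", then `$.id.orig_h` looks for a key `id` that contains a key `orig_h`, which the flat Zeek record does not have. The `resolve_dotted` helper below is hypothetical and only models that splitting. It is not Graylog's implementation.

```python
import json


def resolve_dotted(doc, path):
    """Resolve a '$.a.b' style path by splitting on every dot (simplified model)."""
    if not path.startswith("$."):
        raise ValueError("path must start with '$.'")
    current = doc
    for key in path[2:].split("."):
        # Each segment is treated as one level of nesting.
        if not isinstance(current, dict) or key not in current:
            return None  # segment not found, so the whole lookup yields nothing
        current = current[key]
    return current


# A trimmed copy of the conn.log record from this issue.
record = json.loads(
    '{"uid": "CKzRBx1OsfFfclbL1i", "proto": "tcp", "id.orig_h": "192.168.70.140"}'
)

print(resolve_dotted(record, "$.uid"))        # plain key: found
print(resolve_dotted(record, "$.id.orig_h"))  # split into "id" then "orig_h": not found
print(record["id.orig_h"])                    # the key exists when addressed as a whole
```

Under this model the value is present in the parsed JSON, but a dot-notation path cannot address it.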

Expected Behavior

I should be able to pick the object id.orig_h by escaping the . character.

Current Behavior

The set_fields() calls that select such keys silently do nothing, and those fields never appear on the message. The remaining set_fields() calls work as expected.

Possible Solution

Enable select_jsonpath() to handle keys that have a dot "." in their names by allowing the dot to be escaped.

Steps to Reproduce (for bugs)

  1. Ingest JSON-formatted Zeek logs.
  2. Use the above pipeline rule to split the JSON blob in $message.message into individual fields.

Context

Zeek is a network security monitoring application which generates metadata logs on network connections.

Your Environment

philross88 commented 3 years ago

+1 to this issue. I am facing the same problem and seem to have hit a roadblock.

select_jsonpath() doesn't accept the valid expression. For example, to select id.orig_h we should be able to use $["id.orig_h"]; however, the rule rejects both double and single quotes here:

    let new_fields = select_jsonpath(m, { ts: "$.ts", uid: "$.uid", id_orig_h: '$["id.orig_h"]' });

Has anyone figured this out yet?
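For reference, JSONPath bracket notation quotes the whole key, so the dot becomes part of the name instead of a separator. Here is a rough Python sketch of that distinction. `split_path` and `lookup` are hypothetical helpers that handle only `$`, `.name`, and quoted `['name']` / `["name"]` segments. They say nothing about what the Graylog rule parser accepts.

```python
import re

# One path segment: either .name or a quoted key in brackets.
SEGMENT = re.compile(r"""\.(\w+)|\[\s*(['"])(.*?)\2\s*\]""")


def split_path(path):
    """Split a simplified JSONPath expression into a list of keys."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    keys = []
    pos = 1
    while pos < len(path):
        match = SEGMENT.match(path, pos)
        if match is None:
            raise ValueError(f"cannot parse path at offset {pos}: {path!r}")
        # Group 1 is a dot-notation name, group 3 is a bracket-quoted key.
        keys.append(match.group(1) if match.group(1) is not None else match.group(3))
        pos = match.end()
    return keys


def lookup(doc, path):
    """Walk the document one key at a time; return None if any key is missing."""
    current = doc
    for key in split_path(path):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


record = {"ts": "2019-03-27T23:21:39.126902Z", "id.orig_h": "192.168.70.140"}

print(split_path("$.id.orig_h"))          # two keys: id, orig_h
print(split_path('$["id.orig_h"]'))       # one key: id.orig_h
print(lookup(record, '$["id.orig_h"]'))   # resolves against the flat record
```

The sketch only shows why the bracket form is the natural way to address such keys; the open question in this thread is how to pass that form through the rule language's string quoting.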