counteractive / o365beat

Elastic Beat for fetching and shipping Office 365 audit events

Parsing Extended Properties #41

Closed ion-storm closed 4 years ago

ion-storm commented 4 years ago

Can any parsing be done on the client side for this? In Graylog I found that three regex replacements convert it to proper JSON so the fields can be broken out: replace ,[\r\n]+ "Value": " with :", replace [\r\n]+ "Value": " with ", and replace ,[\r\n]+ " with :".

chris-counteractive commented 4 years ago

Great question, @ion-storm - the answer is "not yet" because we hadn't imported the script processor from libbeat until you brought this up.

I just pushed 9f1646f213e2a106b3e9a306d96f01833118e622, which imports that processor and adds an example processor to o365beat.dev.yml that does what you're asking. In short, you can do the following:

processors:
  - script:
      when:
        or:
          - has_fields: ['Parameters']
          - has_fields: ['ExtendedProperties']
      lang: javascript
      id: name_value_array_parser
      source: >
        function process(event){
          var processed = event.Get('processed') || {};
          var parameters = event.Get('Parameters');
          if(!!parameters && !!parameters.length){
            processed.Parameters = processed.Parameters || {};
            for(var i = 0; i < parameters.length; i++){
              var p = parameters[i];
              if(p.Name) processed.Parameters[p.Name] = p.Value;
            }
          }
          var extendedProperties = event.Get('ExtendedProperties');
          if(!!extendedProperties && !!extendedProperties.length){
            processed.ExtendedProperties = processed.ExtendedProperties || {};
            for(var i = 0; i < extendedProperties.length; i++){
              var p = extendedProperties[i];
              if(p.Name) processed.ExtendedProperties[p.Name] = p.Value;
            }
          }
          event.Put('processed', processed);
        }

This creates a field called "processed" with sub-fields for Parameters and ExtendedProperties. In the original event, each of those fields contains an array of name-value pairs; the script loops through those pairs and uses the names as keys, so

"ExtendedProperties": [{"Name":"UserAgent","Value":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"},{"Name":"UserAuthenticationMethod","Value":"12"},{"Name":"RequestType","Value":"OAuth2:Authorize"},{"Name":"ResultStatusDetail","Value":"Success"},{"Name":"KeepMeSignedIn","Value":"False"}]

becomes

"processed":{"ExtendedProperties":{"UserAuthenticationMethod":"12","RequestType":"OAuth2:Authorize","ResultStatusDetail":"Success","KeepMeSignedIn":"False","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"}}
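To try the same transformation outside the beat, here is a minimal standalone sketch of the name-value flattening. It is an illustration, not code from o365beat: the helper name flattenNameValuePairs and the trimmed sample data are assumptions. It sticks to ES5-style syntax so it resembles what the script processor runs.

```javascript
// Hypothetical helper (not part of o365beat): collapse an array of
// {Name, Value} objects into a single object keyed by Name.
function flattenNameValuePairs(pairs) {
  var result = {};
  if (!pairs || !pairs.length) return result;
  for (var i = 0; i < pairs.length; i++) {
    var p = pairs[i];
    // Entries without a Name are skipped, as in the processor above.
    if (p && p.Name) result[p.Name] = p.Value;
  }
  return result;
}

// Trimmed sample modeled on the ExtendedProperties example above.
var extendedProperties = [
  { Name: 'RequestType', Value: 'OAuth2:Authorize' },
  { Name: 'ResultStatusDetail', Value: 'Success' },
  { Name: 'KeepMeSignedIn', Value: 'False' }
];

var flattened = flattenNameValuePairs(extendedProperties);
console.log(JSON.stringify(flattened));
// prints {"RequestType":"OAuth2:Authorize","ResultStatusDetail":"Success","KeepMeSignedIn":"False"}
```

One consequence of keying by name: if two entries share a Name, the later one overwrites the earlier one.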

A few caveats:

I'll close this issue when I've rolled this into a release. Thank you for the issue!

chris-counteractive commented 4 years ago

FYI: the script processor is powerful, but it only supports ECMAScript 5.1 (via https://github.com/dop251/goja), so you don't get things like ES6 arrow functions or Array.forEach. Again, I'm not sure about the performance implications in your specific circumstance.
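As a rough illustration of writing within that limit, the sketch below shows an arrow-function one-liner (in a comment only) next to an ES5.1-friendly equivalent that uses a plain function and an index-based loop. The helper name collectNames and the sample data are hypothetical; this runs in any JavaScript engine and has not been tested inside the script processor itself.

```javascript
// ES6 arrow-function form, shown only for contrast:
//   var names = pairs.map(p => p.Name);

// ES5.1-friendly equivalent (hypothetical helper name):
// a plain function declaration and an index-based for loop.
function collectNames(pairs) {
  var names = [];
  for (var i = 0; i < pairs.length; i++) {
    names.push(pairs[i].Name);
  }
  return names;
}

var names = collectNames([
  { Name: 'UserAuthenticationMethod', Value: '12' },
  { Name: 'RequestType', Value: 'OAuth2:Authorize' }
]);
console.log(names.join(','));
// prints UserAuthenticationMethod,RequestType
```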

chris-counteractive commented 4 years ago

Also, working through this I noticed that when ExtendedProperties and Parameters are converted to strings using the convert processor, it doesn't serialize them into JSON. It gets close, but the string output is missing commas between objects in an array. We'll need better serialization there if people are going to try to parse those fields on the server side without undue hassle.
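To show why the missing commas matter for server-side parsing, here is a small standalone sketch. The "missing commas" string is a hypothetical stand-in built by deleting the separator from valid JSON; it is not actual convert processor output, whose exact format isn't shown in this thread. Whether JSON.stringify behaves the same way on values returned by event.Get inside the script processor is also an assumption.

```javascript
var extendedProperties = [
  { Name: 'RequestType', Value: 'OAuth2:Authorize' },
  { Name: 'KeepMeSignedIn', Value: 'False' }
];

// Proper JSON serialization keeps the comma between array elements,
// so a consumer can parse the string back into the same structure.
var serialized = JSON.stringify(extendedProperties);
var roundTripped = JSON.parse(serialized);

// Hypothetical illustration of the problem: the same text with the
// comma between the two objects removed (not real beat output).
var missingCommas = serialized.replace('},{', '} {');

var parseFailed = false;
try {
  JSON.parse(missingCommas);
} catch (e) {
  parseFailed = true; // invalid JSON: the parser rejects the string
}

console.log(roundTripped[1].Name, parseFailed);
// prints KeepMeSignedIn true
```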

chris-counteractive commented 4 years ago

Included in release v1.5.1, along with docs in the README.