brexhq / substation

Substation is a toolkit for routing, normalizing, and enriching security event and audit logs.
https://substation.readme.io
MIT License

feat(transform): Add MetaRetry Transform #222

Closed jshlbrd closed 2 months ago

jshlbrd commented 2 months ago

Description

Motivation and Context

This is a better implementation of what was previously in the retry_with_backoff example. It can be used to add strict guarantees when enriching data from an external source (such as a REST API or KV store). For example, this can be used to:

This transform eventually returns a limit exceeded error, which can be caught with the MetaErr transform if needed. The caveat is that catching the error can result in data loss for some transforms (the only ones I noticed are AggregateTo*; Send* transforms do not lose data). The default behavior of the packaged applications is to crash on error, so I'm not too concerned about lossy transformation since:

How Has This Been Tested?

Here are more configs that can be tested with a simple JSON event (like {"a":"b"}):

Retry All Errors

This is retried three times and fails:

    sub.tf.meta.retry({
      transforms: [
        sub.tf.util.err({ message: 'test err'}),
      ],
      retry: { delay: '1s', count: 3, error_messages: [".*"] },
    }),

Retry Specific Errors

This is not retried and fails:

    sub.tf.meta.retry({
      transforms: [
        sub.tf.util.err({ message: 'test err'}),
      ],
      retry: { delay: '1s', count: 3, error_messages: ["^err"] },
    }),
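
The reason this config is not retried can be checked with Go's regexp package. The sketch below assumes each entry in error_messages is a regular expression matched against the error text, which the `".*"` and `"^err"` values above suggest; `shouldRetry` is a hypothetical helper, not a Substation function.

```go
package main

import (
	"fmt"
	"regexp"
)

// shouldRetry is a hypothetical helper: it reports whether msg matches
// any of the configured patterns.
func shouldRetry(msg string, patterns []string) bool {
	for _, p := range patterns {
		// MustCompile panics on an invalid pattern, which is fine for a sketch.
		if regexp.MustCompile(p).MatchString(msg) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldRetry("test err", []string{".*"}))   // true
	fmt.Println(shouldRetry("test err", []string{"^err"})) // false
}
```

The `^` anchor requires the message to start with "err", and "test err" does not, so the error falls through without a retry.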

Retry Aggregate Transform

This fails if the data is put into the y array (the second case), because the condition only checks that x is non-empty. The failure isn't known until a ctrl message is received, and the data put into the array is lost on retry.

    sub.tf.meta.retry({
      transforms: [
        sub.tf.meta.switch({ cases: [
          {
            transforms: [
              sub.tf.agg.to.arr({object: { target_key: 'x'}}),
            ],
            condition: sub.cnd.all([sub.cnd.utility.random()]),
          },
          {
            transforms: [
              sub.tf.agg.to.arr({object: { target_key: 'y'}}),
            ],
          }
        ]}),
      ],
      condition: sub.cnd.all([
        sub.cnd.num.len.gt({ object: { source_key: 'x'}, value: 0 }),
      ]),
      retry: { delay: '1s', count: 3 },
    }),

Retry Send Aux Transform

This retries until it succeeds (no count is set). The failure isn't known until a ctrl message is received, and the data passed to the send transform is not lost on retry.

    sub.tf.meta.retry({
      transforms: [
        sub.tf.send.stdout({ aux_tforms: [
          sub.tf.meta.switch({ cases: [{
            transforms: [
              sub.tf.util.err({ message: 'test err'}),
            ],
            condition: sub.cnd.all([sub.cnd.utility.random()]),
          }]})
        ]}),
      ],
      retry: { delay: '1s', error_messages: ['test err'] },
    }),

Types of changes

Checklist: