Adds the MetaRetry transform for retrying transform functions.
Adds urlscan as an example of how the transform works with asynchronous REST APIs.
Motivation and Context
This is a better implementation of what was previously in the retry_with_backoff example and can be used to add strict guarantees when enriching data with an external source (such as a REST API or KV store). For example, this can be used to:
Retry forever until the transform produces an expected result based on a condition
Retry any number of times until the transform returns without error
Retry if an error occurs or a condition fails
This transform eventually returns a limit exceeded error, which can be caught with the MetaErr transform if needed. The caveat is that catching the error can result in data loss for some transforms (the only ones I noticed are AggregateTo*; it does work for Send* transforms). The default behavior of the packaged applications is to crash on error, so I'm not too concerned about lossy transformation since:
Errors propagate to the apps, which should trigger retry features in the producer service (e.g. AWS Kinesis, AWS S3)
Users have to opt into it (by using MetaErr)
This information will be documented in the README.
In future releases this may supersede retry strategies built into other transforms.
How Has This Been Tested?
Integration tested using new and updated examples.
Here are more configs that can be tested with a simple JSON event (like {"a":"b"}):
Retry All Errors
This is retried three times and fails:
Retry Specific Errors
This is not retried and fails:
Retry Aggregate Transform
This fails if the data is put into the Y aggregate array transform. The failure isn't known until a ctrl message is received, and the data put into the array is lost on retry.
Retry Send Aux Transform
This retries forever until it succeeds. The failure isn't known until a ctrl message is received, and the data put into the send transform is not lost on retry.