openownership / register

A demonstration transnational register of beneficial ownership data from the UK, Denmark, Slovakia and Armenia
https://register.openownership.org
GNU Affero General Public License v3.0

PSC-STM-B2: Add failure handling for invalid stream pointer #250

Open tiredpixel opened 5 months ago

tiredpixel commented 5 months ago

When processing the Kinesis stream, if the stream pointer given is invalid, the app should refetch the latest stream pointer for each shard.

In practice, this should not happen unless the app has crashed and not run for 24 hours or so (in which case, the stream data it is trying to access would have expired), or alternatively if the stream has rebalanced its shards.

If the stream pointer being used is invalid, an error should be logged somewhere (e.g. Rollbar), as it is a sign that some data may have been missed.
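A minimal sketch of the refetch-per-shard idea described above, not the app's actual code. The client is injected and only needs to respond to `get_shard_iterator` with the keyword arguments used by the AWS SDK's Kinesis client; `InvalidPointerError`, `shard_iterators`, `latest_iterator` and the `stored_pointers` hash are hypothetical names introduced for illustration.

```ruby
require 'logger'

# Hypothetical error standing in for whatever is raised on an invalid pointer.
class InvalidPointerError < StandardError; end

# Fetches an iterator positioned at the latest record of one shard.
def latest_iterator(client, stream_name, shard_id)
  client.get_shard_iterator(
    stream_name: stream_name,
    shard_id: shard_id,
    shard_iterator_type: 'LATEST'
  ).shard_iterator
end

# Returns { shard_id => shard_iterator }. A stored sequence number is used
# when present; if it is rejected, the problem is logged (data may have been
# missed) and the latest pointer for that shard is fetched instead.
def shard_iterators(client, stream_name, shard_ids, stored_pointers, logger: Logger.new($stderr))
  shard_ids.to_h do |shard_id|
    sequence_number = stored_pointers[shard_id]
    iterator =
      begin
        if sequence_number
          client.get_shard_iterator(
            stream_name: stream_name,
            shard_id: shard_id,
            shard_iterator_type: 'AFTER_SEQUENCE_NUMBER',
            starting_sequence_number: sequence_number
          ).shard_iterator
        else
          latest_iterator(client, stream_name, shard_id)
        end
      rescue InvalidPointerError => e
        logger.error("Invalid stream pointer for shard #{shard_id}: #{e.message}")
        latest_iterator(client, stream_name, shard_id)
      end
    [shard_id, iterator]
  end
end
```

Handling the fallback inside the per-shard loop means one bad pointer does not affect the other shards.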

Estimate: 4 hours

tiredpixel commented 3 months ago

Similar to https://github.com/openownership/register/issues/243 , the approach I've chosen is to log, clear the invalid data from Redis, and reraise. This will crash the app, but once it restarts, consuming from the stream should recover.
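The log, clear, reraise sequence could look roughly like the sketch below. It is an illustration under assumptions rather than the code in the repository: `StaleSequenceError`, `consume_with_recovery` and `pointer_key` are hypothetical names, and `store` is anything responding to `del(key)`, as a redis-rb client does.

```ruby
require 'logger'

# Hypothetical stand-in for the error raised on an invalid stream pointer.
class StaleSequenceError < StandardError; end

# Runs the given block (the stream consumer). If the stored pointer turns out
# to be invalid, logs the problem, deletes the stored pointer so the next run
# starts cleanly, and reraises so the process exits and can be restarted.
def consume_with_recovery(store, pointer_key, logger: Logger.new($stderr))
  yield
rescue StaleSequenceError => e
  logger.error("Invalid stream pointer under #{pointer_key}: #{e.message}")
  store.del(pointer_key)
  raise
end
```

Clearing the key before reraising is what lets the restart recover: without it, the app would read the same invalid pointer again and crash in a loop.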

Note that this can happen per shard, which could lead to more restarts than necessary under this approach. However, we're only using a single shard per stream, and even with multiple shards the extra restarts should be fine, so I think that's okay for our present case.

Note also that two types of error can come from Kinesis, depending on whether the invalid pointer/sequence number has expired or was never valid in the first place (specifically, is being used on the wrong shard). Since the AWS SDK doesn't raise errors nicely for Kinesis, these are caught and reraised as custom exceptions, matched on the error message.

Both are reraised as RegisterCommon::Services::SequenceError, however.
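As a rough sketch of the message-matching approach: the patterns below are placeholders, not the actual Kinesis error messages, and the subclass names are hypothetical. `SequenceError` here is a local stand-in for `RegisterCommon::Services::SequenceError`, and the SDK error class is passed in as a parameter (with the real SDK it would presumably be something like `Aws::Kinesis::Errors::InvalidArgumentException`, which is an assumption).

```ruby
# Local stand-in for RegisterCommon::Services::SequenceError.
class SequenceError < StandardError; end
# Hypothetical subclasses for the two cases.
class ExpiredSequenceError < SequenceError; end
class MismatchedSequenceError < SequenceError; end

# Placeholder patterns; the real messages returned by Kinesis will differ.
SEQUENCE_ERROR_PATTERNS = {
  /expired/i => ExpiredSequenceError,
  /wrong shard/i => MismatchedSequenceError
}.freeze

# Runs the block and translates matching SDK errors into SequenceError
# subclasses. Errors that match no pattern are reraised unchanged.
def with_sequence_errors(sdk_error_class)
  yield
rescue sdk_error_class => e
  _pattern, error_class = SEQUENCE_ERROR_PATTERNS.find { |pattern, _| pattern.match?(e.message) }
  raise unless error_class

  # Ruby records the original SDK error as #cause on the new exception.
  raise error_class, e.message
end
```

Because both subclasses inherit from the shared parent, callers can rescue `SequenceError` alone and treat the two cases identically.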

Logging of errors will be handled in https://github.com/openownership/register/issues/253 .