We need to retry quickly. However, there is a potential race condition for on-chain payments that sign as part of the Authorization step (such as Solana) if we retry too fast:
1. Authorize call 1 (signs)
2. Authorize call 2 starts after 1 begins but before it completes (signs)
3. Authorize call 1 finishes before 2
4. Pay 1 is called after Authorize 1 finishes, but before Authorize 2 finishes
5. Pay 2 is called after Authorize 2 finishes but before Pay 1 finishes
6. Double pay
In the current implementation, this risk is mitigated by proceeding to an Authorized state before generating a signed transaction, ensuring that retries go straight to Pay(). However, this creates another race condition which, while not a double pay risk, results in undesirable failures:
1. Authorize call
2. Pay call before Authorize persists a signed transaction
3. Pay finds ExternalIdempotency data missing and fails immediately
While this doesn't present a double pay risk, it means that we fail unnecessarily when retrying fast. The following changes are needed to avoid both the race condition and the failure:
Do not proceed to the Authorized state until after the signed transaction is persisted to QLDB
Do not update ExternalIdempotency data in QLDB if it is already present
This means that in the first race condition, where two Authorize calls are racing, only the first one to persist will have its signed transaction stored in QLDB. A subsequent race between two Pay calls will then be attempting to submit the same signed transaction, so there is no risk of double payment.
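The two changes can be sketched as a write-once persist step. This is a minimal illustration, not the actual implementation: InMemoryLedger is a hypothetical in-memory stand-in for QLDB (a lock emulates a serialized transaction), and the names persist_signed_tx_if_absent, authorize, and pay are assumptions made for the example. The signed transaction and the Authorized state are written in the same atomic step, so Pay never observes Authorized without ExternalIdempotency data, and a second Authorize never overwrites the first signature.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Payment:
    state: str = "Created"
    external_idempotency: Optional[str] = None  # the persisted signed transaction


class InMemoryLedger:
    """Hypothetical stand-in for QLDB; the lock emulates a serialized transaction."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._payments: Dict[str, Payment] = {}

    def persist_signed_tx_if_absent(self, payment_id: str, signed_tx: str) -> str:
        """Store signed_tx only if none exists yet; return whichever one is stored."""
        with self._lock:
            payment = self._payments.setdefault(payment_id, Payment())
            if payment.external_idempotency is None:
                # First writer wins: persist the signature and only then
                # expose the Authorized state (same atomic step).
                payment.external_idempotency = signed_tx
                payment.state = "Authorized"
            return payment.external_idempotency

    def get(self, payment_id: str) -> Optional[Payment]:
        with self._lock:
            return self._payments.get(payment_id)


def authorize(ledger: InMemoryLedger, payment_id: str,
              sign: Callable[[str], str]) -> str:
    """Sign, then persist without overwriting an earlier signature."""
    signed_tx = sign(payment_id)
    return ledger.persist_signed_tx_if_absent(payment_id, signed_tx)


def pay(ledger: InMemoryLedger, payment_id: str) -> str:
    """Return the signed transaction to submit; fail if not yet Authorized."""
    payment = ledger.get(payment_id)
    if payment is None or payment.state != "Authorized":
        raise RuntimeError("payment is not Authorized yet")
    # Authorized implies the signed transaction is present.
    return payment.external_idempotency
```

With this shape, two racing Authorize calls may each sign, but every later Pay call reads the same stored transaction, which matches the behaviour described above.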