fhir-notification design can lead to false notifications for failed transactions

Describe the bug The current fhir-notification engine is built on top of our FHIRPersistenceInterceptor mechanism. Specifically, we emit the events when we invoke the afterX methods.

The problem with this is that those afterX methods are called before the transaction is committed. Therefor, its still possible for the transaction to fail after the event is fired.

Environment main

To Reproduce Steps to reproduce the behavior:

enable notifications (any impl)
force a transaction to fail AFTER getInterceptorMgr().fireAfterDeleteEvent(event); is called

note that subscibers still receive the event even though the create/update/delete change wasn't committed

Expected behavior events are only fired for interactions that complete successfully

Additional context I think it would be better to fire the events AFTER the transaction is committed. Unfortunately, that means that (unless we change the design more significantly) this could lead to a greater chance of missing events (e.g. event submission fails after transaction commits), but missing events seems much better than erroneous ones to me.

This could take one of a few flavors: A. keep it based on FHIRPersistenceInterceptor but add new afterX flavors that fire after the transaction commits B. make it work more like audit events which are now fired after transaction completion

Note: Paul mentioned that the current implementation is already using "commit async" to fire the events on NATS/Kafka, and so we already have the potential for failed submission of events.

Alternatively, we could move the event notification completely out of the request processing flow and fire them based on polling the history table/endpoints (or something like that).

Potentially related, but different, from https://github.com/IBM/FHIR/issues/1957

LinuxForHealth / FHIR

fhir-notification design can lead to false notifications for failed transactions #3071