Particular / ServiceInsight

Advanced debugging for NServiceBus
http://particular.net/serviceinsight

Feature request: Ability to retry successful messages from audit queue #724

Open swettstein opened 7 years ago

swettstein commented 7 years ago

There are times we would like to retry a successful message as a way to smoke test a new deployment.

HEskandari commented 7 years ago

Thanks for raising this issue. We're evaluating ways to enhance 'debugging' aspects of ServiceInsight and the ability to edit and replay a message was brought up in that discussion.

Would you elaborate on the 'smoke test a new deployment' bit? When changing a message, wouldn't you have to test it in a dev environment instead? What are the scenarios that require you to test this later, e.g. after deployment in a staging environment (if that's what you meant)? Would this be something that you'd rather do as part of your development?

ping @swettstein

webprofusion-chrisc commented 6 years ago

I have an example whereby the message from an external system has been successfully processed, but the current version of the system has swallowed the contents. Being able to replay a set of messages (say within a time range and matching a type) would allow the production system to be fixed, then the missed messages replayed. Obviously this replay would need to be audited also. I can see it's fairly niche functionality.

WilliamBZA commented 6 years ago

We have this feature on our radar, but there's no sign (or guess even) of when it will get prioritized or implemented. There's also a fairly big discussion going on internally about the implications of replaying a message that has semantic meaning.

As an alternative in the meantime: you can implement a similar feature by using the audit forwarding capabilities in ServiceControl to forward all audit messages to a standalone "audit buffer" endpoint.

This endpoint will store those messages for a period of time, and delete them after that period has elapsed. If a message needs to be replayed, that endpoint can send the message as-is to the destination endpoint.
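
To make the store, expire, and replay behaviour described above concrete, here is a minimal in-memory sketch in Python. It is not ServiceControl or NServiceBus code: `AuditBuffer`, `BufferedMessage`, `store`, `purge_expired`, `replay`, and the `send` callback are all hypothetical names chosen for illustration. A real endpoint would use durable storage and the actual transport.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class BufferedMessage:
    message_id: str
    destination: str          # endpoint that originally processed the message
    headers: Dict[str, str]
    body: bytes
    stored_at: float          # clock value when the audit copy arrived


class AuditBuffer:
    """Hypothetical stand-in for a standalone "audit buffer" endpoint."""

    def __init__(self, retention_seconds: float,
                 clock: Callable[[], float] = time.time):
        self._retention = retention_seconds
        self._clock = clock
        self._messages: Dict[str, BufferedMessage] = {}

    def store(self, message_id: str, destination: str,
              headers: Dict[str, str], body: bytes) -> None:
        # Keep an unmodified copy of the forwarded audit message.
        self._messages[message_id] = BufferedMessage(
            message_id, destination, dict(headers), body, self._clock())

    def purge_expired(self) -> int:
        # Delete every message older than the retention period.
        cutoff = self._clock() - self._retention
        expired = [mid for mid, msg in self._messages.items()
                   if msg.stored_at < cutoff]
        for mid in expired:
            del self._messages[mid]
        return len(expired)

    def replay(self, message_id: str,
               send: Callable[[str, Dict[str, str], bytes], None]) -> bool:
        # Send the stored message as-is to its original destination.
        self.purge_expired()
        msg = self._messages.get(message_id)
        if msg is None:
            return False      # unknown id, or already past retention
        send(msg.destination, msg.headers, msg.body)
        return True
```

The `clock` parameter is injectable only so that retention can be exercised without waiting; the sketch also assumes the destination endpoint tolerates receiving the same message twice.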

mikeminutillo commented 6 years ago

There are some complexities that you should watch out for if you're going to implement your own audit buffer for message replay:

webprofusion-chrisc commented 6 years ago

Yeah, it sounds unlikely to be foolproof either way. The original issue raised by @swettstein was the idea of taking some messages from a prod system and replaying them in UAT or testdev. My take on it was more this: if you have a steam-driven mainframe in the basement which eventually pumps out messages (say, someone's mortgage contract or a bank account transfer), and you mishandled a message (e.g. a few weeks ago), and replaying from source is going to be expensive or difficult, then the ability to recreate and run a bunch of previously sent messages would be useful. Granted, it's maybe a bit of a niche scenario.

swettstein commented 6 years ago

Another use case we have is when one message spawns a bunch of downstream messages and one of them gets corrupted. We'd like to be able to replay that initial entry message. We use GZIP to compress messages to comply with the 256k limit of Azure Service Bus and sometimes when we replay deadletter messages from Service Bus Explorer, it corrupts the compressed data.

Also of note: Our messages are all designed to be idempotent and we have checks in place to make sure newer data is not overwritten by older data.
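
As a rough illustration of the two points above (GZIP-compressed bodies and a guard against older data overwriting newer data), here is a minimal Python sketch. The JSON payload shape, the `id` and `version` fields, and the helper names `pack`, `unpack`, and `apply_if_newer` are assumptions made for the example, not our actual message contracts.

```python
import gzip
import json

MAX_BODY_BYTES = 256 * 1024   # the 256k limit mentioned above


def pack(payload: dict) -> bytes:
    # Compress the serialized payload and refuse anything still too large.
    body = gzip.compress(json.dumps(payload).encode("utf-8"))
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("compressed body exceeds the size limit")
    return body


def unpack(body: bytes) -> dict:
    # Reverse of pack(); a corrupted body makes decompression raise.
    return json.loads(gzip.decompress(body).decode("utf-8"))


def apply_if_newer(store: dict, body: bytes) -> bool:
    # Idempotent handler: apply a message only if it is newer than
    # what is already stored for the same id (hypothetical fields).
    msg = unpack(body)
    current = store.get(msg["id"])
    if current is not None and current["version"] >= msg["version"]:
        return False          # replayed or stale message, ignore it
    store[msg["id"]] = msg
    return True
```

With a guard like this, replaying an old message from the audit queue is a no-op rather than a regression, which is what makes replay safe for us.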

fcastellscha commented 6 years ago

I would like to have this feature as well. My scenario is an endpoint listening to an event that, due to a bug in the logic that decided to either take an action or not, swallowed many messages without taking any action. After fixing the bug, replaying the messages would cause the action to be executed.

dikoga commented 6 years ago

Similar need. In my case, I have integrations with a third-party system that had an issue and swallowed all messages without taking the correct action. They have asked if we could replay them.

How could I replay from the audit queue? Is this audit the one configured in SC?

pulverize commented 4 years ago

I'm interested in this functionality as well for production support. If an endpoint consumes a message and fails to process it correctly or completely, but does not throw an exception in the process, we would very much like to be able to replay that message. Even if the functionality were closer to "clone and send" than "replay", being able to cause the contents of a message consumed by the audit queue to be reprocessed by an endpoint would add value we thought we had when making the purchase decision.

mikeminutillo commented 1 year ago

A customer expressed interest in this feature on the forum.

FabianTrottmann commented 1 year ago

Is there a workaround to do this, e.g. by modifying the state of the message in RavenDB?

boblangley commented 1 year ago

@FabianTrottmann No, audit messages are now stored in an entirely different database.