Particular / NServiceBus.RabbitMQ

RabbitMQ transport for NServiceBus
https://docs.particular.net/nservicebus/rabbitmq/

Return addresses with RabbitMQ sitting on separate box #22

Closed peuramaki closed 10 years ago

peuramaki commented 10 years ago

I have an NSB setup using the RabbitMQ transport, and I'm running RabbitMQ on a separate server (it will be a RabbitMQ cluster in production).

It looks like message return addresses are set incorrectly in this scenario.

  1. Service A sends a message to Service B
  2. Saga on Service B is started
  3. Service B does its thing (long-running)
  4. Service B calls Saga.ReplyToOriginator

In this case the originator of the saga points at queue.NSBMachineA@NSBMachineA, but I think it should point at queue.NSBMachineA@RabbitMQMachine.

Currently, the reply never arrives at Service A (it does if I run RabbitMQ on the same box as NSB).

andreasohlund commented 10 years ago

How are you hosting this?

Just a hunch but can you force UseSingleBrokerQueue to true?

Configure.ScaleOut(s=>s.UseSingleBrokerQueue());

peuramaki commented 10 years ago

I'm hosting both services with NServiceBus.Host. Service A is hosted AsA_Client and Service B is hosted AsA_Server. Service A is actually an integration test project. Both services are sitting on the same box; RabbitMQ is on a separate box.

Configuring services to use single broker queue has no effect on behavior.

Currently, I'm working around the issue by

This works alright for now, since I can do everything using interceptors.

peuramaki commented 10 years ago

Apparently, the timeout messages get the wrong machine name as well. I tracked the issue down to TimeoutManager.Initialize(). DispatcherAddress always gets the name of the processing machine, not the queue machine.

I couldn't figure out a workaround that doesn't involve reflection, so I'm overriding TimeoutManager.DispatcherAddress by force.

andreasohlund commented 10 years ago

Can you share your message mappings from config? (Something sounds weird; the RabbitMQ transport should ignore all machine names if UseSingleBrokerQueue is true.)

peuramaki commented 10 years ago

Sure. These are on service B:

    <MessageEndpointMappings>
      <add Messages="System.Storage.Messages.V1" Endpoint="Customer.Storage@server01.localdomain" />
      <add Messages="System.Config.Messages.V1" Endpoint="Customer.Storage@server01.localdomain" />
    </MessageEndpointMappings>

and this is on service A (integration test client)

    <MessageEndpointMappings>
      <add Messages="Customer.Storage.Adapter.Messages.V1" Endpoint="Customer.Storage.Adapter.IntegrationTest@server01.localdomain" />
    </MessageEndpointMappings>

Afaik, you just described the problem - ignoring machine name even if I have set it. The name of the processing machine always pops up.

andreasohlund commented 10 years ago

Now I'm confused. Machine names are ignored, since all endpoints connect to the same broker, which you specify in the connection string.

What is it you're trying to achieve by putting machine names in the message mappings?


peuramaki commented 10 years ago

Why, I'm trying to force NSB to use my RabbitMQ box instead of NSB box of course, using the approach of trying anything. It's not working though, behavior is the same with or without the specified queue server.

If you take a look at TimeoutManager.Initialize()

DispatcherAddress = Address.Parse(Configure.EndpointName).SubScope("TimeoutsDispatcher");

the result always points to the local box, not the RabbitMQ box defined in the transport config. I haven't been able to work around this. There's a promisingly named method

Address.OverrideDefaultMachine(rabbitMqHost);

but it doesn't seem to have any effect either.

andreasohlund commented 10 years ago

I think I'm missing something. The way to specify which broker you connect to is through the "NServiceBus/Transport" connection string:

https://github.com/Particular/NServiceBus.RabbitMQ.Samples/blob/master/VideoStore.RabbitMQ/VideoStore.Sales/App.config#L11

Is that not working for you?


peuramaki commented 10 years ago

Yes, I have defined the transport connection string

<add name="NServiceBus/Transport" connectionString="host=rabbitmq01.localdomain" />

And this is working out for me, except when

  • I use Saga.ReplyToOriginator
  • Timeouts are being used
  • SLRs are being used

And I'm not using explicit timeouts in the problematic saga, but NSB seems to use them sometimes (I'm not sure under what conditions). The problem is non-deterministic: sometimes it works, sometimes it doesn't.

andreasohlund commented 10 years ago

Any chance you can create a little sample project I can run to expose this?

peuramaki commented 10 years ago

I wrote a little project that should reproduce the issue but it doesn't..

I've done some more debugging though. Apparently, using rabbit transport and UseSingleBrokerQueue=true, NServiceBus.Address.Machine never gets used. Instead, address to send messages is always defined in rabbit channel that gets it from NServiceBus/Transport configuration. Am I correct?

Current suspect for the cause of this issue is RabbitMqUnitOfWork - sometimes message sending actions do get added to UoW, but are never run.

andreasohlund commented 10 years ago

I've done some more debugging though. Apparently, using rabbit transport and UseSingleBrokerQueue=true, NServiceBus.Address.Machine never gets used. Instead, address to send messages is always defined in rabbit channel that gets it from NServiceBus/Transport configuration. Am I correct?

That is correct!

Current suspect for the cause of this issue is RabbitMqUnitOfWork - sometimes message sending actions do get added to UoW, but are never run.

This sounds weird, possibly a bug!


peuramaki commented 10 years ago

Right. I was able to reproduce and "fix" the issue. Turned out that queries to SQL Server in sagas somehow messed up NSB transactions - sometimes transactions were never completed.

Apparently, NServiceBus.RabbitMQ sends messages when the Transaction.TransactionCompleted event is raised. If the transaction is not committed, the messages are not sent. That is probably working as intended, but it would be nice to get an exception.
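
The deferred-send behaviour described above can be illustrated with plain System.Transactions. This is only a minimal sketch, not the transport's actual code: `DeferredSendSketch`, `Send` and the `Sent` list are hypothetical stand-ins for the transport's unit of work and channel. It assumes the send is hooked to `TransactionCompleted` and skipped unless the transaction status is `Committed`.

```csharp
using System;
using System.Collections.Generic;
using System.Transactions;

public static class DeferredSendSketch
{
    // Hypothetical stand-in for messages that actually reached the broker.
    public static readonly List<string> Sent = new List<string>();

    // Hypothetical send: immediate without an ambient transaction,
    // otherwise deferred until the transaction completes.
    public static void Send(string message)
    {
        Transaction tx = Transaction.Current;
        if (tx == null)
        {
            Sent.Add(message);
            return;
        }

        tx.TransactionCompleted += (sender, e) =>
        {
            // Only a committed transaction flushes the message;
            // an aborted one drops it silently.
            if (e.Transaction.TransactionInformation.Status == TransactionStatus.Committed)
            {
                Sent.Add(message);
            }
        };
    }

    public static void Main()
    {
        using (var scope = new TransactionScope())
        {
            Send("reply after commit");
            scope.Complete(); // commit happens when the scope is disposed
        }

        using (var scope = new TransactionScope())
        {
            Send("reply after rollback");
            // Complete() is not called, so the transaction rolls back.
        }

        Console.WriteLine(string.Join(", ", Sent));
    }
}
```

In this sketch a rollback drops the message without any exception, which matches the silent loss reported in this thread. A transaction that never completes at all would behave the same way, because the event handler would simply never run.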

"Fixing" the issue involved changing the dependency lifecycle of my SQL client component to 'single instance'; using 'instance per unit of work' leads to non-deterministic behavior. The SqlConnection is created in the component's constructor and disposed in IDisposable.Dispose().

The issue is reproduced at https://github.com/peuramaki/NServiceBus.RabbitMq.Issue22, please take a look.

To be honest, I don't know what actually causes the problem. Insight appreciated ;-)

andreasohlund commented 10 years ago

Yes, we delay the sending until the transaction completes to make it easier for you to handle the lack of DTC, i.e. it avoids "ghost" messages being published in case of a DB rollback. To avoid that behaviour, you can make sure that there is no TransactionScope wrapping the handler by calling:

https://github.com/Particular/NServiceBus/blob/9a6bc4e513bb8082f46a3652b9a037f3eba83e50/src/NServiceBus.Core/Settings/TransactionSettings.cs#L116

"Fixing" the issue involved changing the dependency lifecycle of my SQL client component to 'single instance'; using 'instance per unit of work' leads to non-deterministic behavior. The SqlConnection is created in the component's constructor and disposed in IDisposable.Dispose().

What container are you using?

peuramaki commented 10 years ago

The delaying thingy makes perfect sense. I'd just want to get exception if something goes wrong.

I need to have transactional message handlers, suppressing transactions is not an option.

I'm using the default Autofac container, but NHibernate for saga persistence.

peuramaki commented 10 years ago

I've continued investigations further with the issue.

It appears that when custom SQL queries to MS SQL Server (using SqlConnection) are made during the lifetime of an NServiceBus message handler, the behavior becomes non-deterministic. Sometimes the ambient transaction is never committed, which results in NServiceBus.RabbitMQ never sending out the messages that should be sent. That is because flushing of NSB's unit of work is bound to the TransactionCompleted event, which sometimes never fires.

I'm able to work around the issue as follows: instead of letting SqlConnection automatically enlist in the ambient transaction, I prevent it by using 'Enlist=false' in the connection string. I don't use System.Transactions at all with the SqlConnection; instead I hook up a SqlTransaction using NServiceBus's IManageUnitsOfWork interface.
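
A rough sketch of what such a workaround could look like is shown below. It is not the code from the reproduction project. The class name, server name and database name are hypothetical, and it assumes the NServiceBus 4.x shape of `IManageUnitsOfWork` (`Begin()` and `End(Exception ex = null)` in the `NServiceBus.UnitOfWork` namespace). It needs a reference to NServiceBus and a reachable SQL Server, so treat it as an outline rather than a drop-in implementation.

```csharp
using System;
using System.Data.SqlClient;
using NServiceBus.UnitOfWork;

// Sketch only: class name and connection string values are hypothetical.
public class SqlTransactionUnitOfWork : IManageUnitsOfWork
{
    // Enlist=false keeps SqlConnection out of the ambient
    // System.Transactions transaction created around the handler.
    const string ConnectionString =
        "Data Source=sql01.localdomain;Initial Catalog=Storage;" +
        "Integrated Security=True;Enlist=false";

    public SqlConnection Connection { get; private set; }
    public SqlTransaction Transaction { get; private set; }

    // Called before the message handlers run.
    public void Begin()
    {
        Connection = new SqlConnection(ConnectionString);
        Connection.Open();
        Transaction = Connection.BeginTransaction();
    }

    // Called after the handlers run; ex is non-null if one of them failed.
    public void End(Exception ex = null)
    {
        if (Connection == null)
        {
            return; // Begin never ran
        }

        try
        {
            if (Transaction != null)
            {
                if (ex == null)
                {
                    Transaction.Commit();
                }
                else
                {
                    Transaction.Rollback();
                }
            }
        }
        finally
        {
            if (Transaction != null)
            {
                Transaction.Dispose();
            }
            Connection.Dispose();
            Transaction = null;
            Connection = null;
        }
    }
}
```

Handlers would have to run their commands on `Connection` with `Transaction` attached, and the class would have to be registered with the container so that NServiceBus picks it up. The trade-off of this approach is that the SQL work and the message send are no longer tied to the same System.Transactions transaction.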

Maybe I should close this bug and open up a new one?

andreasohlund commented 10 years ago

Yes, please reopen another one!

Thanks!!


peuramaki commented 10 years ago

Closing this one, see https://github.com/Particular/NServiceBus.RabbitMQ/issues/26