spring-projects / spring-framework

Spring Framework
https://spring.io/projects/spring-framework
Apache License 2.0
56.31k stars 38k forks source link

Messages arriving out-of-order despite setPreservePublishOrder(true) #27173

Closed osharaki closed 11 months ago

osharaki commented 3 years ago

setPreservePublishOrder(true) doesn't seem to guarantee that the client receives messages in the order of publication. I still need to account for server responses arriving out-of-order on the client side by checking for timestamps.

I noticed this issue while running integration tests for my STOMP over WebSocket and SockJS server where I would have up to 70 WebSocketStompClient SockJS clients connect and issue requests in quick succession. Although with a low number of clients setPreservePublishOrder(true) does seem to eliminate out-of-order messages, as the number of clients increases, the order is no longer guaranteed and client-side ordering via a response timestamp is needed.

The documentation makes no mention of possible shortcomings when the server is overloaded, so I'm assuming this behavior should not be happening.

I'm using version 2.4.5.

p4654545 commented 1 year ago

Hitting this exact problem with a simple controller which just sends values 0,1,2,...

First I added receipts according to https://github.com/spring-projects/spring-framework/issues/21848#issuecomment-453477550 That resulted in lost messages due to a race condition in OrderedMessageChannelDecorator.removeMessage(). The receipts are not sent through this decorator, so the ConcurrentWebSocketSessionDecorator.preSendCallback is called concurrently and as a result OrderedMessageChannelDecorator.removeMessage(). The race is in the the peek()/compare/remove sequence. Two invocations can see the right message and both will call messages.remove(). One of them removes the intended message, the second will unintentionally remove the next message. This is easily resolved by reducing the OrderedMessageChannelDecorator.removeMessage() implementation to "return this.messages.remove(message);"

Then the out of order messages remain. I have a single client and the magic seems to be to reduce the number of broker-channel threads to 2. I implemented WebSocketMessageBrokerConfigurer.configureMessageBroker and added the line

registry.configureBrokerChannel().taskExecutor().corePoolSize(2).maxPoolSize(2)

It will hit the problem 100% of the time. reducing the number of threads to 1 (core/max) gets rid of the problem.

Since I am unfamiliar with the code I have not been able to pinpoint the problem yet, but I suspect SendTask's are sent to the threadpool and the order of execution is not determined.

p4654545 commented 1 year ago

It seems the UserDestinationMessageHandler.handleMessage calls ExecutorSubscribableChannel.sendInternal, which just passes a SendTask to the executor (ThreadPoolTaskExecutor). The order of execution of the tasks passed to the threadpool is undefined. It also explains why limiting the number of threads in the brokerChannel works around the problem.

Summary: message order is not preserved with user destinations.

This callstack is from the broker-channel.

sendInternal:103, ExecutorSubscribableChannel (org.springframework.messaging.support) send:139, AbstractMessageChannel (org.springframework.messaging.support) send:125, AbstractMessageChannel (org.springframework.messaging.support) sendInternal:187, SimpMessagingTemplate (org.springframework.messaging.simp) doSend:162, SimpMessagingTemplate (org.springframework.messaging.simp) send:109, AbstractMessageSendingTemplate (org.springframework.messaging.core) handleMessage:217, UserDestinationMessageHandler (org.springframework.messaging.simp.user) run:144, ExecutorSubscribableChannel$SendTask (org.springframework.messaging.support)

p4654545 commented 1 year ago

Like @osharaki I ended up implementing message ordering on the application level. Basically a striped executor installed on the clientOutboundChannel. I use the (sessionId, destination) as the stripe-key and a native header with the message sequence number.

I still think the race condition is a bug and needs fixing.

The order preservation is a clear issue with user specific queues.

azhuchkov commented 1 year ago

Faced with the same issue. @p4654545 sorry, but from my understanding clientOutboudChannel is already wrapped with OrderedMessageChannelDecorator when PreservePublishOrder flag is set. The problem I see is that brokerChannel may change order of messages before sending them to the decorated clientOutboudChannel, since it submits SendTasks to its thread pool without any synchonization involved. So, if you set size of the thread pool to, say, 10 threads, it's quite easy to get messages reordering.

@rstoyanchev could you please shed some light on this?

p4654545 commented 1 year ago

OrderedMessageChannelDecorator has a race condition when there are messages sent on the outboundchannel without using the decorator. I implemented a stripedexecutor using sessionid and a stripeID as the key. Works wonderfully. We are in the process of moving to HTTP2 and Server Sent Events now, as that fits our use case better.

nklymok commented 1 year ago

Greetings! Is there any update on this? Hitting the same problem, unfortunately. Seems like implementing a striped executor is the only way to go if we want to handle it completely server-side. The other solution that we are most probably implementing is indexing the events, so the client can figure out the correct order themselves.

rstoyanchev commented 12 months ago

SimpleBrokerMessageHandler and StompBrokerRelayMessageHandler are generally the only ones that write to the clientOutboundChannel and they wrap it with OrderedMessageChannelDecorator. Others like annotated controllers and UserDestinationMessageHandler send messages through the broker, which then works out the subscribers. There is a diagram of this in the reference docs.

That said, @azhuchkov I think you're right that messages sent from UserDestinationMessageHandler to the broker are not ordered. We are now talking about preserving the order of messages coming from the client, in this case passing through the clientInboundChannel first, then handled by UserDestinationMessageHandler and sent to the brokerChannel. The goal of the original request was only to preserve the order of messages from the broker handler to clients.

We have #21798 scheduled to apply the same decorator to the clientInboundChannel. I suspect that will help as it will ensure that UserDestinationMessageHandler handles messages sequentially by session. However, from there messages can still get out of order before they get to the broker as the pass through the brokerChannel. So as part of that issue, we'll need to consider ordering for the brokerChannel too.

I will keep this open for now until #21798 is resolved, and we have confirmation it works.

rstoyanchev commented 11 months ago

After taking a closer look, we'll have a new StompEndpointRegistry#preserveReceiveOrder property that enables sequential handling on the clientInboundChannel as part of #21798, and the same property will also apply to the handling of messages with user destinations as part of #31395.

rstoyanchev commented 11 months ago

There is now 6.1.0-SNAPSHOT with the fixes for #21798 and #31395, and 6.1.0-RC1 is due tomorrow.

Messages handled in StompSubProtocolHandler and sent to clientInboundChannel, and likewise messages handled in UserDestinationMessageHandler and sent to the brokerChannel are now ordered. There is a new property to enable ordering for inbound messages, and it's covered in the docs.

If anyone is able to check that it works in their environment, that would be very helpful.