rabbitmq / rabbitmq-web-stomp

Provides support for STOMP over WebSockets

Re-subscription to an auto-delete queue can fail #94

Closed roman-holovin closed 5 years ago

roman-holovin commented 5 years ago

Use case: Subscribe to a queue endpoint /queue/<name> with the arguments durable: false and auto-delete: true. Unsubscribe from it later, then subscribe again with the same arguments.

Expected result:
1) The queue is created on the first subscription, since it does not exist yet.
2) The queue is deleted on unsubscription, since it has auto-delete: true.
3) The queue is created again on re-subscription, since it was deleted and no longer exists.

Actual result:
1) The queue is created on the first subscription, since it does not exist yet.
2) The queue is deleted on unsubscription, since it has auto-delete: true.
3) RabbitMQ returns an error saying that the queue was not found.

WebSocket packet dump reproducing the issue:

CONNECT
login:<login>
passcode:<passcode>
accept-version:1.2,1.1,1.0
heart-beat:10000,10000

CONNECTED
server:RabbitMQ/3.7.7
session:session-EGnxTZmrJCsYQq8fMDcH5Q
heart-beat:10000,10000
version:1.2

SUBSCRIBE
durable:false
auto-delete:true
id:sub-0
destination:/queue/<name>

MESSAGE
subscription:sub-0
destination:/queue/<name>
message-id:T_sub-0@@session-EGnxTZmrJCsYQq8fMDcH5Q@@1
redelivered:false
content-length:<content-length>

<payload>

UNSUBSCRIBE
id:sub-0

SUBSCRIBE
durable:false
auto-delete:true
id:sub-1
destination:/queue/<name>

ERROR
message:not_found
content-type:text/plain
version:1.0,1.1,1.2
content-length:71

NOT_FOUND - no queue '<name>' in vhost '<vhost>'
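
For anyone who wants to script the sequence above, here is a minimal Python sketch that serializes the three client frames from the dump. The helper names `build_frame` and `resubscribe_sequence` are hypothetical, header value escaping is omitted, and sending the frames over a WebSocket is left to whichever client library is in use.

```python
def build_frame(command, headers, body=""):
    """Serialize one STOMP frame: command line, header lines, blank line, body, NUL."""
    lines = [command]
    lines += [f"{key}:{value}" for key, value in headers.items()]
    return "\n".join(lines) + "\n\n" + body + "\x00"


def resubscribe_sequence(queue_name):
    """Return the SUBSCRIBE / UNSUBSCRIBE / SUBSCRIBE frames seen in the dump."""
    destination = f"/queue/{queue_name}"
    # Same queue arguments on both subscriptions, as in the report.
    args = {"durable": "false", "auto-delete": "true"}
    return [
        build_frame("SUBSCRIBE", {**args, "id": "sub-0", "destination": destination}),
        build_frame("UNSUBSCRIBE", {"id": "sub-0"}),
        build_frame("SUBSCRIBE", {**args, "id": "sub-1", "destination": destination}),
    ]
```

Sending the second SUBSCRIBE immediately after the UNSUBSCRIBE is what the dump shows; the sketch only builds the frames and does not model timing.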

michaelklishin commented 5 years ago

Deletion of auto-delete and exclusive queues does not happen instantly (this is more obvious with multiple queues per connection), so client operations can run into a natural race condition. This is most commonly seen with automatic connection recovery. We recently introduced recovery operation retries in the Java client since there is no easy solution for this.
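
The retry idea can be sketched on the client side roughly as follows. This is an illustration in Python, not the Java client's implementation: `QueueNotFound` and `retry_on_not_found` are hypothetical names, and `operation` stands for whatever the application does to subscribe. Since STOMP 1.2 has the server close the connection after an ERROR frame, `operation` would most likely have to reconnect as well.

```python
import time


class QueueNotFound(Exception):
    """Hypothetical error raised when the broker answers with not_found."""


def retry_on_not_found(operation, attempts=5, initial_delay=0.1, sleep=time.sleep):
    """Call operation(); on QueueNotFound, wait and retry with a doubled delay."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except QueueNotFound:
            if attempt == attempts:
                raise  # give up after the last attempt
            sleep(delay)
            delay *= 2
```

The delays and attempt count are arbitrary defaults. A retry narrows the race window but does not remove it.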

However, 3.7.8 unrolled a couple of optimizations that made this a lot more evident (https://github.com/rabbitmq/rabbitmq-server/pull/1691 is one example).

Temporary queues have a separate STOMP endpoint. We highly recommend using it, since doing so sidesteps the problem entirely.

There are no timestamps in the traffic dump. Please post a script that reproduces the issue to the mailing list, and we will see whether it is the same fundamental problem mentioned above.
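
A reproduction script could record timestamps with something like the sketch below. `make_tracer` is a hypothetical helper, and the `clock` and `out` parameters are there only so the output can be redirected or tested. It logs the direction, the STOMP command, and the time for each frame the script sends or receives.

```python
import time


def make_tracer(clock=time.time, out=print):
    """Return a function that logs a timestamp, a direction, and a STOMP command."""
    def trace(direction, frame):
        # The first line of a STOMP frame is its command (SUBSCRIBE, ERROR, ...).
        command = frame.split("\n", 1)[0]
        out(f"{clock():.3f} {direction} {command}")
    return trace
```

Calling `trace(">>>", frame)` before each send and `trace("<<<", frame)` on each receive would show how much time passes between the UNSUBSCRIBE and the second SUBSCRIBE.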

michaelklishin commented 5 years ago

It's worth explaining that we don't consider this a "non-issue" and will try to reproduce it either way (first with STOMP without the WebSocket layer, though). Our team uses GitHub issues to track actionable, well-understood problems, and at this stage this is mailing list material, even though @roman-holovin provided a reasonable amount of information to start the investigation.