department-of-veterans-affairs / abd-vro

To get Veterans benefits in minutes, VRO software uses health evidence data to help fast track disability claims.

Stopping rabbitmq container after long idle period kills 3 other containers #958

Closed sethdarragile6 closed 10 months ago

sethdarragile6 commented 1 year ago

User Story

As a VRO developer, I want my other containers to stay alive when rabbitmq goes down so that I can have a resilient system.

Description

When the system has been up but dormant for a few hours, killing the rabbitmq-service-1 container instantly kills the svc-pdf-generator-1, svc-assessor-dc7101-1 and svc-assessor-dc6602-1 containers as well. Restarting rabbitmq or app-1 does not bring them back up; each one must be started manually. Operation appears normal thereafter.

The log output from each of these containers at this point is the same:

2023-01-18 16:10:59 2023-01-19 00:10:59 INFO Aborting transport connection: state=1; <socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.32.7', 51368)>
2023-01-18 16:10:59 2023-01-19 00:10:59 INFO _AsyncTransportBase._initate_abort(): Initiating abrupt asynchronous transport shutdown: state=1; error=None; <socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.32.7', 51368)>
2023-01-18 16:10:59 2023-01-19 00:10:59 INFO Deactivating transport: state=1; <socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.32.7', 51368)>
2023-01-18 16:10:59 2023-01-19 00:10:59 INFO AMQP stack terminated, failed to connect, or aborted: opened=True, error-arg=None; pending-error=ConnectionClosedByBroker: (320) "CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'"
2023-01-18 16:10:59 2023-01-19 00:10:59 INFO Stack terminated due to ConnectionClosedByBroker: (320) "CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'"
2023-01-18 16:10:59 2023-01-19 00:10:59 INFO Closing transport socket and unlinking: state=3; <socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.32.7', 51368)>
2023-01-18 16:10:59 2023-01-19 00:10:59 ERROR Unexpected connection close detected: ConnectionClosedByBroker: (320) "CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'"
2023-01-18 16:10:59 2023-01-19 00:10:59 ERROR BlockingConnection.close(200, 'Normal shutdown') called on closed connection.
2023-01-18 16:10:59 Exception ignored in: <function RabbitMQConsumer.__del__ at 0x7f853665ef80>
2023-01-18 16:10:59 Traceback (most recent call last):
2023-01-18 16:10:59   File "/home/docker/main_consumer.py", line 32, in __del__
2023-01-18 16:10:59     self.connection.close()
2023-01-18 16:10:59   File "/usr/local/lib/python3.10/site-packages/pika/adapters/blocking_connection.py", line 800, in close
2023-01-18 16:10:59     raise exceptions.ConnectionWrongStateError(msg)
2023-01-18 16:10:59 pika.exceptions.ConnectionWrongStateError: BlockingConnection.close(200, 'Normal shutdown') called on closed connection.
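
The traceback above ends with the consumer's destructor calling self.connection.close() on a connection the broker had already closed. One way to avoid that particular error might be to check the connection state before closing. The sketch below is illustrative only: FakeConnection and safe_close are hypothetical names that do not appear in main_consumer.py, and the stand-in class only imitates the is_open attribute and close() method that pika's BlockingConnection exposes. It is not a confirmed fix for the containers exiting.

```python
class FakeConnection:
    """Hypothetical stand-in for a pika BlockingConnection (illustration only)."""

    def __init__(self, is_open=True):
        self.is_open = is_open
        self.close_calls = 0

    def close(self):
        if not self.is_open:
            # Imitates the failure in the traceback: closing twice raises.
            raise RuntimeError("close() called on closed connection")
        self.close_calls += 1
        self.is_open = False


def safe_close(connection):
    """Close the connection only if it is still open.

    Returns True if this call closed it, False otherwise.
    """
    if connection is None or not getattr(connection, "is_open", False):
        return False
    try:
        connection.close()
    except Exception:
        # The broker may drop the link between the check and the call.
        return False
    return True


# Usage: a connection already closed by the broker is left alone.
dropped = FakeConnection(is_open=False)
print(safe_close(dropped))
```

A destructor that delegates to a helper like this would not raise on an already-closed connection. Whether that alone keeps the containers alive is a separate question, since the log shows the broker closing the connection first.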

Steps to Reproduce

  1. With all containers up and running in a healthy state, walk away from your computer for a few hours. Go on, you've earned it.
  2. Okay, that's enough. Now get back to work and kill the rabbitmq-service-1 container from Docker Desktop.
  3. Observe that svc-pdf-generator-1, svc-assessor-dc7101-1 and svc-assessor-dc6602-1 all go down with it instantly.

Acceptance Criteria

  1. Bringing down rabbitmq-service-1 after a few dormant hours should not bring down any other containers
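
One possible way to meet this criterion is for each consumer to retry its connection instead of exiting when the broker goes away. The following is a minimal sketch under assumptions: run_with_reconnect and its parameters are hypothetical and not part of the VRO code base, and connect_and_consume stands for whatever function opens the connection and starts consuming. With pika, retry_on would probably need to be pika.exceptions.AMQPConnectionError rather than the built-in ConnectionError used as the default here.

```python
import time


def run_with_reconnect(connect_and_consume, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, retry_on=(ConnectionError,),
                       sleep=time.sleep):
    """Call connect_and_consume until it returns, backing off between failures.

    Returns the number of attempts made. Re-raises the last error once
    max_attempts has been reached.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            connect_and_consume()
            return attempt
        except retry_on:
            if attempt >= max_attempts:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay)


# Usage: a consumer that fails twice (broker down) and then connects.
state = {"calls": 0}


def flaky_consumer():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("broker unavailable")


print(run_with_reconnect(flaky_consumer, sleep=lambda seconds: None))
```

The sleep function is injectable so the loop can be exercised without real delays. A production consumer would likely also log each failed attempt.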

Related to #527

yoomlam commented 1 year ago

Is this reproducible?