zhmcclient / python-zhmcclient

A pure Python client library for the IBM Z HMC Web Services API
Apache License 2.0
39 stars 26 forks source link

Notification Receiver Close Command Not Working #572

Closed james-crowley closed 4 years ago

james-crowley commented 5 years ago

Actual behavior

When calling close() on a Notification Receiver Object, program just hangs. If I enable logging I can see the following.

2019-05-08 13:15:10.315 Thread-1:testing:INFO:Successfully create Receiver!
2019-05-08 13:15:10.316 Thread-1:_logging:DEBUG:==> NotificationReceiver.notifications(), args: (<zhmcclient._notification.NotificationReceiver object at 0x10d12dcc0>,), kwargs: {}
2019-05-08 13:15:10.316 Thread-1:_logging:DEBUG:<== NotificationReceiver.notifications(), result: <generator object notifications at 0x10d14e570>
2019-05-08 13:15:10.341 StompReceiverThread-4:transport:DEBUG:listener <zhmcclient._notification._NotificationListener object at 0x10d1350b8> has no method on_before_message
2019-05-08 13:15:10.341 StompReceiverThread-4:transport:DEBUG:Received frame: 'MESSAGE', headers={'expires': '0', 'global-sequence-nr': '1130', 'session-sequence-nr': '1', 'destination': '/topic/sysprog.123osm.e12121dac50331178e79c46bd3531b90', 'subscription': 'zhmcclient.4514307264', 'priority': '4', 'notification-type': 'os-message', 'name': 'LP41', 'message-id': 'ID:PETHMC1-42915-1555356921407-5:1:1:37:1', 'object-uri': '/api/logical-partitions/e12121da-c503-3117-8e79-c46bd3531b90', 'object-id': 'e12121da-c503-3117-8e79-c46bd3531b90', 'class': 'logical-partition', 'timestamp': '1557335754097'}, body='{"os-messages":[{"message-text":"Preparing system.\\n","message-id":"4067657","sequence-number":0,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"Starting system.\\n","message-id":"4067659","sequence-number":1,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"First boot loader version 1.19.2 start (Wed May  8 17:15:12 UTC 2019 D36C).\\n","message-id":"4067660","sequence-number":505,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"Load Installer per override.\\nBooting \'BCInstaller\' image ...\\n","message-id":"4067671","sequence-number":507,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1}]}'
2019-05-08 13:16:27.031 StompReceiverThread-4:transport:DEBUG:listener <zhmcclient._notification._NotificationListener object at 0x10d1350b8> has no method on_before_message
2019-05-08 13:16:27.031 StompReceiverThread-4:transport:DEBUG:Received frame: 'MESSAGE', headers={'expires': '0', 'global-sequence-nr': '1134', 'session-sequence-nr': '2', 'destination': '/topic/sysprog.123osm.e12121dac50331178e79c46bd3531b90', 'subscription': 'zhmcclient.4514307264', 'priority': '4', 'notification-type': 'os-message', 'name': 'LP41', 'message-id': 'ID:PETHMC1-42915-1555356921407-5:1:1:37:2', 'object-uri': '/api/logical-partitions/e12121da-c503-3117-8e79-c46bd3531b90', 'object-id': 'e12121da-c503-3117-8e79-c46bd3531b90', 'class': 'logical-partition', 'timestamp': '1557335830788'}, body='{"os-messages":[{"message-text":"Preparing system.\\n\\n","message-id":"0","sequence-number":508,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1}]}'
2019-05-08 13:16:27.256 StompReceiverThread-4:transport:DEBUG:listener <zhmcclient._notification._NotificationListener object at 0x10d1350b8> has no method on_before_message
2019-05-08 13:16:27.256 StompReceiverThread-4:transport:DEBUG:Received frame: 'MESSAGE', headers={'expires': '0', 'global-sequence-nr': '1138', 'session-sequence-nr': '3', 'destination': '/topic/sysprog.123osm.e12121dac50331178e79c46bd3531b90', 'subscription': 'zhmcclient.4514307264', 'priority': '4', 'notification-type': 'os-message', 'name': 'LP41', 'message-id': 'ID:PETHMC1-42915-1555356921407-5:1:1:37:3', 'object-uri': '/api/logical-partitions/e12121da-c503-3117-8e79-c46bd3531b90', 'object-id': 'e12121da-c503-3117-8e79-c46bd3531b90', 'class': 'logical-partition', 'timestamp': '1557335831012'}, body='{"os-messages":[{"message-text":"Starting system.\\n\\n","message-id":"0","sequence-number":509,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1}]}'
2019-05-08 13:16:30.157 StompReceiverThread-4:transport:DEBUG:listener <zhmcclient._notification._NotificationListener object at 0x10d1350b8> has no method on_before_message
2019-05-08 13:16:30.157 StompReceiverThread-4:transport:DEBUG:Received frame: 'MESSAGE', headers={'expires': '0', 'global-sequence-nr': '1142', 'session-sequence-nr': '4', 'destination': '/topic/sysprog.123osm.e12121dac50331178e79c46bd3531b90', 'subscription': 'zhmcclient.4514307264', 'priority': '4', 'notification-type': 'os-message', 'name': 'LP41', 'message-id': 'ID:PETHMC1-42915-1555356921407-5:1:1:37:4', 'object-uri': '/api/logical-partitions/e12121da-c503-3117-8e79-c46bd3531b90', 'object-id': 'e12121da-c503-3117-8e79-c46bd3531b90', 'class': 'logical-partition', 'timestamp': '1557335833914'}, body='{"os-messages":[{"message-text":"System version 8.\\n","message-id":"0","sequence-number":510,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1}]}'
2019-05-08 13:16:30.382 StompReceiverThread-4:transport:DEBUG:listener <zhmcclient._notification._NotificationListener object at 0x10d1350b8> has no method on_before_message
2019-05-08 13:16:30.382 StompReceiverThread-4:transport:DEBUG:Received frame: 'MESSAGE', headers={'expires': '0', 'global-sequence-nr': '1146', 'session-sequence-nr': '5', 'destination': '/topic/sysprog.123osm.e12121dac50331178e79c46bd3531b90', 'subscription': 'zhmcclient.4514307264', 'priority': '4', 'notification-type': 'os-message', 'name': 'LP41', 'message-id': 'ID:PETHMC1-42915-1555356921407-5:1:1:37:5', 'object-uri': '/api/logical-partitions/e12121da-c503-3117-8e79-c46bd3531b90', 'object-id': 'e12121da-c503-3117-8e79-c46bd3531b90', 'class': 'logical-partition', 'timestamp': '1557335834138'}, body='{"os-messages":[{"message-text":"Watchdog enabled.\\nRunning \'BCInstaller\' level \'D36C.D36C_0002\'.\\n","message-id":"0","sequence-number":511,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1}]}'
2019-05-08 13:16:36.753 StompReceiverThread-4:transport:DEBUG:listener <zhmcclient._notification._NotificationListener object at 0x10d1350b8> has no method on_before_message
2019-05-08 13:16:36.754 Thread-1:testing:INFO:Found a messages matching the phrase, The server is listening on: 10.20.81.73
2019-05-08 13:16:36.754 Thread-1:_logging:DEBUG:==> NotificationReceiver.close(), args: (<zhmcclient._notification.NotificationReceiver object at 0x10d12dcc0>,), kwargs: {}
2019-05-08 13:16:36.754 Thread-1:transport:DEBUG:Sending frame: [b'DISCONNECT', b'\n', b'receipt:2b7e3e68-a179-4416-ae40-541c1edc5f7b\n', b'\n', b'\x00']
2019-05-08 13:16:36.757 StompReceiverThread-4:transport:DEBUG:Received frame: 'RECEIPT', headers={'receipt-id': '2b7e3e68-a179-4416-ae40-541c1edc5f7b'}, body=''

It looks like it closed the connection but nothing happens. My program just sits there and can't continue on.

Expected behavior

When calling close() on a Notification Receiver Object, I expect the behavior to close the connection and hand back focus of the program to the main thread.

Execution environment

andy-maier commented 5 years ago

Hi James. I'll look into this.

andy-maier commented 5 years ago

Hi James. I tried to reproduce this by running the show_os_messages.py example program provided with the zhmcclient package, and was unable to reproduce it (on a MacBook with Python 2.7, using the master branch of zhmcclient).

Here is how I configured the parameters for the example program in hmccreds.yml (you can create that by copying it from the provided example_hmccreds.yml):

   show_os_messages:
      hmc: "172.16.64.15"
      cpcname: CPCA
      partname: MGMT1
      loglevel: debug
      logmodule: ""
      timestats: no

I modified the show_os_messages.py example program a little by moving the receiver.close() call from the except clause to behind the try-except block, and I added flush() calls after the prints because of the multi-threading, and added the thread name to the log format, but none of that should affect whether or not receiver.close() hangs. I will create a PR with these changes to the example program, because they generally make sense.

Here is the result of running the example program (starting with the last OS message it displays, so the closing is shown):

[943365.153224] 50-motd-news[38550]:      snap install microk8s --classic
^CKeyboard interrupt - leaving receiver loop
Closing receiver...
2019-05-12 11:33:44,205 - MainThread - zhmcclient.api - DEBUG - Called: NotificationReceiver.close(), args: (<zhmcclient._notification.NotificationReceiver object at 0x10854e710>,), kwargs: {}
2019-05-12 11:33:44,205 - MainThread - stomp.py - DEBUG - Sending frame: ['DISCONNECT', '\n', 'receipt:ed61c0ed-8c08-47c8-b1b9-93850e64124b\n', '\n', '\x00']
2019-05-12 11:33:44,317 - StompReceiverThread-1 - stomp.py - DEBUG - Received frame: 'RECEIPT', headers={'receipt-id': 'ed61c0ed-8c08-47c8-b1b9-93850e64124b'}, body=''
2019-05-12 11:33:44,321 - StompReceiverThread-1 - stomp.py - DEBUG - listener <zhmcclient._notification._NotificationListener object at 0x1085db950> has no method on_receipt
2019-05-12 11:33:44,321 - StompReceiverThread-1 - stomp.py - INFO - Receiver loop ended
2019-05-12 11:33:44,321 - StompReceiverThread-1 - stomp.py - DEBUG - listener <zhmcclient._notification._NotificationListener object at 0x1085db950> has no method on_receiver_loop_completed
2019-05-12 11:33:44,321 - MainThread - zhmcclient.api - DEBUG - Return: NotificationReceiver.close(), result: None
Logging off...
2019-05-12 11:33:44,322 - MainThread - zhmcclient.api - DEBUG - Called: Session.logoff(), args: (Session at 0x10854e690 ( _host='172.16.64.15', _userid='zbcInstall', _password='...', _get_password=None, _retry_timeout_config=<zhmcclient._session.RetryTimeoutConfig object at 0x10854e610>, _base_url='https://172.16.64.15:6794', _headers={'X-API-Session': u'209lbrloj0800t22d7h7f4998wc1l9qm1xfter51sudlrpaufq', 'Content-type': 'application/json', 'Accept': '*/*'}, _session_id=u'209lbrloj0800t22d7h7f4998wc1l9qm1xfter51sudlrpaufq', _session=<requests.sessions.Session object at 0x10854eb90> ), Fal, kwargs: {}
2019-05-12 11:33:44,322 - MainThread - zhmcclient.hmc - DEBUG - Request: DELETE https://172.16.64.15:6794/api/sessions/this-session, headers: {'X-API-Session': '********', 'Content-type': 'application/json', 'Accept': '*/*'}, content(max.1000): None
2019-05-12 11:33:44,443 - MainThread - urllib3.connectionpool - DEBUG - https://172.16.64.15:6794 "DELETE /api/sessions/this-session HTTP/1.1" 204 0
2019-05-12 11:33:44,444 - MainThread - zhmcclient.hmc - DEBUG - Respons: DELETE https://172.16.64.15:6794/api/sessions/this-session, status: 204, headers: {'x-request-id': 'Sxf98eefd8-556e-11e9-95c9-00106f23a0f1.36b Rx4', 'X-XSS-Protection': '1; mode=block', 'X-Content-Type-Options': 'nosniff', 'Content-Security-Policy': "img-src * data:; frame-ancestors 'self'", 'Strict-Transport-Security': 'max-age=0', 'Server': 'Hardware management console API web server / 2.0', 'Cache-control': 'no-cache', 'Date': 'Sun, 12 May 2019 09:33:44 GMT', 'X-Frame-Options': 'SAMEORIGIN'}, content(max.1000): ''
2019-05-12 11:33:44,445 - MainThread - zhmcclient.api - DEBUG - Return: Session.logoff(), result: None
Done.

The main difference to your log is that it continues after the StompReceiverThread-1 thread logs the received frame 'RECEIPT'.

Can you please do two things:

andy-maier commented 5 years ago

See also this issue which was fixed in stomp.py 4.1.19: https://github.com/jasonrbriggs/stomp.py/issues/160. So if you have a stomp version lower than 4.1.19, please upgrade to 4.1.19 to verify whether the issue is addressed for you. In that case, we would increase the minimum version of stomp.py required by zhmcclient accordingly.

andy-maier commented 5 years ago

PR #578 contains the changes to the show_os_messages.py example program I mentioned above, and was now merged into master.

james-crowley commented 5 years ago

@andy-maier I am using Python 3. That might make a difference in your testing. My pip list is as follows:

Package    Version 
---------- --------
certifi    2019.3.9
chardet    3.0.4   
decorator  4.4.0   
docopt     0.6.2   
idna       2.8     
netifaces  0.10.9  
pbr        5.2.0   
pip        10.0.1  
pytz       2019.1  
PyYAML     3.13    
requests   2.21.0  
setuptools 39.1.0  
six        1.12.0  
stomp.py   4.1.22  
urllib3    1.24.3  
zhmcclient 0.23.0  

So it looks like I am using an updated version, 4.1.22, of stomp.py and I am using a updated version of zhmcclient, 0.23.0, then I previously listed.

I'll give show_os_messages.py a try and report back.

james-crowley commented 5 years ago

This is odd. I was able to close receiver with you code. I had to hit the keyboard interrupt and it seemed to close fine.

I decided to take my code, and add it to your test program, so I could make sure there was not another program with my program, throwing off this portion.

To add my code, I commented out your section of code, line 119 - 141, and added my own.

def listen_for_message(receiver, phrase):
    """
    Listens for a certain message on the OS system Messages
    :param receiver Receiver Object
    :param phrase: The phrase to listen for
    """
    try:
        # Listens for new message from the OS System Message on a LPAR and Loops over them
        for headers, message in receiver.notifications():
            os_msg_list = message['os-messages']
            for os_msg in os_msg_list:
                # Will grab a block of messages. I want to split them up line by line
                msg_txt = os_msg['message-text'].strip('\n')

                # Splits them up in a list of lines instead of a block of messages
                lines = msg_txt.split("\n")

                # Loops through each line
                for line in lines:
                    # If the match is found stop and return
                    if line == phrase:
                        logger.info("Found a messages matching the phrase, {PHRASE}".format(PHRASE=phrase))
                        # Closes the receiver, Disabled right now due to bug
                        # receiver.close()
                        return
    except:
        # Not sure what to catch, since class is still in beta
        logger.error("Notification Receiver is a experimental class, an error was thrown")
        raise

It is a function that searches for a certain message in the os system messages. Once found, it should return. Ideally, I want to close the receiver before I return. I realize that might be the issue, so I removed the receiver.close() to be after the function returns.

Here is what the code now looks like,

receiver = zhmcclient.NotificationReceiver(topic, hmc, userid, password)
print("Showing OS messages (including refresh messages) ...")
sys.stdout.flush()

logger.info("Entering the receiver....")
listen_for_message(receiver=receiver, phrase="TEST")
logger.info("Done with the receiver!")
# try:
#     for headers, message in receiver.notifications():
#         # print("# HMC notification #%s:" % headers['session-sequence-nr'])
#         # sys.stdout.flush()
#         os_msg_list = message['os-messages']
#         for os_msg in os_msg_list:
#             if PRINT_METADATA:
#                 msg_id = os_msg['message-id']
#                 held = os_msg['is-held']
#                 priority = os_msg['is-priority']
#                 prompt = os_msg.get('prompt-text', None)
#                 print("# OS message %s (held: %s, priority: %s, prompt: %r):" %
#                       (msg_id, held, priority, prompt))
#             msg_txt = os_msg['message-text'].strip('\n')
#             print(msg_txt)
#             sys.stdout.flush()
# except KeyboardInterrupt:
#     print("Keyboard interrupt - leaving receiver loop")
#     sys.stdout.flush()
# finally:
#     print("Closing receiver...")
#     sys.stdout.flush()
#     receiver.close()

logger.info("closing....")
receiver.close()
print("Logging off...")
sys.stdout.flush()
session.logoff()

if timestats:
    print(session.time_stats_keeper)

print("Done.")

As you can see I should see Entering the receiver.... right before I call my function. And should see Done with the receiver! after it returns. The finally I should see, closing...., in the log right before it should close the receiver.

But in the logs you can see the closing......, then the call to close the receiver, and then it hangs.

2019-05-13 09:48:49,881 - MainThread - zhmcclient.hmc - DEBUG - HMC request: POST https://IP:6794/api/logical-partitions/e12121da-c503-3117-8e79-c46bd3531b90/operations/open-os-message-channel, headers: {'Content-type': 'application/json', 'Accept': '*/*', 'X-API-Session': 'abw2u6it3yo8px4idvu0hjhq5bx2qoe54wbnol8aozjlc8x0o'}, content(max.1000): '{"include-refresh-messages": true}'
2019-05-13 09:48:49,913 - MainThread - urllib3.connectionpool - DEBUG - https://IP:6794 "POST /api/logical-partitions/e12121da-c503-3117-8e79-c46bd3531b90/operations/open-os-message-channel HTTP/1.1" 200 64
2019-05-13 09:48:49,914 - MainThread - zhmcclient.hmc - DEBUG - HMC response: POST https://IP:6794/api/logical-partitions/e12121da-c503-3117-8e79-c46bd3531b90/operations/open-os-message-channel, status: 200, headers: {'Strict-Transport-Security': 'max-age=0', 'X-Frame-Options': 'SAMEORIGIN', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '1; mode=block', 'Content-Security-Policy': "img-src * data:; frame-ancestors 'self'", 'x-request-id': 'Sx5279bbb4-5fb5-11e9-8589-00106f240bf4.a3 Rx3', 'Cache-control': 'no-cache', 'x-wsa-provider-duration-ms': '18', 'Content-Type': 'application/json;charset=UTF-8', 'Content-Length': '64', 'Date': 'Mon, 13 May 2019 13:49:37 GMT', 'Server': 'Hardware management console API web server / 2.0'}, content(max.1000): b'{"topic-name":"sysprog.163osm.e12121dac50331178e79c46bd3531b90"}'
2019-05-13 09:48:49,914 - MainThread - zhmcclient.api - DEBUG - <== Lpar.open_os_message_channel(), result: 'sysprog.163osm.e12121dac50331178e79c46bd3531b90'
2019-05-13 09:48:49,914 - MainThread - stomp.py - INFO - Attempting connection to host IP, port 61612
2019-05-13 09:48:49,948 - MainThread - stomp.py - INFO - Established connection to host IP, port 61612
2019-05-13 09:48:49,948 - Thread-1 - stomp.py - INFO - Starting receiver loop
2019-05-13 09:48:49,949 - MainThread - stomp.py - DEBUG - listener <zhmcclient._notification._NotificationListener object at 0x10cfd6da0> has no method on_connecting
2019-05-13 09:48:49,949 - MainThread - stomp.py - DEBUG - Sending frame: [b'STOMP', b'\n', b'accept-version:1.1\n', b'login:TEST\n', b'passcode:TEST\n', b'\n', b'\x00']
2019-05-13 09:48:49,988 - StompReceiverThread-1 - stomp.py - DEBUG - Received frame: 'CONNECTED', headers={'server': 'ActiveMQ/${project.version}', 'heart-beat': '0,0', 'session': 'ID:PETHMC1-34035-1555356910980-5:106', 'version': '1.1'}, body=''
2019-05-13 09:48:49,989 - StompReceiverThread-1 - stomp.py - DEBUG - listener <zhmcclient._notification._NotificationListener object at 0x10cfd6da0> has no method on_connected
2019-05-13 09:48:49,989 - MainThread - stomp.py - DEBUG - Sending frame: [b'SUBSCRIBE', b'\n', b'ack:auto\n', b'destination:/topic/sysprog.163osm.e12121dac50331178e79c46bd3531b90\n', b'id:zhmcclient.4513457376\n', b'\n', b'\x00']
2019-05-13 09:48:49,989 - MainThread - root - INFO - Entering the receiver....
2019-05-13 09:48:49,990 - MainThread - zhmcclient.api - DEBUG - ==> NotificationReceiver.notifications(), args: (<zhmcclient._notification.NotificationReceiver object at 0x10d05e4e0>,), kwargs: {}
2019-05-13 09:48:49,990 - MainThread - zhmcclient.api - DEBUG - <== NotificationReceiver.notifications(), result: <generator object notifications at 0x10cfeaa98>
2019-05-13 09:48:50,060 - StompReceiverThread-1 - stomp.py - DEBUG - listener <zhmcclient._notification._NotificationListener object at 0x10cfd6da0> has no method on_before_message
2019-05-13 09:48:50,060 - StompReceiverThread-1 - stomp.py - DEBUG - Received frame: 'MESSAGE', headers={'expires': '0', 'global-sequence-nr': '6384', 'session-sequence-nr': '0', 'destination': '/topic/sysprog.163osm.e12121dac50331178e79c46bd3531b90', 'subscription': 'zhmcclient.4513457376', 'priority': '4', 'notification-type': 'os-message', 'name': 'LP41', 'message-id': 'ID:PETHMC1-42915-1555356921407-5:1:1:65:1', 'object-uri': '/api/logical-partitions/e12121da-c503-3117-8e79-c46bd3531b90', 'object-id': 'e12121da-c503-3117-8e79-c46bd3531b90', 'class': 'logical-partition', 'timestamp': '1557755378078'}, body='{"os-messages":[{"message-text":"sed: can\'t read /etc/issue: No such file or directory\\n","message-id":"5147114","sequence-number":1272,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"System version .\\n","message-id":"5147115","sequence-number":1273,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"BL2 version 2.4.1 (private) start (Mon May 13 13:48:34 UTC 2019)\\n","message-id":"5147116","sequence-number":1543,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"Booting appliance from \'0.0.7800\' ...\\n","message-id":"5147117","sequence-number":1554,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"BL2 stage 1 complete.\\n","message-id":"5147118","sequence-number":1555,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"BL2 stage 2 complete.\\n","message-id":"5147119","sequence-number":1556,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"BL2 stage 3 complete.\\n","message-id":"5147128","sequence-number":1557,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"BL2 stage 4 complete.\\n","message-id":"5147129","sequence-number":1558,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"BL2 stage 5 complete.\\n","message-id":"5147130","sequence-number":1559,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"Switching to appliance.\\n","message-id":"5147185","sequence-number":1560,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"Running \'zaas\' level \'HEAD.private./var/lib/jenkins/workspace/zPOKProduction_1.0\\n","message-id":"5147186","sequence-number":1561,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"_Build_Release_Image-Release_Branches/scaas/appliance/ built on Mon Dec 17 14:32\\n","message-id":"5147187","sequence-number":1562,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":":27 2018 UTC\'.\\n","message-id":"5147188","sequence-number":1563,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":true,"prompt-text":"","is-held":false,"timestamp":-1}]}'
2019-05-13 09:49:10,294 - StompReceiverThread-1 - stomp.py - DEBUG - listener <zhmcclient._notification._NotificationListener object at 0x10cfd6da0> has no method on_before_message
2019-05-13 09:49:10,294 - StompReceiverThread-1 - stomp.py - DEBUG - Received frame: 'MESSAGE', headers={'expires': '0', 'global-sequence-nr': '6399', 'session-sequence-nr': '1', 'destination': '/topic/sysprog.163osm.e12121dac50331178e79c46bd3531b90', 'subscription': 'zhmcclient.4513457376', 'priority': '4', 'notification-type': 'os-message', 'name': 'LP41', 'message-id': 'ID:PETHMC1-42915-1555356921407-5:1:1:65:2', 'object-uri': '/api/logical-partitions/e12121da-c503-3117-8e79-c46bd3531b90', 'object-id': 'e12121da-c503-3117-8e79-c46bd3531b90', 'class': 'logical-partition', 'timestamp': '1557755398299'}, body='{"os-messages":[{"message-text":" \\n","message-id":"0","sequence-number":1564,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"Please connect to the IBM web UI via your browser\\n","message-id":"0","sequence-number":1565,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"The server is listening on: 10.20.81.73, fd55:faaf:e1ab:a45:f95b:66ce:56d3:e4,\\n","message-id":"0","sequence-number":1566,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"and fd55:faaf:e1ab:a45:0:ff:fefe:a34\\n","message-id":"0","sequence-number":1567,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":" \\n","message-id":"0","sequence-number":1568,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":" \\n","message-id":"0","sequence-number":1569,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1},{"message-text":"Initial Network Interface Summary:\\n","message-id":"0","sequence-number":1570,"os-name":null,"sound-alarm":false,"is-priority":false,"is-refresh":false,"prompt-text":"","is-held":false,"timestamp":-1}]}'
2019-05-13 09:49:10,294 - MainThread - root - INFO - Found a messages matching the phrase, Please connect to the IBM web UI via your browser
2019-05-13 09:49:10,295 - MainThread - root - INFO - Done with the receiver!
2019-05-13 09:49:10,295 - MainThread - root - INFO - closing....
2019-05-13 09:49:10,295 - MainThread - zhmcclient.api - DEBUG - ==> NotificationReceiver.close(), args: (<zhmcclient._notification.NotificationReceiver object at 0x10d05e4e0>,), kwargs: {}
2019-05-13 09:49:10,295 - MainThread - stomp.py - DEBUG - Sending frame: [b'DISCONNECT', b'\n', b'receipt:a51d0982-72bb-481f-83c7-c9a5542fc7ef\n', b'\n', b'\x00']
2019-05-13 09:49:10,299 - StompReceiverThread-1 - stomp.py - DEBUG - Received frame: 'RECEIPT', headers={'receipt-id': 'a51d0982-72bb-481f-83c7-c9a5542fc7ef'}, body=''
andy-maier commented 5 years ago

Communicating with Jim internally...

fbuehrer commented 5 years ago

Hi @andy-maier Has there been any follow-up on this? I am seeing the same behaviour as Jim, but tried it in both a python 3.7 and python 2.7 environment. I made sure that stomp-py is > 4.1.19. So I was wondering if you managed to figure out what the problem was. I am pretty much trying to do the same thing as Jim was. I.e. Looping through the operator-messages, checking for a specific message and triggering an action. But calling receiver.close() after that seems to hang indefinitely. Some insight is very much appreciated.

Greetings from Building 14 ;-)

andy-maier commented 4 years ago

@fbuehrer Hi Felix. Greetings back from building 4 :-) Sorry this took a while to respond. I have not figured out what is happening. I have put this on for the next version, and maybe we can get together to figure out what is going on.

andy-maier commented 4 years ago

Just for info: The minimum version of stomp.py has been increased to 4.1.21 in PR #610 (targeted for zhmcclient 0.26.0)

andy-maier commented 4 years ago

More info: The HMC WS API book for z15 states that the HMC uses Apache MQ 5.2.0.

The HMC WS API book does not detail which version of the stomp protocol is supported. The zhmcclient code currently uses the default protocol version provided by stomp.Connection which is stomp 1.1.

The stomp.Connection init supports a number of arguments related to timeout etc, none of which is specified currently by the zhmcclient code:

andy-maier commented 4 years ago

@fbuehrer @james-crowley I looked at the stomp.py source code and in the disconnect() method there is not much debug code. The only place that could possibly hang because it waits for a condition should not be called according to my analysis. So I think I will create a fork of the stomp.py repo, add some debug code and will ask you to install stomp from that fork and enable debug logging.

Stay tuned.

andy-maier commented 4 years ago

I have created a fork of stomp.py 4.1.22 and have added some debug code to its connect/disconnect code. Also, I suggest that we first track down the issue using the show_os_messages.py example in order for all of us to use the same code.

This is the entire procedure for reproducing it:

  1. Have the python environment active where you run zhmcclient
  2. Have the zhmcclient repo cloned and its master branch installed
  3. Install stomp.py from my fork:
    $ pip install --upgrade git+https://github.com/andy-maier/stomp.py.git@andy/debug-disconnect#egg=stomp.py

    The stomp.py package should now be shown with version 4.1.22.post1.

  4. Configure debug logging for the show_os_messages example in the config file for the zhmcclient examples (e.g. examples/hmccreds.yml):
    examples:
       . . .
       show_os_messages:
          hmc: "172.16.64.15"
          cpcname: CPCA
          partname: MGMT1
          loglevel: debug   # enable logging and set to debug level
          logmodule: ""     # for the Python root logger (i.e. all loggers)
          timestats: no
  5. From the zhmcclient repo's work directory, run its show_os_message.py example:
    $ examples/show_os_messages.py examples/hmccreds.yml

    Once it has displayed the OS messages, terminate it with Ctrl-C and report the debug log messages after that.

In my case, it terminates correctly and the log messages after terminating it are:

^CKeyboard interrupt - leaving receiver loop
Closing receiver...
2020-01-22 11:17:44,516 - MainThread - zhmcclient.api - DEBUG - Called: NotificationReceiver.close(), args: (<zhmcclient._notification.NotificationReceiver object at 0x110073410>,), kwargs: {}
2020-01-22 11:17:44,516 - MainThread - stomp.py - DEBUG - Disconnecting from server
2020-01-22 11:17:44,516 - MainThread - stomp.py - DEBUG - Sending frame: ['DISCONNECT', '\n', 'receipt:3f9965ec-a15d-419c-9468-3d05168a1fe0\n', '\n', '\x00']
2020-01-22 11:17:44,516 - MainThread - stomp.py - DEBUG - Waiting for receiver thread to exit (no timeout)
2020-01-22 11:17:44,629 - StompReceiverThread-1 - stomp.py - DEBUG - Received frame: 'RECEIPT', headers={'receipt-id': '3f9965ec-a15d-419c-9468-3d05168a1fe0'}, body=''
2020-01-22 11:17:44,629 - StompReceiverThread-1 - stomp.py - DEBUG - Received receipt with disconnect command
2020-01-22 11:17:44,629 - StompReceiverThread-1 - stomp.py - DEBUG - Issuing socket.unwrap() to close SSL socket
2020-01-22 11:17:44,633 - StompReceiverThread-1 - stomp.py - DEBUG - Issuing socket.close() to close socket
2020-01-22 11:17:44,633 - StompReceiverThread-1 - stomp.py - DEBUG - listener <zhmcclient._notification._NotificationListener object at 0x1101bf590> has no method on_receipt
2020-01-22 11:17:44,633 - StompReceiverThread-1 - stomp.py - INFO - Receiver loop ended
2020-01-22 11:17:44,633 - StompReceiverThread-1 - stomp.py - DEBUG - listener <zhmcclient._notification._NotificationListener object at 0x1101bf590> has no method on_receiver_loop_completed
2020-01-22 11:17:44,633 - MainThread - stomp.py - DEBUG - Receiver thread exited
2020-01-22 11:17:44,634 - MainThread - zhmcclient.api - DEBUG - Return: NotificationReceiver.close(), result: None
Logging off...
2020-01-22 11:17:44,634 - MainThread - zhmcclient.api - DEBUG - Called: Session.logoff(), args: (Session at 0x10fff1a50 ( _host='172.16.64.15', _userid='zbcInstall', _password='...', _get_password=None, _retry_timeout_config=<zhmcclient._session.RetryTimeoutConfig object at 0x10fff19d0>, _base_url='https://172.16.64.15:6794', _headers={'X-API-Session': '********', 'Content-type': 'application/json', 'Accept': '*/*'}, _session_id='********', _session=<requests.sessions.Session object at 0x10fff1fd0> ), False), kwargs: {}
2020-01-22 11:17:44,634 - MainThread - zhmcclient.hmc - DEBUG - Request: DELETE https://172.16.64.15:6794/api/sessions/this-session, headers: {'X-API-Session': '********', 'Content-type': 'application/json', 'Accept': '*/*'}, content(max.1000): None
2020-01-22 11:17:44,750 - MainThread - urllib3.connectionpool - DEBUG - https://172.16.64.15:6794 "DELETE /api/sessions/this-session HTTP/1.1" 204 0
2020-01-22 11:17:44,752 - MainThread - zhmcclient.hmc - DEBUG - Respons: DELETE https://172.16.64.15:6794/api/sessions/this-session, status: 204, headers: {'x-request-id': 'Sx83eace1c-3b66-11ea-930b-6923c3006ef5.ac Rx4', 'X-XSS-Protection': '1; mode=block', 'X-Content-Type-Options': 'nosniff', 'Content-Security-Policy': "img-src * data:; frame-ancestors 'self'", 'Strict-Transport-Security': 'max-age=0', 'Server': 'Hardware management console API web server / 2.0', 'Cache-control': 'no-cache', 'Date': 'Wed, 22 Jan 2020 10:17:44 GMT', 'X-Frame-Options': 'SAMEORIGIN'}, content(max.1000): ''
2020-01-22 11:17:44,754 - MainThread - zhmcclient.api - DEBUG - Return: Session.logoff(), result: None
Done.

Depending on how your log looks like, I may need to add more debug code.

fbuehrer commented 4 years ago

Hi Andreas,

thank you very much for looking into this. I actually don't have any trouble running the show_os_commands.py example. The problem I have is when trying to exit the loop over the notifications without raising a KeyboardInterrupt. E.g. Filtering on a specific message id and leaving the loop. I installed your patched version of stomp in my environment. One gotcha here: installing pyyaml via pip installed a newer version of stomp for me that I manually had to remove since the names of the packages were different (stomp.py vs stomp-py) and the patched module wasn't satisfying the pyyaml-requirements (and importing stomp imported v. 5..0.1 in that case).

I edited the show_os_commands example to show what I mean:

This is the modified try/except block. I just used any message ID that I found in the notification stack:

try:
    for headers, message in receiver.notifications():
        # print("# HMC notification #%s:" % headers['session-sequence-nr'])
        # sys.stdout.flush()
        os_msg_list = message['os-messages']
        for os_msg in os_msg_list:
            if PRINT_METADATA:
                msg_id = os_msg['message-id']
                held = os_msg['is-held']
                priority = os_msg['is-priority']
                prompt = os_msg.get('prompt-text', None)
                print("# OS message %s (held: %s, priority: %s, prompt: %r):" %
                      (msg_id, held, priority, prompt))
            msg_txt = os_msg['message-text'].strip('\n')
            print(msg_txt)
            sys.stdout.flush()
            if msg_id == 2010187:
                raise NameError
except KeyboardInterrupt:
    print("Keyboard interrupt - leaving receiver loop")
    sys.stdout.flush()
except NameError:
    print("Message found - leaving receiver loop")
    sys.stdout.flush()
finally:
    print("Closing receiver...")
    sys.stdout.flush()
    receiver.close()

And this is the debug output:

Message found - leaving receiver loop
Closing receiver...
2020-01-22 21:30:31,379 - MainThread - zhmcclient.api - DEBUG - Called: NotificationReceiver.close(), args: (<zhmcclient._notification.NotificationReceiver object at 0x1023fcd90>,), kwargs: {}
2020-01-22 21:30:31,379 - MainThread - stomp.py - DEBUG - Disconnecting from server
2020-01-22 21:30:31,379 - MainThread - stomp.py - DEBUG - Sending frame: [b'DISCONNECT', b'\n', b'receipt:d772b7ba-6847-4054-9fe5-a85531dedd58\n', b'\n', b'\x00']
2020-01-22 21:30:31,380 - MainThread - stomp.py - DEBUG - Waiting for receiver thread to exit (no timeout)
2020-01-22 21:30:31,443 - StompReceiverThread-1 - stomp.py - DEBUG - Received frame: 'RECEIPT', headers={'receipt-id': 'd772b7ba-6847-4054-9fe5-a85531dedd58'}, body=''
2020-01-22 21:30:31,443 - StompReceiverThread-1 - stomp.py - DEBUG - Received receipt with disconnect command
2020-01-22 21:30:31,443 - StompReceiverThread-1 - stomp.py - DEBUG - Issuing socket.unwrap() to close SSL socket
2020-01-22 21:30:31,443 - StompReceiverThread-1 - stomp.py - DEBUG - Issuing socket.close() to close socket

which is where the program flow stops and can only be interrupted with a KeayboardInterrupt:

  File "show_os_messages.py", line 149, in <module>
    receiver.close()
  File "</Users/felix/.local/share/virtualenvs/AVTestCLI-U8H9kL1M/lib/python3.7/site-packages/decorator.py:decorator-gen-170>", line 2, in close
  File "/Users/felix/.local/share/virtualenvs/AVTestCLI-U8H9kL1M/src/zhmcclient/zhmcclient/_logging.py", line 223, in log_api_call
    result = func(*args, **kwargs)
  File "/Users/felix/.local/share/virtualenvs/AVTestCLI-U8H9kL1M/src/zhmcclient/zhmcclient/_notification.py", line 248, in close
    self._conn.disconnect()
  File "/Users/felix/.local/share/virtualenvs/AVTestCLI-U8H9kL1M/src/stomp-py/stomp/connect.py", line 185, in disconnect
    self.transport.stop()
  File "/Users/felix/.local/share/virtualenvs/AVTestCLI-U8H9kL1M/src/stomp-py/stomp/transport.py", line 123, in stop
    self.__receiver_thread_exit_condition.wait()
  File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 296, in wait
    waiter.acquire()
KeyboardInterrupt

I'm happy to debug this further if you have any idea what to try.

andy-maier commented 4 years ago

@fbuehrer So the main thread is waiting for notification from the receiver thread which does not happen.

Looking into the code, I have a hard time figuring out how the normal exit would happen. I therefore added some more debug log statements to the stomp branch in my fork (now showing version 4.1.22.post2). Please reinstall stomp.py from the fork (same branch), and run your modified script again that produces the issue.

fbuehrer commented 4 years ago

@andy-maier I don't get any additional debug output with the updated stomp module (I double-checked with stomp.version and verified that it reports as (4, 1, 22, 'post2') ).

Still stalls at

2020-01-23 12:20:59,344 - StompReceiverThread-1 - stomp.py - DEBUG - Issuing socket.close() to close socket
fbuehrer commented 4 years ago

Hi @andy-maier

by accident I was running the script with stomp 5.0.1 (which was released just a couple of weeks ago) and the disconnect properly works. I double checked and switched back to stomp 4.1.22 and the error is back. So it seems that going from stomp 4 to stomp 5 actually fixes the Problem. Maybe @james-crowley can check if he is seeing the same behaviour?

andy-maier commented 4 years ago

The issue with stomp 5 is that the package owner removed python 2 support. But very good to know, of course.

andy-maier commented 4 years ago

I have now added support to the show_os_messages.py example to break upon receipt of a configurable message ID, and was able to reproduce the hang upon close(). That is in PR #615.

I also picked up a few lines of code from stomp PR https://github.com/jasonrbriggs/stomp.py/pull/266 (which went into stomp.py version 5.0.0) that fix a double disconnect notification issue, and with that fix, the example no longer hangs. That updated stomp code is now on my fork of stomp, as version 4.1.22.post3.

@fbuehrer @james-crowley Can you please validate whether that fix works for you?

fbuehrer commented 4 years ago

Hi @andy-maier. I can confirm that the problem does not persist anymore using stomp.py=4.1.22.post3.

andy-maier commented 4 years ago

@fbuehrer Hi Felix, that sounds very good to me. I'll work on getting this fix then into stomp 4.1.

andy-maier commented 4 years ago

I created issue https://github.com/jasonrbriggs/stomp.py/issues/284 and provided a branch with the fix for 4.1.22 in my stomp.py fork.

andy-maier commented 4 years ago

Jason picked up the fix and released a stomp.py version 4.1.23. I just verified that the fix resolves the hang issue in the show_os_messages.py example when leaving the loop based on message ID.

I'll increase the minimum version of stomp.py in zhmcclient accordingly.

andy-maier commented 4 years ago

PR #616 increases the minimum version of stomp.py to 4.1.23. It will be released soon as part of zhmcclient 0.26.0.