ibm-messaging / mq-golang

Calling IBM MQ from Go applications
Apache License 2.0

Deadlock at mqmetric.collectQueueStatus() #170

Closed liurui-1 closed 3 years ago

liurui-1 commented 3 years ago

We have a question about the Go SDK for IBM MQ ( https://github.com/ibm-messaging/mq-golang/tree/79e82b431c9febfc4791fb8b2b37f1c33dab017f ). Our agent has been blocked in mqmetric.CollectQueueStatus() for days. The stack trace follows:

goroutine 1202789 [syscall]:
github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/ibmmq._Cfunc_MQGET(0x687e000011, 0xc001052000, 0xc002b043f0, 0x2800, 0xc000c69000, 0xc002381834, 0xc002381830, 0xc0023817ec)
    _cgo_gotypes.go:1179 +0x4a
github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/ibmmq.MQObject.getInternal.func1(0xc000000068, 0xc000558c00, 0xc00217ede0, 0x14, 0xc001052000, 0xc002b043f0, 0x2800, 0xc000c69000, 0xc002381834, 0xc002381830, ...)
    github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/ibmmq/mqi.go:690 +0x130
github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/ibmmq.MQObject.getInternal(0xc000000068, 0xc000558c00, 0xc00217ede0, 0x14, 0xc0005ca640, 0xc00044b1f0, 0xc000c69000, 0x2800, 0x2800, 0xc000494200, ...)
    github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/ibmmq/mqi.go:690 +0x271
github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/ibmmq.MQObject.Get(...)
    github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/ibmmq/mqi.go:616
github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/mqmetric.statusGetReply(0xc000558c00, 0x0, 0x0, 0x0, 0x0, 0x2700, 0x7f8a136e0a20, 0xc001a49ce0)
    github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/mqmetric/status.go:206 +0xf2
github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/mqmetric.collectQueueStatus(0xc000558c00, 0x7f8a130e7009, 0x1, 0x1, 0xffffffffffffffff, 0x0)
    github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/mqmetric/queue.go:238 +0x37a
github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/mqmetric.CollectQueueStatus(0xc000558c00, 0x7f8a130e7009, 0x1, 0xc0008e4120, 0x0)
    github.ibm.com/Unified-Agent/ibmmq/vendor/github.com/ibm-messaging/mq-golang/v5/mqmetric/queue.go:189 +0x2d8
github.ibm.com/Unified-Agent/ibmmq.(*Ibmmq).GatherQueues(0xc0000bc410, 0xc000d8c720, 0xd, 0xc0008e4101, 0x1)
    github.ibm.com/Unified-Agent/ibmmq/ibmmq.go:584 +0x27c5
github.ibm.com/Unified-Agent/ibmmq.(*Ibmmq).Gather.func1(0xc0020ee8f0, 0xc0000bc410, 0xc000d8c720, 0xd)
    github.ibm.com/Unified-Agent/ibmmq/ibmmq.go:247 +0x3c5
created by github.ibm.com/Unified-Agent/ibmmq.(*Ibmmq).Gather
    github.ibm.com/Unified-Agent/ibmmq/ibmmq.go:232 +0x198

ibmmq reported no error messages during this deadlock, so we suspect that some special scenario caused MQ to return 2033 (MQRC_NO_MSG_AVAILABLE) repeatedly while collecting queue status. Please review the code at

GatherQueues() in github.com/ibm-messaging/mq-golang/v5/mqmetric/queue.go and statusGetReply() in github.com/ibm-messaging/mq-golang/v5/mqmetric/status.go

Do you think this could be the root cause of the deadlock? The logic is the same even in the latest code of the MQ Go SDK.

ibmmqmet commented 3 years ago
liurui-1 commented 3 years ago

Hi @ibmmqmet ,

In the following code, the loop returns only when err == nil and cfh.Control == ibmmq.MQCFC_LAST, so it can deadlock in some special scenarios: GatherQueues() in github.com/ibm-messaging/mq-golang/v5/mqmetric/queue.go and statusGetReply() in github.com/ibm-messaging/mq-golang/v5/mqmetric/status.go

We will upgrade to the new SDK version, but we can see that the code logic above has not been changed.

ibmmqmet commented 3 years ago

Should be fixed in current releases.