ibm-messaging / mq-golang

Calling IBM MQ from Go applications
Apache License 2.0

Getting [signal SIGSEGV: segmentation violation code=0x1 addr=0x40 pc=0x7fe2df901be7] when run from container image in Kube on-prem. #166

Closed afernandezod closed 3 years ago

afernandezod commented 3 years ago

I get the above error when calling MQManager, err = ibmmq.Connx(qMgrName, cno). It is a TLS connection, and I have already verified that all parameters for cno, csp, sco and cd are correct. It only happens when the service runs in a Docker container in Kube. It runs fine as a local Go program and when the container is run locally. Firewall restrictions were also checked and look fine.

Actual code:

// MQconnect - to connect to MQ
func MQconnect(conf config.ConfigInterface) (bool, ibmmq.MQQueueManager) {

var qMgrName string
resp := true

// Allocate the MQCNO structure needed for the CONNX call.
cno := ibmmq.NewMQCNO()
cd := ibmmq.NewMQCD()
csp := ibmmq.NewMQCSP()
sco := ibmmq.NewMQSCO()

qMgrName = conf.GetString("MQ.MngrName")

cd.ChannelName = conf.GetString("MQ.channel")
cd.ConnectionName = conf.GetString("MQ.host")
cd.SSLCipherSpec = conf.GetString("MQ.cipher")
cd.SSLClientAuth = ibmmq.MQSCA_OPTIONAL
sco.KeyRepository = conf.GetString("MQ.keyrepository")
sco.CertificateLabel = conf.GetString("MQ.certificateLabel")
cd.CertificateLabel = conf.GetString("MQ.certificateLabel")

csp.AuthenticationType = ibmmq.MQCSP_AUTH_USER_ID_AND_PWD
csp.UserId = conf.GetString("MQ.user")
csp.Password = conf.GetString("MQ.pass")

mqName := conf.GetString("MQ.name")
MQname = mqName

// Make the CNO refer to the CSP, CD and SCO structure so it gets used during the connection
cno.SecurityParms = csp
cno.ClientConn = cd
cno.SSLConfig = sco

// Indicate that we definitely want to use the client connection method.
cno.Options = ibmmq.MQCNO_CLIENT_BINDING

log.Info(cno)
log.Info(cd)
log.Info(csp)
log.Info(sco)
MQManager, err = ibmmq.Connx(qMgrName, cno)

if err == nil {
    resp = true
} else {
    resp = false
}

return resp, MQManager

}

Docker File entries (to install and use mq):

# copy/unpack mq client
RUN mkdir -p /opt/mqm
COPY 9.1.0.7-IBM-MQC-Redist-LinuxX64.tar.gz /opt/mqm/
RUN cd /opt/mqm \
    && tar -xvf ./*.tar.gz \
    && rm -f ./*.tar.gz \
    && chmod a+rx /opt/mqm

COPY --from=builder /opt/mqm /opt/mqm
RUN chmod -R a+rx /opt/mqm \
    && ls -lta /opt/mqm/

Any help will be greatly appreciated!!!!

ibmmqmet commented 3 years ago

Difficult to know for sure.

But one problem that has been seen with some container configurations, particularly those running under restricted security configurations in OpenShift, is that the MQ client can't access the directory it needs to log errors. It's normally under $HOME but that might not be set or available for some containers. See this PR which made a change for those programs to explicitly create and set permissions on a directory. Or use the MQ_OVERRIDE_DATA_PATH env var as documented here.

I wouldn't expect that failure to access the log directories to cause a SEGV in current versions of MQ, as we fixed a problem in that area a few years ago. You should just get a more normal MQRC failure code. But if you're using a 9.1 fixpack level, then it's possible that change wasn't backported to the LTS version.

afernandezod commented 3 years ago

Thanks @ibmmqmet.

I think we got past the segmentation violation by using a newer version of the tar.gz file: we were using 9.1.0.7 and have now changed to 9.2.2.0. In addition, we added write permission to the /opt/mqm and /IBM folders.

Now, when run from Kube on-prem, we are getting "MQDISC: MQCC = MQCC_FAILED [2] MQRC = MQRC_CONNECTION_BROKEN [2009]", which I believe already existed before and was just masked by the segmentation problem. Also, before the broken connection error, we are getting "AMQ6300E: Directory '//.mqm' could not be created: 'EACCES - Permission denied'.", which I had hoped would be resolved by adding write permission.
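When triaging container logs, it can help to pull the verb and the numeric codes out of MQI error lines like the one quoted above. The following is a minimal sketch that assumes the line format shown in this thread; the names `mqiError` and `parseMQIError` are hypothetical helpers, not part of mq-golang. Inside the Go program itself, the mq-golang samples instead type-assert the returned error to `*ibmmq.MQReturn` and read its `MQCC` and `MQRC` fields directly.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// mqiError holds the pieces of one MQI error line (hypothetical type).
type mqiError struct {
	Verb string // MQI verb, e.g. "MQDISC"
	MQCC int    // completion code, e.g. 2
	MQRC int    // reason code, e.g. 2009
	Name string // symbolic reason, e.g. "MQRC_CONNECTION_BROKEN"
}

// mqiLine matches the message format quoted in this thread. Other
// formats are an open question and are simply not matched.
var mqiLine = regexp.MustCompile(
	`(MQ[A-Z0-9]+): MQCC = \w+ \[(\d+)\] MQRC = (\w+) \[(\d+)\]`)

// parseMQIError extracts the verb and codes from a log line.
// The boolean is false when the line does not look like an MQI error.
func parseMQIError(line string) (mqiError, bool) {
	m := mqiLine.FindStringSubmatch(line)
	if m == nil {
		return mqiError{}, false
	}
	// The regexp only captures digits, so Atoi can fail only on overflow.
	cc, _ := strconv.Atoi(m[2])
	rc, _ := strconv.Atoi(m[4])
	return mqiError{Verb: m[1], MQCC: cc, MQRC: rc, Name: m[3]}, true
}

func main() {
	line := `MQDISC: MQCC = MQCC_FAILED [2] MQRC = MQRC_CONNECTION_BROKEN [2009]`
	if e, ok := parseMQIError(line); ok {
		fmt.Printf("%s failed: %s (%d)\n", e.Verb, e.Name, e.MQRC)
	}
}
```

Lines that are not MQI errors, such as the AMQ6300E message, are reported as non-matching rather than guessed at.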

At this point it looks like there are further permission issues inside the Docker container.

Here are the changes in the Dockerfile:

# copy/unpack mq client
RUN mkdir -p /opt/mqm \
    && mkdir -p /IBM/MQ/data/errors

# COPY 9.1.0.7-IBM-MQC-Redist-LinuxX64.tar.gz /opt/mqm/
COPY 9.2.2.0-IBM-MQC-Redist-LinuxX64.tar.gz /opt/mqm/
RUN cd /opt/mqm \
    && tar -xvf ./*.tar.gz \
    && rm -f ./*.tar.gz \
    && bin/genmqpkg.sh -b /opt/mqm \
    && chmod -R a+rwx /opt/mqm \
    && chmod -R a+rwx /IBM

. . .
COPY --from=builder /opt/mqm /opt/mqm
COPY --from=builder /IBM /IBM
RUN chmod -R a+rwx /opt/mqm \
    && ls -lta /opt/mqm/ \
    && chmod -R a+rwx /IBM \
    && ls -lta /IBM/

I'm still investigating from sources like https://github.com/ibm-messaging/mq-container/issues/310 but again, any help will be greatly appreciated.

ibmmqmet commented 3 years ago

Did you look at the PR I linked to before? It has this diff:

    && mkdir -p /IBM/MQ/data/errors \
    && mkdir -p /.mqm \
    && chmod -R 777 /IBM \
    && chmod -R 777 /.mqm

There should be no reason to change any permissions on /opt/mqm - that's for code, not data.

afernandezod commented 3 years ago

Once more, thanks @ibmmqmet.

After reviewing the code and discussing some security concerns, we changed the chmod from a+rwx to 777 for /.mqm, and it worked.

Really appreciate it.