Closed by adammoody 4 years ago
Thank you, we will be working on a patch and get back shortly.
Something we noticed is that the
We have a fix for this and are working to publish the changes shortly.
Adam, if you check out the latest master copy of this PSM (v10.3.58), your issue should be resolved. Can you let us know
While doing some memory debugging, I’m getting a buffer overrun hit in the psm2 library. It looks to be coming from a sscanf call on a 4KB buffer holding a string that apparently has no terminating NULL character.
The call stack is the following:
__psm2_ep_open()
psmi_ep_open_device()
hfi_get_port_gid()
The buffer overrun happens at a sscanf() call in opa/opa_service.c on the gid_str string:
https://github.com/intel/opa-psm2/blob/master/opa/opa_service.c#L578
The gid_str buffer is 4KB long (sysfs_page_size) and is allocated in hfi_sysfs_port_read() --> read_page() in opa_sysfs.c
https://github.com/intel/opa-psm2/blob/master/opa/opa_sysfs.c#L357
It fills the buffer with content from this function, which reads in up to 4KB of data from a file descriptor:
https://github.com/intel/opa-psm2/blob/master/opa/opa_sysfs.c#L416
That contains a string like the following:
“0123:0000:0000:0000:0000:4567:0123:0000\177\312 …”
The sscanf is trying to pick off the first 8 hex values separated by colons, but the string seems to have no NULL anywhere in the 4KB buffer, so sscanf overruns the end of the buffer. If I force a NULL in there, sscanf does not overrun the buffer.
One fix could be to terminate the buffer with a NULL.
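To illustrate the idea, here is a minimal sketch of that fix. It is not the psm2 code and not necessarily the patch that was published. The helper names read_terminated() and parse_gid() are made up for this example, and the sscanf format string is an assumption based on the "8 hex values separated by colons" description above. The point is to read at most one byte fewer than the buffer size and always write a terminator after the bytes that were actually read, so sscanf stops inside the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not psm2's read_page()): read at most
 * buf_size - 1 bytes from fd and always terminate the result.
 * Returns the number of bytes read, or -1 on error.
 * A single read() may return fewer bytes than are available;
 * that is fine for a short sysfs attribute but is a simplification. */
static ssize_t read_terminated(int fd, char *buf, size_t buf_size)
{
    assert(buf != NULL && buf_size > 0);

    ssize_t n = read(fd, buf, buf_size - 1);
    if (n < 0) {
        buf[0] = '\0';   /* leave an empty string on error */
        return -1;
    }
    buf[n] = '\0';       /* terminate right after the data that was read */
    return n;
}

/* Hypothetical helper: parse eight colon-separated hex groups
 * (at most 4 hex digits each) from a terminated string.
 * Returns 0 on success, -1 if fewer than eight groups matched. */
static int parse_gid(const char *s, unsigned int gid[8])
{
    int matched = sscanf(s, "%4x:%4x:%4x:%4x:%4x:%4x:%4x:%4x",
                         &gid[0], &gid[1], &gid[2], &gid[3],
                         &gid[4], &gid[5], &gid[6], &gid[7]);
    return matched == 8 ? 0 : -1;
}
```

With the terminator written at buf[n], whatever stale bytes follow the data in the 4KB page are never seen by sscanf.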