Closed Greg091 closed 5 years ago
The first problem is that an ASSERT is triggered when the alignment file isn't found. This should just fail the open/create instead. This is on us to fix.
The other problem is that a kernel interface disappeared in a minor version release. The long-term solution, since we already have a libndctl dependency, is to purge all manual sysfs parsing and retrieve that information from ndctl instead. The bigger problem, however, is that current versions of PMDK will be broken with newer kernels...
This is the function that fails: https://github.com/pmem/pmdk/blob/master/src/common/file_posix.c#L210
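For the first problem, a minimal sketch of what "fail the open/create instead" could look like is given here. It is not the PMDK code: `read_align_file` is a hypothetical helper name, and the choice of errno values is an assumption. The idea is only that a missing or unreadable alignment file is a runtime condition reported to the caller, while assertions stay reserved for programmer errors.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not a PMDK function: read an alignment value from
 * a sysfs-style text file. Returns 0 on success, or -1 with errno set so
 * the caller can fail its open/create path instead of aborting.
 */
static int
read_align_file(const char *path, unsigned long *align)
{
	/* NULL arguments are programmer errors, so asserting is fine here */
	assert(path != NULL && align != NULL);

	FILE *f = fopen(path, "r");
	if (f == NULL)
		return -1; /* a missing file is a runtime condition: report it */

	char buf[40];
	char *line = fgets(buf, sizeof(buf), f);
	fclose(f);
	if (line == NULL) {
		errno = EIO; /* file opened but nothing could be read */
		return -1;
	}

	char *end;
	errno = 0;
	unsigned long val = strtoul(buf, &end, 0);
	if (end == buf || errno != 0) {
		errno = EINVAL; /* contents are not a usable number */
		return -1;
	}
	*align = val;
	return 0;
}
```

A caller would check the return value and propagate the failure, so a kernel that no longer exposes the file produces an error from open/create rather than an abort.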
Which file disappeared, exactly? Is this related to the conversion from a class to a bus device model?
It does look like the bus model also changes the relative location of the align value:
In the class device case it's here: $(readlink -f /sys/dev/char/253\:1)/../../dax_region/align
In the bus case it's here: $(readlink -f /sys/dev/char/253\:1)/../dax_region/align
That said, this path referenced in the code is not the device-dax instance alignment:
/sys/dev/char/%u:%u/device/align
That's the parent NVDIMM configuration device that, for maximum confusion, also happens to be named "daxX.Y". The values are effectively the same, but in a future where not all device-dax instances are parented by NVDIMM devices, that path looks in the wrong location and is guaranteed to break.
I think the path forward is to teach this code to walk the hierarchy back to the dax_region. I'll send a patch.
Here is the proposed routine. It walks up the device path looking for 'dax_region':
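#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <unistd.h>

int dax_align(char *daxpath, unsigned long *align)
{
	char path[PATH_MAX];
	char *resolved;
	struct stat st;

	if (!align)
		return EINVAL;
	if (stat(daxpath, &st))
		return errno;
	/* resolve the character device's sysfs node to its real device path */
	snprintf(path, sizeof(path), "/sys/dev/char/%u:%u",
			major(st.st_rdev), minor(st.st_rdev));
	resolved = realpath(path, NULL);
	if (!resolved)
		return errno;
	strncpy(path, resolved, sizeof(path) - 1);
	path[sizeof(path) - 1] = '\0';
	free(resolved); /* realpath() allocated this buffer */

	/* strip one trailing component per pass, probing for dax_region/align */
	while (path[0] == '/') {
		char *pos = strrchr(path, '/');
		char buf[40];
		ssize_t nread;
		size_t len;
		int fd, err;

		if (!pos)
			break;
		*pos = '\0';
		len = strlen(path);
		snprintf(&path[len], sizeof(path) - len, "/dax_region/align");
		fd = open(path, O_RDONLY);
		*pos = '\0'; /* drop the probe suffix again */
		if (fd < 0)
			continue;
		nread = read(fd, buf, sizeof(buf) - 1);
		err = errno; /* save before close() can overwrite it */
		close(fd);
		if (nread < 0)
			return err;
		buf[nread] = '\0'; /* read() does not NUL-terminate */
		*align = strtoul(buf, NULL, 0);
		return 0;
	}
	return ENOENT;
}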
Verified on: 1.6-114-g7a39dd66e
Environment Information
Please provide a reproduction of the bug:
A lot of PMDK tests fail for the same reason. Below I put an example:
How often bug is revealed: (always, often, rare):
always
Actual behavior:
As above.
Expected behavior:
Tests should pass.