pmem / issues

Old issues repo for PMDK.
http://pmem.io
13 stars 7 forks source link

Tests fail because of can't a read alignment of devdax #1071

Closed Greg091 closed 5 years ago

Greg091 commented 5 years ago

Environment Information

Please provide a reproduction of the bug:

A lot of PMDK tests fail for the same reason. Below I put an example:

$ ./RUNTESTS obj_basic_integration/ -s TEST7
obj_basic_integration/TEST7: SETUP (check/pmem/debug)
obj_basic_integration/TEST7 crashed (signal 6). err7.log below.
err7.log below.

out7.log below.
obj_basic_integration/TEST7 out7.log obj_basic_integration/TEST7: START: obj_basic_integration
obj_basic_integration/TEST7 out7.log  ./obj_basic_integration /dev/dax1.5

pmem7.log below.
obj_basic_integration/TEST7 pmem7.log <libpmem>: <1> [out.c:236 out_init] pid 28580: program: /home/jenkins/greg/pmdk/src/test/obj_basic_integration/obj_basic_integration
obj_basic_integration/TEST7 pmem7.log <libpmem>: <1> [out.c:238 out_init] libpmem version 1.1
obj_basic_integration/TEST7 pmem7.log <libpmem>: <1> [out.c:242 out_init] src version: 1.6+git237.g3ca1cd956
obj_basic_integration/TEST7 pmem7.log <libpmem>: <1> [out.c:250 out_init] compiled with support for Valgrind pmemcheck
obj_basic_integration/TEST7 pmem7.log <libpmem>: <1> [out.c:255 out_init] compiled with support for Valgrind helgrind
obj_basic_integration/TEST7 pmem7.log <libpmem>: <1> [out.c:260 out_init] compiled with support for Valgrind memcheck
obj_basic_integration/TEST7 pmem7.log <libpmem>: <1> [out.c:265 out_init] compiled with support for Valgrind drd
obj_basic_integration/TEST7 pmem7.log <libpmem>: <1> [out.c:270 out_init] compiled with support for shutdown state
obj_basic_integration/TEST7 pmem7.log <libpmem>: <1> [out.c:275 out_init] compiled with libndctl 63+
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [mmap.c:67 util_mmap_init] 
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [libpmem.c:56 libpmem_init] 
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [pmem.c:784 pmem_init] 
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:419 pmem_init_funcs] 
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:368 pmem_cpuinfo_to_funcs] 
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:372 pmem_cpuinfo_to_funcs] clflush supported
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:376 pmem_cpuinfo_to_funcs] clflushopt supported
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:388 pmem_cpuinfo_to_funcs] clwb supported
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:281 use_avx_memcpy_memset] avx supported
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:285 use_avx_memcpy_memset] PMEM_AVX not set or not == 1
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:325 use_avx512f_memcpy_memset] avx512f supported
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:329 use_avx512f_memcpy_memset] PMEM_AVX512F not set or not == 1
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [pmem.c:216 pmem_has_auto_flush] 
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [os_auto_flush_linux.c:114 check_domain_in_region] region_path: /sys/bus/nd/devices/region0
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [os_auto_flush_linux.c:59 check_cpu_cache] domain_path: /sys/bus/nd/devices/region0/persistence_domain
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:472 pmem_init_funcs] Flushing CPU cache
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:483 pmem_init_funcs] using clwb
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [init.c:501 pmem_init_funcs] using movnt SSE2
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [pmem_posix.c:107 pmem_os_init] 
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [libpmem.c:69 libpmem_fini] 
obj_basic_integration/TEST7 pmem7.log <libpmem>: <3> [mmap.c:101 util_mmap_fini] 

pmemblk7.log below.
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <1> [out.c:236 out_init] pid 28578: program: /home/jenkins/greg/pmdk/src/tools/pmempool/pmempool.static-debug
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <1> [out.c:238 out_init] libpmemblk version 1.1
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <1> [out.c:242 out_init] src version: 1.6+git237.g3ca1cd956
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <1> [out.c:250 out_init] compiled with support for Valgrind pmemcheck
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <1> [out.c:255 out_init] compiled with support for Valgrind helgrind
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <1> [out.c:260 out_init] compiled with support for Valgrind memcheck
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <1> [out.c:265 out_init] compiled with support for Valgrind drd
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <1> [out.c:270 out_init] compiled with support for shutdown state
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <1> [out.c:275 out_init] compiled with libndctl 63+
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <3> [mmap.c:67 util_mmap_init] 
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <3> [libpmemblk.c:118 libpmemblk_init] 
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <3> [libpmemblk.c:130 libpmemblk_fini] 
obj_basic_integration/TEST7 pmemblk7.log <libpmemblk>: <3> [mmap.c:101 util_mmap_fini] 

pmemlog7.log below.
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <1> [out.c:236 out_init] pid 28578: program: /home/jenkins/greg/pmdk/src/tools/pmempool/pmempool.static-debug
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <1> [out.c:238 out_init] libpmemlog version 1.1
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <1> [out.c:242 out_init] src version: 1.6+git237.g3ca1cd956
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <1> [out.c:250 out_init] compiled with support for Valgrind pmemcheck
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <1> [out.c:255 out_init] compiled with support for Valgrind helgrind
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <1> [out.c:260 out_init] compiled with support for Valgrind memcheck
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <1> [out.c:265 out_init] compiled with support for Valgrind drd
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <1> [out.c:270 out_init] compiled with support for shutdown state
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <1> [out.c:275 out_init] compiled with libndctl 63+
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <3> [mmap.c:67 util_mmap_init] 
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <3> [libpmemlog.c:118 libpmemlog_init] 
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <3> [libpmemlog.c:130 libpmemlog_fini] 
obj_basic_integration/TEST7 pmemlog7.log <libpmemlog>: <3> [mmap.c:101 util_mmap_fini] 

Last 30 lines of pmemobj7.log below (whole file has 42 lines).
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [obj.c:183 obj_ctl_init_and_load] pop (nil)
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [ctl.c:424 ctl_load_config_from_string] ctl (nil) ctx (nil) cfg_string "fallocate.at_create=0;"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [ctl.c:301 ctl_query] ctl (nil) ctx (nil) source 2 name fallocate.at_create type 1 arg 0x8e8ed4
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [ctl.c:79 ctl_find_node] nodes 0x7f5d47644900 name fallocate.at_create indexes 0x7ffc08eac318
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [set.c:124 util_remote_init] 
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [obj.c:1328 pmemobj_createU] path /dev/dax1.5 layout basic poolsize 0 mode 600
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [obj.c:1298 obj_get_nlanes] 
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [set.c:3355 util_pool_create] setp 0x7ffc08eac308 path /dev/dax1.5 poolsize 0 minsize 8388608 minpartsize 2097152 attr 0x7ffc08eac290 nlanes 0x7ffc08eac304 can_have_rep 1
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [set.c:3115 util_pool_create_uuids] setp 0x7ffc08eac308 path /dev/dax1.5 poolsize 0 minsize 8388608 minpartsize 2097152 pattr 0x7ffc08eac290 nlanes 0x7ffc08eac304 can_have_rep 1 remote 0
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file.c:131 util_file_exists] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [set.c:2124 util_poolset_create_set] setp 0x7ffc08eac308 path /dev/dax1.5 poolsize 0 minsize 8388608
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file.c:223 util_file_get_type] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file.c:131 util_file_exists] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file.c:559 util_file_open] path "/dev/dax1.5" size 0x7ffc08eac118 minsize 0 flags 0
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file.c:258 util_file_get_size] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file.c:223 util_file_get_type] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file.c:131 util_file_exists] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file.c:67 device_dax_size] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [set.c:1703 util_poolset_single] path /dev/dax1.5 filesize 6339690496 create 0
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file.c:223 util_file_get_type] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file.c:131 util_file_exists] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file_posix.c:291 util_file_device_dax_alignment] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [file_posix.c:212 device_dax_alignment] path "/dev/dax1.5"
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <1> [file_posix.c:230 device_dax_alignment] open "/sys/dev/char/252:11/device/align": No such file or directory
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <1> [set.c:1752 util_poolset_single] assertion failure: rep->part[0].alignment (0x0) != 0 (0x0)
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [libpmemobj.c:65 libpmemobj_fini] 
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [obj.c:325 obj_fini] 
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [set.c:139 util_remote_fini] 
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [set.c:194 util_remote_unload] 
obj_basic_integration/TEST7 pmemobj7.log <libpmemobj>: <3> [mmap.c:101 util_mmap_fini] 

pmempool7.log below.
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <1> [out.c:236 out_init] pid 28578: program: /home/jenkins/greg/pmdk/src/tools/pmempool/pmempool.static-debug
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <1> [out.c:238 out_init] libpmempool version 1.3
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <1> [out.c:242 out_init] src version: 1.6+git237.g3ca1cd956
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <1> [out.c:250 out_init] compiled with support for Valgrind pmemcheck
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <1> [out.c:255 out_init] compiled with support for Valgrind helgrind
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <1> [out.c:260 out_init] compiled with support for Valgrind memcheck
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <1> [out.c:265 out_init] compiled with support for Valgrind drd
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <1> [out.c:270 out_init] compiled with support for shutdown state
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <1> [out.c:275 out_init] compiled with libndctl 63+
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <3> [mmap.c:67 util_mmap_init] 
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <3> [libpmempool.c:69 libpmempool_init] 
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <3> [set.c:124 util_remote_init] 
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <3> [libpmempool.c:85 libpmempool_fini] 
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <3> [set.c:194 util_remote_unload] 
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <3> [set.c:139 util_remote_fini] 
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <3> [set.c:194 util_remote_unload] 
obj_basic_integration/TEST7 pmempool7.log <libpmempool>: <3> [mmap.c:101 util_mmap_fini] 

rpmem7.log below.
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <1> [out.c:236 out_init] pid 28578: program: /home/jenkins/greg/pmdk/src/tools/pmempool/pmempool.static-debug
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <1> [out.c:238 out_init] librpmem version 1.3
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <1> [out.c:242 out_init] src version: 1.6+git237.g3ca1cd956
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <1> [out.c:250 out_init] compiled with support for Valgrind pmemcheck
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <1> [out.c:255 out_init] compiled with support for Valgrind helgrind
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <1> [out.c:260 out_init] compiled with support for Valgrind memcheck
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <1> [out.c:265 out_init] compiled with support for Valgrind drd
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <1> [out.c:270 out_init] compiled with support for shutdown state
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <1> [out.c:275 out_init] compiled with libndctl 63+
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <3> [librpmem.c:61 librpmem_init] 
obj_basic_integration/TEST7 rpmem7.log <librpmem>: <3> [librpmem.c:77 librpmem_fini] 

trace7.log below.
obj_basic_integration/TEST7 trace7.log {obj_basic_integration.c:653 main} obj_basic_integration/TEST7: START: obj_basic_integration
obj_basic_integration/TEST7 trace7.log  ./obj_basic_integration /dev/dax1.5

RUNTESTS: stopping: obj_basic_integration//TEST7 failed, TEST=check FS=pmem BUILD=debug

How often bug is revealed: (always, often, rare):

always

Actual behavior:

As above.

Expected behavior:

Tests should pass.

Details

Additional information about Priority and Help Requested:

Are you willing to submit a pull request with a proposed change? (Yes, No)

Requested priority: (Showstopper, High, Medium, Low)

pbalcer commented 5 years ago

The first problem is that an ASSERT is triggered when alignment file isn't found. This should just fail the open/create instead. This is on us to fix.

The other problem is that a kernel interface disappeared in a minor version release. Now, the long-term solution, since we already have libndctl dependency, is to simply purge all manual sysfs parsing and instead retrieve that from ndctl. However, the bigger problem is that current versions of PMDK will be broken with newer kernels...

pbalcer commented 5 years ago

This is the function that fails: https://github.com/pmem/pmdk/blob/master/src/common/file_posix.c#L210

JeffMoyer commented 5 years ago

Which file disappeared, exactly? Is this related with the conversion from a bus to a class device model?

djbw commented 5 years ago

It does look like the bus model also changes the relative location of the align value:

In the class device case it's here: $(readlink -f /sys/dev/char/253\:1)/../../dax_region/align

In the bus case it's here: $(readlink -f /sys/dev/char/253\:1)/../dax_region/align

djbw commented 5 years ago

That said, this path referenced in the code is not the device-dax instance alignment:

/sys/dev/char/%u:%u/device/align

That's the parent NVDIMM configuration device that, for maximum confusion, also happens to be named "daxX.Y". The values are effectively the same, but in a future where not all device-dax instances are parented by NVDIMM devices that path is looking in the wrong location and is guaranteed to break.

I think the path forward is to teach this code to walk the hierarchy back to the dax_region. I'll send a patch.

djbw commented 5 years ago

Here is the proposed routine, it walks the device-path looking for 'dax_region':

    int dax_align(char *daxpath, unsigned long *align)
    {
            char path[PATH_MAX];
            struct stat st;
            int rc;

            if (!align)
                    return EINVAL;

            rc = stat(daxpath, &st);
            if (rc)
                    return errno;

            sprintf(path, "/sys/dev/char/%u:%u", major(st.st_rdev), minor(st.st_rdev));
            daxpath = realpath(path, NULL);
            if (!daxpath)
                    return errno;

            strncpy(path, daxpath, sizeof(path) - 1);
            path[sizeof(path) - 1] = '\0';

            while (path[0] == '/') {
                    char *pos = strrchr(path, '/');
                    char buf[40];
                    int fd, len;

                    if (!pos)
                            break;
                    *pos = '\0';
                    len = strlen(path);

                    snprintf(&path[len], sizeof(path) - len, "/dax_region/align");
                    fd = open(path, O_RDONLY);
                    *pos = '\0';

                    if (fd < 0)
                            continue;

                    rc = read(fd, buf, sizeof(buf));
                    close(fd);

                    if (rc < 0)
                            return errno;
                    *align = strtoul(buf, NULL, 0);
                    return 0;
            }

            return ENOENT;
    }
Greg091 commented 5 years ago

Verified on: 1.6-114-g7a39dd66e