pmem / issues

Old issues repo for PMDK.
http://pmem.io

RUNTESTS: stopping: pmempool_create/TEST10 failed, TEST=check FS=non-pmem BUILD=debug #897

Closed zhijianli88 closed 6 years ago

zhijianli88 commented 6 years ago

The failure nvml-unit-tests.pmempool_create_TEST10_non-pmem_debug.fail has been occurring on 0Day since we recently started building nvml with NDCTL_ENABLE=y (previously it was unset). After digging into the test case, we found that it exits at line 70 below on v4.17-rc5.

lizhijian@inn:/c/repo/nvml/src/test/pmempool_create$ vim TEST10
 63 create_poolset $POOLSET AUTO:$FULLDEV:x
 64
 65 expect_normal_exit $PMEMPOOL$EXESUFFIX rm $POOLSET
 66
 67 # inject bad block: OFF=11 LEN=12
 68 ndctl_inject_error $NAMESPACE 11 12
 69
 70 expect_abnormal_exit $PMEMPOOL$EXESUFFIX create obj --layout pmempool$SUFFIX $POOLSET &> $LOG       <<<<=== fails here
 71
 72 ndctl_nfit_test_fini

Have you seen similar issues? If I missed something when building or running the tests, please let me know. If you need more details or logs, let me know as well.

ldorau commented 6 years ago

Hi @zhijianli88, I suspect that ndctl failed to inject bad blocks (in the line number 68 in this snippet). Please test https://github.com/pmem/pmdk/pull/3046 and let me know the result and post the logs. This PR does not fix anything, it will just help us to debug this issue.

zhijianli88 commented 6 years ago

root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# ./RUNTESTS -f non-pmem pmempool_create ...

pmempool_create/TEST9: PASS
pmempool_create/TEST9: SETUP (check/non-pmem/static-nondebug)
pmempool_create/TEST9: PASS
pmempool_create/TEST10: SETUP (check/non-pmem/debug)
Error: ndctl failed to inject or retain bad blocks
RUNTESTS: stopping: pmempool_create/TEST10 failed, TEST=check FS=non-pmem BUILD=debug
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# cat pmempool_create/out
cat: pmempool_create/out: No such file or directory
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# cat pmempool_create/out10.log
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test#

After adding set -x to the script:

pmempool_create/TEST10: SETUP (check/non-pmem/debug)

marcinslusarz commented 6 years ago

ndctl version?

zhijianli88 commented 6 years ago

root@lkp-hsw-ep4 ~# ndctl --version
60.25.g6b0d7dd

ldorau commented 6 years ago

Please post the pmempool_create/prep10.log and pmempool_create/out10.log log files.

zhijianli88 commented 6 years ago

root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# cat pmempool_create/prep10.log
disabled 0 regions
disabled 8 regions
zeroed 6 nmems
enabled 8 regions
{
  "dev":"namespace0.0",
  "mode":"devdax",
  "map":"dev",
  "size":30412800,
  "uuid":"610d117f-eb1c-4560-9934-283e4ab44c9a",
  "raw_uuid":"2dcfe1c1-38b2-4d38-81fe-9ea6b035ddad",
  "chardev":"dax0.0"
}

ldorau commented 6 years ago

It should look like this:

disabled 4 regions
disabled 12 regions
zeroed 4 nmems
enabled 12 regions
{
  "dev":"namespace8.0",
  "mode":"devdax",
  "map":"dev",
  "size":29364224,
  "uuid":"13d11f54-1813-4745-b871-5772f8539824",
  "raw_uuid":"513638f0-a895-45e4-a922-81473fdf004b",
  "chardev":"dax8.0",
  "badblock_count":1,
  "badblocks":[
    {
      "offset":11,
      "length":1,
      "dimms":[
        "nmem1"
      ]
    }
  ]
}
disabled 12 regions

So the part:

  "badblock_count":1,
  "badblocks":[
    {
      "offset":11,
      "length":1,
      "dimms":[
        "nmem1"
      ]
    }

is missing from your log, which means that ndctl failed to inject the bad blocks. Now the question is why.
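The comparison above can be automated. The following is a minimal sketch, assuming the prep log contains the namespace JSON exactly as printed in this thread; `has_badblocks` is a hypothetical helper name and is not part of the PMDK test framework.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from PMDK's unittest.sh): succeed only if the
# given prep log reports a non-zero "badblock_count" field.
has_badblocks() {
    local log=$1
    # In the logs shown in this thread the field is present only when the
    # injection worked, so a missing field counts as "no bad blocks".
    grep -Eq '"badblock_count":[[:space:]]*[1-9][0-9]*' "$log"
}
```

For example, `has_badblocks pmempool_create/prep10.log || echo "bad blocks were not injected"` would flag the failing log above.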

ldorau commented 6 years ago

Do the util_badblock/TEST[2-9] tests succeed?

zhijianli88 commented 6 years ago

root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# ./RUNTESTS -f non-pmem util_badblock/
util_badblock/TEST0: SETUP (check/non-pmem/debug)
util_badblock/TEST0: PASS
util_badblock/TEST0: SETUP (check/non-pmem/nondebug)
util_badblock/TEST0: PASS
util_badblock/TEST0: SETUP (check/non-pmem/static-debug)
util_badblock/TEST0: PASS
util_badblock/TEST0: SETUP (check/non-pmem/static-nondebug)
util_badblock/TEST0: PASS
util_badblock/TEST1: SKIP DEVICE_DAX_PATH does not specify enough dax devices (min: 1)
util_badblock/TEST1: SKIP DEVICE_DAX_PATH does not specify enough dax devices (min: 1)
util_badblock/TEST1: SKIP DEVICE_DAX_PATH does not specify enough dax devices (min: 1)
util_badblock/TEST1: SKIP DEVICE_DAX_PATH does not specify enough dax devices (min: 1)
util_badblock/TEST2: SETUP (check/non-pmem/debug)
util_badblock/TEST2: PASS
util_badblock/TEST2: SETUP (check/non-pmem/nondebug)
util_badblock/TEST2: PASS
util_badblock/TEST2: SETUP (check/non-pmem/static-debug)
util_badblock/TEST2: PASS
util_badblock/TEST2: SETUP (check/non-pmem/static-nondebug)
util_badblock/TEST2: PASS
util_badblock/TEST3: SETUP (check/non-pmem/debug)
Error: ndctl failed to inject or retain bad blocks
RUNTESTS: stopping: util_badblock//TEST3 failed, TEST=check FS=non-pmem BUILD=debug

root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# cat util_badblock/out3.log
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# cat util_badblock/prep3.log
disabled 0 regions
disabled 8 regions
zeroed 6 nmems
enabled 8 regions
{
  "dev":"namespace0.0",
  "mode":"devdax",
  "map":"dev",
  "size":30412800,
  "uuid":"ef964d96-f0d4-4b7d-9ea9-d8bb5c2b8fd2",
  "raw_uuid":"54696d0a-f877-4dc8-8ff6-bdc4817b6a24",
  "chardev":"dax0.0"
}
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# uname -a
Linux lkp-hsw-ep4 4.18.0-rc1 #1 SMP Tue Jul 3 13:15:38 CST 2018 x86_64 GNU/Linux
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# cat /proc/cmdline
ip=::::lkp-hsw-ep4::dhcp root=/dev/ram0 user=lkp job=/lkp/scheduled/lkp-hsw-ep4/nvml-unit-tests-pmempool-non-pmem-debian-x86_64-2018-04-03.cgz-ce397d215ccd07b8ae3f71db689aedb85d56ab40-20180703-70034-34hb38-0.yaml ARCH=x86_64 kconfig=x86_64-rhel-7.2 branch=linus/master commit=ce397d215ccd07b8ae3f71db689aedb85d56ab40 BOOT_IMAGE=/pkg/linux/x86_64-rhel-7.2/gcc-7/ce397d215ccd07b8ae3f71db689aedb85d56ab40/vmlinuz-4.18.0-rc1 max_uptime=1230 RESULT_ROOT=/result/nvml-unit-tests/pmempool-non-pmem/lkp-hsw-ep4/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/ce397d215ccd07b8ae3f71db689aedb85d56ab40/e9b36bc73846a7b4199318898fe65b035bd451d6/6 LKP_SERVER=inn debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 drbd.minor_count=8 systemd.log_level=err ignore_loglevel console=tty0 earlyprintk=ttyS0,115200 console=ttyS0,115200 vga=normal rw
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test#

ldorau commented 6 years ago

Do the following commands:

$ sudo modprobe nfit_test
$ lsmod | grep nfit_test

succeed on your machine and what is the output?
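Note that `lsmod | grep nfit_test` also matches `nfit_test_iomap` and any module that lists `nfit_test` as a user. A stricter check might look like the sketch below; `nfit_test_loaded` is a hypothetical helper that reads `lsmod`-style output on stdin and matches only the module-name column.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit 0 only if a module named exactly "nfit_test"
# appears in the first column of lsmod-style input on stdin.
nfit_test_loaded() {
    awk '$1 == "nfit_test" { found = 1 } END { exit !found }'
}
```

It could be used as `lsmod | nfit_test_loaded && echo "nfit_test is loaded"`.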

zhijianli88 commented 6 years ago

root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# lsmod | grep nfit
nfit_test 36864 8
nd_pmem 20480 1 nfit_test
nfit 61440 1 nfit_test
device_dax 20480 2 dax_pmem,nfit_test
libnvdimm 163840 6 dax_pmem,nfit_test,nd_btt,nd_pmem,nd_blk,nfit
nfit_test_iomap 24576 6 dax_pmem,nfit_test,device_dax,nd_pmem,libnvdimm,nfit

zhijianli88 commented 6 years ago

root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# cat testconfig.sh
NON_PMEM_FS_DIR=/tmp/tmp.KckJQLjz1P
PMEM_FS_DIR=/fs/pmem0
NODE[0]=127.0.0.1
NODE_WORKING_DIR[0]=/tmp/node0
NODE_ADDR[0]=127.0.0.1
NODE_ENV[0]="PMEM_IS_PMEM_FORCE=1"
NODE[1]=127.0.0.1
NODE_WORKING_DIR[1]=/tmp/node1
NODE_ADDR[1]=127.0.0.1
NODE_ENV[1]="PMEM_IS_PMEM_FORCE=1"
NODE[2]=127.0.0.1
NODE_WORKING_DIR[2]=/tmp/node2
NODE_ADDR[2]=127.0.0.1
NODE_ENV[2]="PMEM_IS_PMEM_FORCE=1"
NODE[3]=127.0.0.1
NODE_WORKING_DIR[3]=/tmp/node3
NODE_ADDR[3]=127.0.0.1
NODE_ENV[3]="PMEM_IS_PMEM_FORCE=1"
TEST_PROVIDERS=sockets
RPMEM_VALGRIND_ENABLED=y
PMEM_FS_DIR_FORCE_PMEM=1
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# mount
rootfs on / type rootfs (rw)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
devtmpfs on /dev type devtmpfs (rw,nosuid,size=65623640k,nr_inodes=16405910,mode=755)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,nosuid,nodev,noexec,relatime,cpu)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=35,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=51333)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
mqueue on /dev/mqueue type mqueue (rw,relatime)
configfs on /sys/kernel/config type configfs (rw,relatime)
tmp on /tmp type tmpfs (rw,relatime)
inn:/result on /inn/result type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.1,mountvers=3,mountport=42102,mountproto=udp,local_lock=none,addr=192.168.1.1)
inn:/pkg on /pkg type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.1,mountvers=3,mountport=42102,mountproto=udp,local_lock=none,addr=192.168.1.1)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=13191772k,mode=700)

ldorau commented 6 years ago

It looks like injecting errors by ndctl does not work on your machine with ndctl v60.25.g6b0d7dd and kernel v4.18.0-rc1. I will check it.

ldorau commented 6 years ago

Where did you get this version of ndctl (v60.25.g6b0d7dd)? There is neither such a tag (v60.25) nor such a commit ID (6b0d7dd) in the ndctl git tree.

zhijianli88 commented 6 years ago

We used the pending branch: https://github.com/pmem/ndctl --branch pending

ldorau commented 6 years ago

Please test the latest stable release v61.2 and check if the results are the same.

zhijianli88 commented 6 years ago

It looks like the latest release tag is v61.1:

root@lkp-nex04 ~/ndctl# git tag | grep v61
v61
v61.1

zhijianli88 commented 6 years ago

Still fails:

pmempool_create/TEST10: SETUP (check/non-pmem/debug)
Error: ndctl failed to inject or retain bad blocks
RUNTESTS: stopping: pmempool_create/TEST10 failed, TEST=check FS=non-pmem BUILD=debug
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# ndctl --version
61.1

ldorau commented 6 years ago

Could you test the stable kernel? The latest stable is 4.17.4.

zhijianli88 commented 6 years ago

It doesn't work

pmempool_create/TEST9: SETUP (check/non-pmem/static-debug)
pmempool_create/TEST9: PASS
pmempool_create/TEST9: SETUP (check/non-pmem/static-nondebug)
pmempool_create/TEST9: PASS
pmempool_create/TEST10: SETUP (check/non-pmem/debug)
Error: ndctl failed to inject or retain bad blocks
RUNTESTS: stopping: pmempool_create/TEST10 failed, TEST=check FS=non-pmem BUILD=debug
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# ndctl --version
60.1
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# cat pmempool_create/pmempool10.log
<libpmempool>: <1> [out.c:236 out_init] pid 33189: program: /lkp/benchmarks/nvml-unit-tests/src/tools/pmempool/pmempool
<libpmempool>: <1> [out.c:238 out_init] libpmempool version 1.3
<libpmempool>: <1> [out.c:242 out_init] src version: 1.4-rc4-606-ge9b36bc73846
<libpmempool>: <1> [out.c:250 out_init] compiled with support for Valgrind pmemcheck
<libpmempool>: <1> [out.c:255 out_init] compiled with support for Valgrind helgrind
<libpmempool>: <1> [out.c:260 out_init] compiled with support for Valgrind memcheck
<libpmempool>: <1> [out.c:265 out_init] compiled with support for Valgrind drd
<libpmempool>: <3> [mmap.c:66 util_mmap_init]
<libpmempool>: <3> [libpmempool.c:69 libpmempool_init]
<libpmempool>: <3> [set.c:121 util_remote_init]
<libpmempool>: <3> [libpmempool.c:85 libpmempool_fini]
<libpmempool>: <3> [set.c:191 util_remote_unload]
<libpmempool>: <3> [set.c:136 util_remote_fini]
<libpmempool>: <3> [set.c:191 util_remote_unload]
<libpmempool>: <3> [mmap.c:100 util_mmap_fini]
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# cat pmempool_create/prep10.log
disabled 8 regions
disabled 8 regions
zeroed 6 nmems
enabled 8 regions
{
  "dev":"namespace0.0",
  "mode":"devdax",
  "size":30674944,
  "uuid":"839707f2-afb5-48c0-a880-976a172896ae",
  "raw_uuid":"d748be2f-d32a-47fd-9520-74bda3214d5d",
  "chardev":"dax0.0"
}
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# uname -a
Linux lkp-hsw-ep4 4.17.4 #1 SMP Wed Jul 4 12:46:41 CST 2018 x86_64 GNU/Linux
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 72
On-line CPU(s) list: 0-71
Thread(s) per core: 2
Core(s) per socket: 18
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
Stepping: 2
CPU MHz: 3599.165
CPU max MHz: 3600.0000
CPU min MHz: 1200.0000
BogoMIPS: 4589.48
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 46080K
NUMA node0 CPU(s): 0-17,36-53
NUMA node1 CPU(s): 18-35,54-71
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti intel_ppin ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test/pmempool_create# cat pmem10.log
<libpmem>: <1> [out.c:236 out_init] pid 19469: program: /lkp/benchmarks/nvml-unit-tests/src/tools/pmempool/pmempool
<libpmem>: <1> [out.c:238 out_init] libpmem version 1.1
<libpmem>: <1> [out.c:242 out_init] src version: 1.4-rc4-606-ge9b36bc73846
<libpmem>: <1> [out.c:250 out_init] compiled with support for Valgrind pmemcheck
<libpmem>: <1> [out.c:255 out_init] compiled with support for Valgrind helgrind
<libpmem>: <1> [out.c:260 out_init] compiled with support for Valgrind memcheck
<libpmem>: <1> [out.c:265 out_init] compiled with support for Valgrind drd
<libpmem>: <3> [mmap.c:66 util_mmap_init]
<libpmem>: <3> [libpmem.c:56 libpmem_init]
<libpmem>: <3> [pmem.c:712 pmem_init]
<libpmem>: <3> [init.c:419 pmem_init_funcs]
<libpmem>: <3> [init.c:368 pmem_cpuinfo_to_funcs]
<libpmem>: <3> [init.c:372 pmem_cpuinfo_to_funcs] clflush supported
<libpmem>: <3> [init.c:281 use_avx_memcpy_memset] avx supported
<libpmem>: <3> [init.c:285 use_avx_memcpy_memset] PMEM_AVX not set or not == 1
<libpmem>: <3> [pmem.c:216 pmem_has_auto_flush]
<libpmem>: <3> [os_auto_flush_linux.c:106 check_domain_in_region] region_path: /sys/bus/nd/devices/region6
<libpmem>: <3> [init.c:472 pmem_init_funcs] Flushing CPU cache
<libpmem>: <3> [init.c:487 pmem_init_funcs] using clflush
<libpmem>: <3> [init.c:501 pmem_init_funcs] using movnt SSE2
<libpmem>: <3> [pmem_posix.c:104 pmem_os_init]
<libpmem>: <3> [libpmem.c:69 libpmem_fini]
<libpmem>: <3> [mmap.c:100 util_mmap_fini]

ldorau commented 6 years ago

Thanks. I will test ndctl v61.1 with kernel v4.17.4 and v4.18-rc3.

ldorau commented 6 years ago

One remark. I see you have ndctl version v60.1. Please test kernel v4.17.4 with ndctl v61.1

zhijianli88 commented 6 years ago

kernel v4.17.4 + ndctl v61.1

root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test/pmempool_create# cat pmem10.log 
<libpmem>: <1> [out.c:236 out_init] pid 24423: program: /lkp/benchmarks/nvml-unit-tests/src/tools/pmempool/pmempool
<libpmem>: <1> [out.c:238 out_init] libpmem version 1.1
<libpmem>: <1> [out.c:242 out_init] src version: 1.4-rc4-606-ge9b36bc73846
<libpmem>: <1> [out.c:250 out_init] compiled with support for Valgrind pmemcheck
<libpmem>: <1> [out.c:255 out_init] compiled with support for Valgrind helgrind
<libpmem>: <1> [out.c:260 out_init] compiled with support for Valgrind memcheck
<libpmem>: <1> [out.c:265 out_init] compiled with support for Valgrind drd
<libpmem>: <3> [mmap.c:66 util_mmap_init] 
<libpmem>: <3> [libpmem.c:56 libpmem_init] 
<libpmem>: <3> [pmem.c:712 pmem_init] 
<libpmem>: <3> [init.c:419 pmem_init_funcs] 
<libpmem>: <3> [init.c:368 pmem_cpuinfo_to_funcs] 
<libpmem>: <3> [init.c:372 pmem_cpuinfo_to_funcs] clflush supported
<libpmem>: <3> [init.c:281 use_avx_memcpy_memset] avx supported
<libpmem>: <3> [init.c:285 use_avx_memcpy_memset] PMEM_AVX not set or not == 1
<libpmem>: <3> [pmem.c:216 pmem_has_auto_flush] 
<libpmem>: <3> [os_auto_flush_linux.c:106 check_domain_in_region] region_path: /sys/bus/nd/devices/region6
<libpmem>: <3> [init.c:472 pmem_init_funcs] Flushing CPU cache
<libpmem>: <3> [init.c:487 pmem_init_funcs] using clflush
<libpmem>: <3> [init.c:501 pmem_init_funcs] using movnt SSE2
<libpmem>: <3> [pmem_posix.c:104 pmem_os_init] 
<libpmem>: <3> [libpmem.c:69 libpmem_fini] 
<libpmem>: <3> [mmap.c:100 util_mmap_fini] 
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test/pmempool_create# cat pmempool10.log 
<libpmempool>: <1> [out.c:236 out_init] pid 24423: program: /lkp/benchmarks/nvml-unit-tests/src/tools/pmempool/pmempool
<libpmempool>: <1> [out.c:238 out_init] libpmempool version 1.3
<libpmempool>: <1> [out.c:242 out_init] src version: 1.4-rc4-606-ge9b36bc73846
<libpmempool>: <1> [out.c:250 out_init] compiled with support for Valgrind pmemcheck
<libpmempool>: <1> [out.c:255 out_init] compiled with support for Valgrind helgrind
<libpmempool>: <1> [out.c:260 out_init] compiled with support for Valgrind memcheck
<libpmempool>: <1> [out.c:265 out_init] compiled with support for Valgrind drd
<libpmempool>: <3> [mmap.c:66 util_mmap_init] 
<libpmempool>: <3> [libpmempool.c:69 libpmempool_init] 
<libpmempool>: <3> [set.c:121 util_remote_init] 
<libpmempool>: <3> [libpmempool.c:85 libpmempool_fini] 
<libpmempool>: <3> [set.c:191 util_remote_unload] 
<libpmempool>: <3> [set.c:136 util_remote_fini] 
<libpmempool>: <3> [set.c:191 util_remote_unload] 
<libpmempool>: <3> [mmap.c:100 util_mmap_fini] 
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test/pmempool_create# ndctl --version
61.1
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test/pmempool_create# uname -a
Linux lkp-hsw-ep4 4.17.4 #1 SMP Wed Jul 4 12:46:41 CST 2018 x86_64 GNU/Linux
root@lkp-hsw-ep4 /lkp/benchmarks/nvml-unit-tests/src/test/pmempool_create# cat prep10.log 
disabled 0 regions
disabled 8 regions
zeroed 6 nmems
enabled 8 regions
{
  "dev":"namespace0.0",
  "mode":"devdax",
  "map":"dev",
  "size":29364224,
  "uuid":"0e376ac8-369d-42bc-a20f-4e58e0970a54",
  "raw_uuid":"55d81f22-2497-4f5b-b906-7ebe1529913f",
  "chardev":"dax0.0"
}

ldorau commented 6 years ago

I confirm that injecting bad blocks in the nfit_test module does not work with kernel v4.17.4 + ndctl v61.1. So this is an external bug. I will submit a bug report. Please use kernel v4.16 as a workaround for this bug.
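A test harness could gate the bad-block tests on the running kernel. The sketch below assumes GNU `sort -V` is available; `kernel_at_least` is a hypothetical helper, and the 4.17 threshold is only what this thread suggests (failures on v4.17-rc5, v4.17.4 and v4.18.0-rc1, success on v4.16), not a documented boundary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: kernel_at_least MIN VER
# Succeeds when VER sorts at or after MIN under GNU version sort.
kernel_at_least() {
    local min=$1 ver=$2
    [ "$(printf '%s\n' "$min" "$ver" | sort -V | head -n 1)" = "$min" ]
}
```

For example, `kernel_at_least 4.17 "$(uname -r)" && echo "nfit_test bad-block injection may not work as before"`.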

zhijianli88 commented 6 years ago

Great, it works for me on v4.16

pmempool_create/TEST9: PASS
pmempool_create/TEST9: SETUP (check/non-pmem/static-debug)
pmempool_create/TEST9: PASS
pmempool_create/TEST9: SETUP (check/non-pmem/static-nondebug)
pmempool_create/TEST9: PASS
pmempool_create/TEST10: SETUP (check/non-pmem/debug)
pmempool_create/TEST10: PASS
pmempool_create/TEST10: SETUP (check/non-pmem/nondebug)
pmempool_create/TEST10: PASS
pmempool_create/TEST11: SETUP (check/non-pmem/debug)
pmempool_create/TEST11: PASS
pmempool_create/TEST11: SETUP (check/non-pmem/nondebug)
pmempool_create/TEST11: PASS
pmempool_create/TEST12: SETUP (check/non-pmem/debug)
pmempool_create/TEST12: PASS
pmempool_create/TEST12: SETUP (check/non-pmem/nondebug)
pmempool_create/TEST12: PASS

djbw commented 6 years ago

Sorry about the thrash. We overhauled ARS handling between 4.16 and 4.17, and one of the casualties was "inject-error --notify" support in nfit_test. Some more details here.

For now, you need to run ndctl start-scrub; ndctl wait-scrub; after injecting errors on nfit_test to get them to appear in the badblocks. We're investigating how to restore "--notify" support, but it might not be implemented for one or more releases.
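The workaround described above can be sketched as a small wrapper. This is an illustration only: `inject_and_scrub` and the overridable `NDCTL` variable are hypothetical names, and the `--block`/`--count` options are the `inject-error` options as I understand them, so check `ndctl inject-error --help` for your version.

```shell
#!/usr/bin/env bash
# NDCTL may be overridden, e.g. NDCTL="sudo ndctl"; defaults to plain ndctl.
NDCTL=${NDCTL:-ndctl}

# Hypothetical wrapper: inject_and_scrub NAMESPACE BLOCK COUNT
# Injects the error, then starts a scrub and waits for it to finish so that
# the injected blocks show up in the badblocks list.
inject_and_scrub() {
    local namespace=$1 block=$2 count=$3
    $NDCTL inject-error --block="$block" --count="$count" "$namespace" &&
    $NDCTL start-scrub &&
    $NDCTL wait-scrub
}
```

With the values from TEST10 this would be called as `inject_and_scrub "$NAMESPACE" 11 12`; each step runs only if the previous one succeeded.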

ldorau commented 6 years ago

@zhijianli88 The commit https://github.com/pmem/pmdk/commit/439d0d0ce0d646097e1a0664c855e5d4bc9f84a8 has been merged upstream. Please verify and close this issue if it is fixed.

zhijianli88 commented 6 years ago

verified