pmem / ndctl

A "device memory" enabling project encompassing tools and libraries for CXL, NVDIMMs, DAX, memory tiering and other platform memory device topics.

Illegal instruction when I want to numactl --cpubind=0 --membind=1 to CXL Memory #249

Open Yemaoxin opened 1 year ago

Yemaoxin commented 1 year ago

I ran QEMU to simulate CXL DRAM, and when I tried to run numactl --cpubind=0 --membind=1 ./test I got "Illegal instruction".

Here is my config:
QEMU: 8.0.50
Linux kernel: 6.3.7

This is the output of numactl -H. NUMA node 1 is CXL memory created via daxctl.

[image]

Is it due to a missing kernel compilation option?

sscargal commented 1 year ago

Q) What is your ./test program doing? That's likely where the Illegal instruction originates.
Q) Do you see the error without using numactl?
Q) Do you see any errors or useful information in dmesg?
Q) Is there an application or kernel crash/stack?

Yemaoxin commented 1 year ago

There is no problem in my test program. Even if I replace it with the Linux ls command, I still get Illegal instruction. There was, however, one error in dmesg.

root@8003:~# dmesg | grep error
[ 1.718989] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2

Another problem: when I run the daxctl command, it reports a failure.

root@8003:~# daxctl reconfigure-device dax0.0 --mode=system-ram
dax0.0 was already in system-ram mode
[ 538.292896] Fallback order for Node 0: 0 1
[ 538.292908] Fallback order for Node 1: 1 0
[ 538.292912] Built 2 zonelists, mobility grouping on. Total pages: 8214577
[ 538.298909] Policy zone: Normal
libdaxctl: memblock_find_zone: dax0.0: Failed to read /sys/devices/system/node/node1/memory753/valid_zones: No such file or directory
[
  {
    "chardev":"dax0.0",
    "size":34359738368,
    "target_node":1,
    "align":2097152,
    "mode":"system-ram",
    "online_memblocks":256,
    "total_memblocks":256
  }
]
reconfigured 1 device

After running the daxctl command, I can see NUMA node 1 in numactl -H.

root@8003:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
node 0 size: 32090 MB
node 0 free: 31325 MB
node 1 cpus:
node 1 size: 32768 MB
node 1 free: 32768 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10
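
For anyone reproducing this setup, the numactl -H output above shows node 1 with memory but no CPUs, which is why the failing command pairs --cpubind=0 with --membind=1. The sketch below is one way to pick such memory-only nodes out of saved numactl -H text. The function names and the shortened CPU list in SAMPLE are illustrative assumptions, and the parser only handles the line layout shown in this thread.

```python
import re

def parse_numactl_hardware(text):
    """Parse `numactl -H` text into {node_id: {"cpus": [...], "size_mb": int}}.

    Assumes the line layout shown in this thread; other numactl
    versions may format their output differently.
    """
    nodes = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"node (\d+) cpus:(.*)$", line)
        if m:
            cpus = [int(c) for c in m.group(2).split()]
            nodes.setdefault(int(m.group(1)), {})["cpus"] = cpus
            continue
        m = re.match(r"node (\d+) size: (\d+) MB$", line)
        if m:
            nodes.setdefault(int(m.group(1)), {})["size_mb"] = int(m.group(2))
    return nodes

def memory_only_nodes(nodes):
    """Return the sorted IDs of nodes that have memory but no CPUs."""
    return sorted(
        node_id
        for node_id, info in nodes.items()
        if not info.get("cpus") and info.get("size_mb", 0) > 0
    )

# Sample text modeled on the output above (CPU list shortened for brevity).
SAMPLE = """\
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 32090 MB
node 0 free: 31325 MB
node 1 cpus:
node 1 size: 32768 MB
node 1 free: 32768 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10
"""

if __name__ == "__main__":
    print(memory_only_nodes(parse_numactl_hardware(SAMPLE)))
```

A node reported this way can only be used as a memory target; processes bound to it with --membind still execute on CPUs from another node.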

Since my CPU is a Xeon III, not a Xeon IV with CXL support, I'm wondering whether this happens because the CPU doesn't support CXL instructions, or whether the Xeon III can emulate CXL and my settings are simply wrong.

Yemaoxin commented 1 year ago

Even the ls command fails.

root@8003:~# numactl --membind=1 ls
[ 913.975032] traps: ls[667] trap invalid opcode ip:7fdec255d180 sp:7ffd3c507288 error:0 in ld-linux-x86-64.so.2[7fdec2546000+2a000]
Illegal instruction
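
The traps line above carries enough information to locate the faulting instruction relative to the mapped object: subtracting the mapping base (the first number in the brackets) from ip gives an offset into the mapped region of ld-linux-x86-64.so.2. The sketch below does that arithmetic. The regular expression and the function name are assumptions that cover only the message format shown here, and the result is an offset into the mapping, which is not necessarily the same as an offset into the file on disk.

```python
import re

# Pattern for the x86 "traps:" message format shown in this thread (an assumption;
# other kernel versions or architectures may print a different layout).
TRAP_RE = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\] trap (?P<trap>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

def parse_trap_line(line):
    """Extract process, trap type, and faulting offset from a traps: line.

    Returns None if the line does not match. "offset" is None when the
    kernel did not report a mapped object for the faulting address.
    """
    m = TRAP_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    offset = None
    if m.group("base") is not None:
        # Offset of the faulting instruction from the start of the mapping.
        offset = ip - int(m.group("base"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "trap": m.group("trap"),
        "ip": ip,
        "object": m.group("obj"),
        "offset": offset,
    }

LINE = (
    "[ 913.975032] traps: ls[667] trap invalid opcode ip:7fdec255d180 "
    "sp:7ffd3c507288 error:0 in ld-linux-x86-64.so.2[7fdec2546000+2a000]"
)

if __name__ == "__main__":
    info = parse_trap_line(LINE)
    print(info["object"], hex(info["offset"]))  # ld-linux-x86-64.so.2 0x17180
```

Here the fault lands in the dynamic loader rather than in ls itself, which is consistent with every dynamically linked program failing the same way under --membind=1.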

When I bind memory to node 0, the command does not fail.

root@8003:~# numactl --membind=0 ls
ndctl

sscargal commented 1 year ago

Thanks for the update. The ls example is very useful. You should report this issue to the QEMU Developer Community to see if this is a bug or not.

Yemaoxin commented 1 year ago

Thanks. I will close this issue once I upgrade my CPU to a Xeon IV.

zhijianli88 commented 1 year ago

I hit the same issue with my QEMU environment.