intel / idxd-config

Accel-config / libaccel-config

Error running 'accel-config test' and dsa test #54

Open LRL52 opened 4 months ago

LRL52 commented 4 months ago

After building and installing idxd-config, I ran accel-config test, but the test is skipped because the device is reported as active:

# accel-config --version
4.1.5.git5b1b1235
# id
uid=0(root) gid=0(root) groups=0(root)
# uname -a
Linux iZbp14z8p4mu9ezo2ghxz5Z 5.10.134-16.1.al8.x86_64 #1 SMP Thu Dec 7 14:11:24 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
# accel-config test
run test_libaccfg
__accfg_test_skip: explicit skip test_libaccfg:918
device is active, skipping tests
test-libaccfg: SKIP
libaccfg: accfg_unref: context 0x94e2a0 released
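
The skip message says the device is active, so one thing worth trying is to disable the work queue and the device before running the test. The sketch below is a guess at that sequence, not a confirmed fix. `first_state` and `run_test_on_idle_device` are hypothetical helper names, and the `ACCEL_CONFIG` override exists only so the logic can be dry-run without hardware. The `disable-wq`, `disable-device`, `list` and `test` subcommands are the regular accel-config ones.

```shell
#!/bin/sh
# Hypothetical helper: print the first "state" value found in
# `accel-config list` JSON read from stdin. In the listings in this issue
# the device-level "state" comes before any wq-level one.
first_state() {
    sed -n 's/.*"state":"\([a-z]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: if the device is listed as enabled, disable the wq
# and then the device, and finally run the test suite.
# Set ACCEL_CONFIG=echo to dry-run the logic without touching hardware.
run_test_on_idle_device() {
    dev=$1
    wq=$2
    ac=${ACCEL_CONFIG:-accel-config}
    if [ "$($ac list | first_state)" = "enabled" ]; then
        $ac disable-wq "$dev/$wq" || return 1
        $ac disable-device "$dev" || return 1
    fi
    $ac test
}

# Example (needs root and a real device):
#   run_test_on_idle_device dsa0 wq0.0
```

Picking the first "state" entry is a shortcut that relies on the output ordering shown in this issue. A JSON-aware tool would be more robust if one is available.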

If I try to configure DSA following the Intel® Data Streaming Accelerator User Guide, load-config skips dsa0 because it is active, and /dev/dsa/wq0.1 is not created:

# cat contrib/configs/app_profile.conf
[
  {
    "dev":"dsa0",
    "read_buffer_limit":0,
    "groups":[
      {
        "dev":"group0.0",
        "grouped_workqueues":[
          {
            "dev":"wq0.0",
            "mode":"shared",
            "size":8,
            "group_id":0,
            "priority":10,
            "block_on_fault":0,
            "max_batch_size":32,
            "max_transfer_size":16384,
            "type":"user",
            "driver_name":"user",
            "name":"app1",
            "threshold":6
          }
        ],
        "grouped_engines":[
          {
            "dev":"engine0.0",
            "group_id":0
          }
        ]
      },
      {
        "dev":"group0.1",
        "grouped_workqueues":[
          {
            "dev":"wq0.1",
            "mode":"shared",
            "size":32,
            "group_id":1,
            "priority":10,
            "block_on_fault":0,
            "max_batch_size":32,
            "max_transfer_size":2097152,
            "type":"user",
            "driver_name":"user",
            "name":"app2",
            "threshold":28
          }
        ],
        "grouped_engines":[
          {
            "dev":"engine0.1",
            "group_id":1
          }
        ]
      }
    ]
  }
]
# accel-config load-config -c contrib/configs/app_profile.conf -e
dsa0 is active. Skipping...
# accel-config list
[
  {
    "dev":"dsa0",
    "read_buffer_limit":0,
    "max_groups":4,
    "max_work_queues":8,
    "max_engines":4,
    "work_queue_size":128,
    "numa_node":0,
    "gen_cap":"0x40915f0107",
    "version":"0x100",
    "state":"enabled",
    "max_read_buffers":96,
    "max_batch_size":1024,
    "max_transfer_size":2147483648,
    "configurable":1,
    "pasid_enabled":0,
    "cdev_major":237,
    "clients":0,
    "groups":[
      {
        "dev":"group0.0",
        "read_buffers_reserved":0,
        "use_read_buffer_limit":0,
        "read_buffers_allowed":0,
        "grouped_workqueues":[
          {
            "dev":"wq0.0",
            "mode":"dedicated",
            "size":128,
            "group_id":0,
            "priority":10,
            "block_on_fault":0,
            "max_batch_size":32,
            "max_transfer_size":2097152,
            "cdev_minor":0,
            "type":"user",
            "name":"app0",
            "driver_name":"user",
            "threshold":0,
            "ats_disable":0,
            "state":"enabled",
            "clients":0
          }
        ],
        "grouped_engines":[
          {
            "dev":"engine0.0",
            "group_id":0
          },
          {
            "dev":"engine0.1",
            "group_id":0
          },
          {
            "dev":"engine0.2",
            "group_id":0
          },
          {
            "dev":"engine0.3",
            "group_id":0
          }
        ]
      },
      {
        "dev":"group0.1",
        "read_buffers_reserved":0,
        "use_read_buffer_limit":0,
        "read_buffers_allowed":0
      },
      {
        "dev":"group0.2",
        "read_buffers_reserved":0,
        "use_read_buffer_limit":0,
        "read_buffers_allowed":0
      },
      {
        "dev":"group0.3",
        "read_buffers_reserved":0,
        "use_read_buffer_limit":0,
        "read_buffers_allowed":0
      }
    ]
  }
]

# ls -la /dev/dsa/wq0.0 /dev/dsa/wq0.1
ls: cannot access '/dev/dsa/wq0.1': No such file or directory
crw------- 1 root root 237, 0 Feb 28 21:41 /dev/dsa/wq0.0
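
Apart from the "dsa0 is active" skip, the profile asks for shared work queues while the listing reports "pasid_enabled":0. My understanding, which should be checked against the idxd driver documentation, is that shared work queues depend on PASID support, so the profile may be rejected even after the device is disabled. The sketch below only compares the two values before a load is attempted. `config_wants_shared`, `pasid_enabled` and `check_profile` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the config file requests any shared wq.
config_wants_shared() {
    grep -q '"mode"[[:space:]]*:[[:space:]]*"shared"' "$1"
}

# Hypothetical helper: print the "pasid_enabled" value from
# `accel-config list` JSON read from stdin.
pasid_enabled() {
    sed -n 's/.*"pasid_enabled":\([0-9]*\).*/\1/p' | head -n 1
}

# Warn (and return non-zero) when a profile with shared wqs is about to be
# loaded onto a device that reports pasid_enabled = 0.
check_profile() {
    conf=$1
    pasid=$2
    if config_wants_shared "$conf" && [ "$pasid" = "0" ]; then
        echo "warning: $conf requests shared wqs but pasid_enabled is 0"
        return 1
    fi
    echo "ok: $conf"
}

# Example on real hardware:
#   check_profile contrib/configs/app_profile.conf \
#       "$(accel-config list | pasid_enabled)"
```

If the check warns, switching the profile's work queues to "dedicated" mode would be one experiment to try. Whether PASID can be enabled at all on this system depends on kernel and platform settings that this issue does not show.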

If I try to run intel/dsa_perf_micros, the kernel reports an unknown NMI and the test times out:

# cd ~/dsa_perf_micros/
# ./scripts/setup_dsa.sh -d dsa0 -w 1 -m d -e 4
disabled dsa0/wq0.0
disabled dsa0
enabled 1 device(s) out of 1
enabled 1 wq(s) out of 1
# ./src/dsa_perf_micros -n128 -s4k -j -c -f -i1000 -k5 -w0 -zF,F -o3
./src/dsa_perf_micros -n128 -s4k -j -c -f -i1000 -k5 -w0 -zF,F -o3
-j option is deprecated (default behavior)
blen                       4096
bstride                    4096
bstride                    4096
nb_bufs                     128
pg_size                       0
wq_type                       0
batch_sz                      1
iter                       1000
nb_cpus                       1
var_mmio                      1
dma                           1
verify                        1
misc_flags                    0
access_op[0]               Write
access_op[1]               Write
place_op[0]              Memory
place_op[1]              Memory
flags_cmask            ffffffff
flags_smask                   0
flags_nth_desc                1
nb_numa_node                  2
cpu_desc_work                 0
Memory affinity
CPUs in node 0:         -1 -1
Buffer Offsets          0 0

Message from syslogd@iZbp14z8p4mu9ezo2ghxz5Z at Feb 28 21:53:58 ...
 kernel:Uhhuh. NMI received for unknown reason 20 on CPU 0.

Message from syslogd@iZbp14z8p4mu9ezo2ghxz5Z at Feb 28 21:53:58 ...
 kernel:Do you have a strange power saving mode enabled?

Message from syslogd@iZbp14z8p4mu9ezo2ghxz5Z at Feb 28 21:53:58 ...
 kernel:Dazed and confused, but trying to continue
dsa_perf_micros: poll_comp_common: timed out
dsa_perf_micros: check_comp: desc[0] timed out
desc addr: 0x7f39d4efa000
desc[0]: 0x0300010c00000000
desc[1]: 0x00007f39d4efc000
desc[2]: 0x00007f39d4efe000
desc[3]: 0x00007f39d4f7e000
desc[4]: 0x0000000000001000
desc[5]: 0x0000000000000000
desc[6]: 0x0000000000000000
desc[7]: 0x0000000000000000
dsa_perf_micros: main: test run failed
# accel-config list
[
  {
    "dev":"dsa0",
    "read_buffer_limit":0,
    "max_groups":4,
    "max_work_queues":8,
    "max_engines":4,
    "work_queue_size":128,
    "numa_node":0,
    "gen_cap":"0x40915f0107",
    "version":"0x100",
    "state":"enabled",
    "max_read_buffers":96,
    "max_batch_size":1024,
    "max_transfer_size":2147483648,
    "configurable":1,
    "pasid_enabled":0,
    "cdev_major":237,
    "clients":0,
    "groups":[
      {
        "dev":"group0.0",
        "read_buffers_reserved":0,
        "use_read_buffer_limit":0,
        "read_buffers_allowed":0,
        "grouped_workqueues":[
          {
            "dev":"wq0.0",
            "mode":"dedicated",
            "size":128,
            "group_id":0,
            "priority":10,
            "block_on_fault":0,
            "max_batch_size":32,
            "max_transfer_size":2097152,
            "cdev_minor":0,
            "type":"user",
            "name":"app0",
            "driver_name":"user",
            "threshold":0,
            "ats_disable":0,
            "state":"enabled",
            "clients":0
          }
        ],
        "grouped_engines":[
          {
            "dev":"engine0.0",
            "group_id":0
          },
          {
            "dev":"engine0.1",
            "group_id":0
          },
          {
            "dev":"engine0.2",
            "group_id":0
          },
          {
            "dev":"engine0.3",
            "group_id":0
          }
        ]
      },
      {
        "dev":"group0.1",
        "read_buffers_reserved":0,
        "use_read_buffer_limit":0,
        "read_buffers_allowed":0
      },
      {
        "dev":"group0.2",
        "read_buffers_reserved":0,
        "use_read_buffer_limit":0,
        "read_buffers_allowed":0
      },
      {
        "dev":"group0.3",
        "read_buffers_reserved":0,
        "use_read_buffer_limit":0,
        "read_buffers_allowed":0
      }
    ]
  }
]
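
To help read the descriptor dump from the timed-out run, here is a small sketch that splits desc[0] into its fields. It assumes the dsa_hw_desc layout from the kernel's linux/idxd.h header (pasid in bits 19:0, priv in bit 31, flags in bits 55:32, opcode in bits 63:56), so that layout should be confirmed against the header. `decode_desc0` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: split the first 64-bit descriptor word into fields,
# assuming the layout pasid[19:0], priv[31], flags[55:32], opcode[63:56].
decode_desc0() {
    d=$(( $1 ))
    printf 'opcode=0x%02x flags=0x%06x priv=%d pasid=%d\n' \
        $(( (d >> 56) & 0xff )) \
        $(( (d >> 32) & 0xffffff )) \
        $(( (d >> 31) & 1 )) \
        $(( d & 0xfffff ))
}

# desc[0] from the dump in this issue
decode_desc0 0x0300010c00000000
# prints: opcode=0x03 flags=0x00010c priv=0 pasid=0
```

Under that layout, opcode 0x03 should be the memory move operation, and desc[4] = 0x1000 (4096) matches the blen 4096 printed by the tool. The descriptor itself therefore looks plausible, and the timeout is more likely to come from the device or platform side.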

I have searched the existing issues and have not found anyone with the same problem. Could anyone help me?

LRL52 commented 4 months ago

After installing version 3.5.0 of accel-config instead, the accel-config test results are as follows:

# accel-config --version
3.5.0+
# accel-config list
[
]
# accel-config test
run test_libaccfg

Running accfg-test0: set and get configurations for shared wqs
configuring device dsa0
configuring group group0.0
configuring group group0.1
configuring wq wq0.0
libaccfg: accfg_wq_set_str_mode: wq0.0: write failed: Invalid argument
config wq failed
shared wq support not available
accfg-test0 *skipped*: required feature not found

Running accfg-test1: set and get configurations for dedicated wqs
configuring device dsa0
configuring group group0.0
configuring group group0.1
configuring wq wq0.1
libaccfg: accfg_wq_set_ats_disable: wq0.1: ats_disable attribute write failed: Operation not supported
configuring wq wq0.3
libaccfg: accfg_wq_set_ats_disable: wq0.3: ats_disable attribute write failed: Operation not supported
configuring engine engine0.0
configuring engine engine0.1
configuring engine engine0.2
configuring engine engine0.3
check device dsa0
check group group0.0
check group group0.1
check wq wq0.1
check wq wq0.3
check engine engine0.0
check engine engine0.1
check engine engine0.2
check engine engine0.3
accfg-test1 passed!

Running accfg-test2: max wq size
configuring group group0.0
configuring wq wq0.1
libaccfg: accfg_wq_set_ats_disable: wq0.1: ats_disable attribute write failed: Operation not supported
configuring wq wq0.3
libaccfg: accfg_wq_set_ats_disable: wq0.3: ats_disable attribute write failed: Operation not supported
trying to set wq size exceeding max wq size
libaccfg: accfg_wq_set_size: wq0.3: size attribute write failed: Invalid argument
wq size exceeding max wq size was not accepted
accfg-test2 passed!

Running accfg-test3: wq boundary conditions
configure device dsa0, group group0.0, wq wq0.1 for bounds test
libaccfg: accfg_wq_set_ats_disable: wq0.1: ats_disable attribute write failed: Operation not supported
trying to set wq max_batch_size = 0
libaccfg: accfg_wq_set_max_batch_size: wq0.1: max_batch_size attribute write failed: Invalid argument
trying to set wq max_transfer_size = 0
libaccfg: accfg_wq_set_max_transfer_size: wq0.1: write failed: Invalid argument
trying to set wq max_batch_size exceeding device max
libaccfg: accfg_wq_set_max_batch_size: wq0.1: max_batch_size attribute write failed: Invalid argument
trying to set wq max_transfer_size exceeding device max
libaccfg: accfg_wq_set_max_transfer_size: wq0.1: write failed: Invalid argument
0 and greater than device max values were not accepted
accfg-test3 passed!

accfg-test4 *disabled*

accfg-test5 *disabled*
test-libaccfg: PASS
SUCCESS!
libaccfg: accfg_unref: context 0x23d52a0 released