usnistgov / ndn-dpdk

NDN-DPDK: High-Speed Named Data Networking Forwarder
https://www.nist.gov/publications/ndn-dpdk-ndn-forwarding-100-gbps-commodity-hardware
Other
131 stars 26 forks source link

ndndpdk-depends.sh assumes/requires DPDK headers to be installed in `/usr/local/include` #23

Closed Pesa closed 1 year ago

Pesa commented 3 years ago

It probably shouldn't assume a specific filesystem location. Another common install path is /usr/include/dpdk. Maybe, in addition to checking in /usr/local, use pkg-config to detect things, since that's precisely its purpose?

yoursunny commented 3 years ago

Why would DPDK end up in /usr/include/dpdk? Their source installation does not put files there.

I'm aware that Ubuntu's libdpdk-dev package can put files there. However, the DPDK version provided in Ubuntu package is incompatible with NDN-DPDK, so this is a moot point.

The purpose of --dpdk=0 flag is to skip reinstalling DPDK after it has been installed by the same script.

Pesa commented 3 years ago

Why would DPDK end up in /usr/include/dpdk?

That's not important. It could be installed anywhere. The point is that the script shouldn't assume a specific location. However, see below (I may have interpreted the script's purpose differently from its intended use).

Their source installation does not put files there.

The include directory can be configured (that's generally true for all software).

I'm aware that Ubuntu's libdpdk-dev package can put files there. However, the DPDK version provided in Ubuntu package is incompatible with NDN-DPDK, so this is a moot point.

Debian experimental has v21.08. But as I said above, this is not the point.

The purpose of --dpdk=0 flag is to skip reinstalling DPDK after it has been installed by the same script.

Ok, you seem to have written the script as a "all or nothing" solution. It cannot be used to install only some dependencies and take the rest of them as-is (and for those installed by the user, it's the user's responsibility to ensure they're installed correctly, compatible version, etc.) If that's the case, it should be clearly stated in the documentation. I won't be happy with that, as it seems to needlessly complicate a valid use case, but it's your choice. Also, there's an easy solution using pkg-config (which, btw, is already being used to extract the CFLAGS).

yoursunny commented 3 years ago

I'm aware that Ubuntu's libdpdk-dev package can put files there. However, the DPDK version provided in Ubuntu package is incompatible with NDN-DPDK, so this is a moot point.

Debian experimental has v21.08. But as I said above, this is not the point.

Only Debian 11 stable is supported by the script. However, technically one could make a backport package from Debian experimental, so this is a valid scenario.

The purpose of --dpdk=0 flag is to skip reinstalling DPDK after it has been installed by the same script.

Ok, you seem to have written the script as a "all or nothing" solution. It cannot be used to install only some dependencies and take the rest of them as-is (and for those installed by the user, it's the user's responsibility to ensure they're installed correctly, compatible version, etc.) If that's the case, it should be clearly stated in the documentation. I won't be happy with that, as it seems to needlessly complicate a valid use case, but it's your choice. Also, there's an easy solution using pkg-config (which, btw, is already being used to extract the CFLAGS).

I could make --dpdk=0 option issue a warning instead of an error.

As you might have noticed, the new check for Linux kernel 5.4 can be bypassed by setting an environment variable. On the other hand, if --dpdk=0 is changed to a warning, it would not require a bypass environment variable, because the flag itself is enough.


A related issue is the filesystem path for shared libraries, which is currently hard-coded in package spdkenv: https://github.com/usnistgov/ndn-dpdk/blob/74f841d33b963da0d1f4178be079f64924d959f7/dpdk/spdkenv/init.go#L36-L41 These would have to be specified in environment variable, like this:

sudo tee /etc/systemd/system/ndndpdk-svc.service.d/libs.conf <<EOT
[Service]
Environment="NDNDPDK_DPDK_LIBS=/usr/local/lib" "NDNDPDK_SPDK_LIBS=/usr/local/lib"
EOT

Or, shall I be interpreting LD_LIBRARY_PATH?

Pesa commented 3 years ago

I could make --dpdk=0 option issue a warning instead of an error.

As you might have noticed, the new check for Linux kernel 5.4 can be bypassed by setting an environment variable. On the other hand, if --dpdk=0 is changed to a warning, it would not require a bypass environment variable, because the flag itself is enough.

Sure, either a warning or an env variable to bypass the check would work for me.

A related issue is the filesystem path for shared libraries, which is currently hard-coded in package spdkenv: https://github.com/usnistgov/ndn-dpdk/blob/74f841d33b963da0d1f4178be079f64924d959f7/dpdk/spdkenv/init.go#L36-L41

You should have the dynamic linker do the work for you :) If you pass just the library name to dlopen, the linker will search for the library as usual, including LD_LIBRARY_PATH and /etc/ld.so.cache. Is that not supported by Go's dlopen?

yoursunny commented 3 years ago

Go does not have dlopen. dlopen.Load function calls C.dlopen.

The problem is with /usr/local/lib/libspdk.so, which contains this text: (no newlines in actual file)

GROUP ( libspdk_accel.so libspdk_accel_ioat.so libspdk_bdev.so libspdk_bdev_aio.so libspdk_bdev_delay.so
libspdk_bdev_error.so libspdk_bdev_ftl.so libspdk_bdev_gpt.so libspdk_bdev_lvol.so libspdk_bdev_malloc.so
libspdk_bdev_null.so libspdk_bdev_nvme.so libspdk_bdev_passthru.so libspdk_bdev_raid.so libspdk_bdev_split.so
libspdk_bdev_virtio.so libspdk_bdev_zone_block.so libspdk_blob.so libspdk_blob_bdev.so libspdk_blobfs.so
libspdk_blobfs_bdev.so libspdk_conf.so libspdk_env_dpdk_rpc.so libspdk_event.so libspdk_event_accel.so
libspdk_event_bdev.so libspdk_event_iscsi.so libspdk_event_nbd.so libspdk_event_nvmf.so libspdk_event_scsi.so
libspdk_event_sock.so libspdk_event_vmd.so libspdk_ftl.so libspdk_init.so libspdk_ioat.so libspdk_iscsi.so
libspdk_json.so libspdk_jsonrpc.so libspdk_log.so libspdk_lvol.so libspdk_nbd.so libspdk_notify.so libspdk_nvme.so
libspdk_nvmf.so libspdk_rpc.so libspdk_scsi.so libspdk_sock.so libspdk_sock_posix.so libspdk_thread.so
libspdk_trace.so libspdk_util.so libspdk_virtio.so libspdk_vmd.so )

C.dlopen does not accept this file. C.dlerror() returns: /usr/local/lib/libspdk.so: invalid ELF header.

To solve this problem, I have dlopen.LoadGroup function that parses this text, then call C.dlopen on each dynamic library. It also needs to try different ordering until all drivers are loaded without missing dependency errors.

Pesa commented 3 years ago

Oh yeah, I remember the linker script. But I don't remember why you need to use dlopen instead of linking at compile time.

In any case, the linker script is not supposed to be used at runtime. Can you dlopen() the specific libraries that you need? The dynamic linker should take care of recursively loading their dependencies, assuming they're correctly specified in the shared object.

It also needs to try different ordering until all drivers are loaded without missing dependency errors.

This is worrisome and may indicate that the shared objects are not being built correctly. Dependency resolution should be automatic.

Pesa commented 3 years ago

This says

All SPDK libraries have their dependencies specified at object compile time.

yoursunny commented 3 years ago

Can you dlopen() the specific libraries that you need?

Some libraries are not directly referenced, but needed for their side-effects of registering drivers. For example, libspdk_bdev_aio.so registers bdev_aio_create RPC method for creating AIO block device, but it's not linked from NDN-DPDK service executable. Even if #cgo LDFLAGS: -lspdk_bdev_aio is specified, the Go linker will drop this dependency because none of the symbols is referenced.

It's easiest to load all of them.


All SPDK libraries have their dependencies specified at object compile time.

Not true. libspdk_event.so relies on symbols from librte_power.so but it does not have it as a dependency.

$ readelf --symbols /usr/local/lib/libspdk_event.so | grep rte_
     6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_get_capabilitie
    45: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freq_enable_tur
    66: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freqs
    69: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freq_down
    73: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freq_max
    80: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freq_min
   124: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freq_up
   126: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_init
   137: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_exit
   148: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_get_freq
   341: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_get_capabilitie
   384: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freq_enable_tur
   408: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freqs
   411: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freq_down
   415: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freq_max
   423: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freq_min
   471: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_freq_up
   473: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_init
   485: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_exit
   497: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND rte_power_get_freq
$ ldd /usr/local/lib/libspdk_event.so
        linux-vdso.so.1 (0x00007ffe827a9000)
        libspdk_log.so.4.0 => //usr/local/lib/libspdk_log.so.4.0 (0x00007f71dc395000)
        libspdk_util.so.3.1 => //usr/local/lib/libspdk_util.so.3.1 (0x00007f71dc183000)
        libspdk_thread.so.6.0 => //usr/local/lib/libspdk_thread.so.6.0 (0x00007f71dbf71000)
        libspdk_json.so.3.1 => //usr/local/lib/libspdk_json.so.3.1 (0x00007f71dbd67000)
        libspdk_jsonrpc.so.3.0 => //usr/local/lib/libspdk_jsonrpc.so.3.0 (0x00007f71dbb5f000)
        libspdk_rpc.so.3.0 => //usr/local/lib/libspdk_rpc.so.3.0 (0x00007f71db95b000)
        libspdk_trace.so.4.0 => //usr/local/lib/libspdk_trace.so.4.0 (0x00007f71db755000)
        libspdk_init.so.1.0 => //usr/local/lib/libspdk_init.so.1.0 (0x00007f71db54e000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f71db15d000)
        libuuid.so.1 => /lib/x86_64-linux-gnu/libuuid.so.1 (0x00007f71daf56000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f71dc7ae000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f71dad4e000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f71dab2f000)

It also needs to try different ordering until all drivers are loaded without missing dependency errors.

This is worrisome and may indicate that the shared objects are not being built correctly. Dependency resolution should be automatic.

On the other hand, apart from libspdk_event.so => librte_power.so, there's no other missing dependencies. Therefore, I can eliminate the loop-retry logic in dlopen.LoadGroup.

Pesa commented 3 years ago

Even if #cgo LDFLAGS: -lspdk_bdev_aio is specified, the Go linker will drop this dependency because none of the symbols is referenced.

Yeah I think you need to link with -Wl,--no-as-needed.

All SPDK libraries have their dependencies specified at object compile time.

Not true. libspdk_event.so relies on symbols from librte_power.so but it does not have it as a dependency.

Sounds like an upstream bug.

yoursunny commented 1 year ago

SPDK path assumption is relaxed in c0283b82720f93a394eedd4244949118e9f3da79