tcp-acceleration-service / tas

TAS is a drop-in highly CPU efficient and scalable TCP acceleration service.
https://tcp-acceleration-service.github.io/

Correct dpdk drivers with Mellanox cards #7

Closed vsag96 closed 4 years ago

vsag96 commented 4 years ago

I have built DPDK and I can see the ports with testpmd. The documentation tells us to use vfio-pci for Intel cards. With Mellanox cards I get a bind error for vfio-pci. I tried binding the card to igb_uio instead, but then TAS is not able to find the Ethernet cards and I get the following error:

No ethernet devices
network init failed

Sometimes I get the following error instead. I searched the repo for this output string and found the place where it occurs, but what is the main reason this happens?

util_create_shmsiszed: mmap failed: Cannot allocate memory
mapping flexnic dma memory failed

Note:
OFED Version: MLNX_OFED_LINUX-5.0-2.1.8.0
DPDK Version: 19.11.2

Thanks,
Vineeth

FreakyPenguin commented 4 years ago

The examples in the README are specific to Intel and virtio NICs, but there is nothing TAS-specific about how to set up the DPDK drivers. As long as a valid DPDK Ethernet device appears in the end, with the right features for the specified config flags, TAS is happy. So refer to the respective DPDK PMD documentation.

mlx4/5 in particular are a bit different because the in-kernel mlx driver already supports kernel bypass, so you do not need to bind the NIC to a separate bypass driver (see https://doc.dpdk.org/guides/nics/mlx5.html#usage-example). I think the only different option I pass to TAS in our Mellanox testbed is --dpdk-extra="-w 0000:3b:00.1" (see https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#device-related-options). Our machines have multiple Mellanox NICs, so I only whitelist the one I want to use. If you only have the one, not even that should be necessary. The only issue I remember with slightly older DPDK/Mellanox driver versions was that the mlx5 DPDK PMD sometimes segfaulted when interrupts were turned on and off frequently. If you see that, try running with interrupts disabled (--fp-no-ints to TAS).
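A minimal sketch of the invocation described above. It only uses the two TAS flags named in this comment (`--dpdk-extra` and `--fp-no-ints`). The wrapper name `tas_run` is made up for illustration, and the path to the TAS binary plus any other options your setup needs are assumptions you have to fill in yourself.

```shell
#!/bin/sh
# Hypothetical wrapper: start a TAS binary with one NIC whitelisted.
#   $1: path to the TAS binary (assumption: depends on your build tree)
#   $2: PCI address of the NIC to use, e.g. 0000:3b:00.1
#   remaining arguments are passed through to TAS unchanged
tas_run() {
    tas_bin="$1"
    pci="$2"
    shift 2
    # -w is the EAL whitelist option in DPDK 19.11; DPDK 20.11 and later
    # renamed it to -a / --allow.
    "$tas_bin" --dpdk-extra="-w $pci" "$@"
}
```

For example, `tas_run ./tas/tas 0000:3b:00.1 --fp-no-ints` would whitelist that one port and disable interrupts; the `./tas/tas` path here is only a placeholder.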

With regard to the other error: this probably has to do with how DPDK handles huge pages. When a DPDK app starts up, the DPDK EAL grabs all the huge pages and creates named files for them; after that there are no more free huge pages. By default (unless started with --fp-no-hugepages), TAS also tries to allocate a few huge pages for the shared memory between fast path and app. TAS does this before initializing the DPDK EAL, so it can grab enough huge pages. But if you previously ran another DPDK app, all huge pages will have been allocated and "named", and that persists after the app terminates. So just manually remove the rtemap* and tas* files in /dev/hugepages (or /mnt/hugepages); afterwards /proc/meminfo should show the huge pages as free again. If you also have transparent hugepages (THP) turned on in the kernel, it's possible that Linux also starts using up huge pages for other allocations over time. In this case either allocate more, or disable THP.
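The cleanup above can be sketched roughly as follows. The function names are made up for illustration, the default mount point is one of the two paths mentioned above, and this assumes no DPDK or TAS process is still running and that you have permission to delete the files (usually root).

```shell
#!/bin/sh
# Remove stale hugepage backing files left behind by DPDK (rtemap*) and
# TAS (tas*). $1 is the hugetlbfs mount point; defaults to /dev/hugepages.
clean_stale_hugepages() {
    dir="${1:-/dev/hugepages}"
    # -f: stay quiet if a pattern matches nothing
    rm -f "$dir"/rtemap* "$dir"/tas*
}

# Print the number of free huge pages.
# $1 is a meminfo-format file; defaults to /proc/meminfo.
hugepages_free() {
    awk '/^HugePages_Free:/ { print $2 }' "${1:-/proc/meminfo}"
}
```

Running `clean_stale_hugepages` and then `hugepages_free` should show the count going back up if stale files were the cause.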

Hope this helps. Antoine

vsag96 commented 4 years ago

Hi @FreakyPenguin ,

Thanks for the concise reply. It really helped me. I tried removing the files from /dev/hugepages/; the second error went away, but I am back to the first one, saying no Ethernet devices found. testpmd is able to recognize the ports. The device I tried to use was active, i.e. an IP is assigned to it. I also tried a device with no IP associated with it, and still hit the same problem. In both cases I passed the -w option with --dpdk-extra. Still no luck.

Moreover, your comments on devbind really helped me and showed an error on my part. Any suggestions on how I can get started with DPDK, given that it is so huge?

Thanks Vineeth

FreakyPenguin commented 4 years ago

Ah, apologies, I forgot about an important part. The TAS Makefile links in individual PMDs, and since mlx4/5 are not built by default and require external dependencies, we don't include them by default. For mlx5 try the following:

DPDK_PMDS= mlx5
EXTRA_LIBS_DPDK= -libverbs -lmlx5

Either pass these on the command line with make, or create a Makefile.local file and stick them in there.
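A minimal sketch of the Makefile.local route, using exactly the two variables given above. The helper name is made up for illustration, and it assumes Makefile.local lives at the top of the TAS source tree and that the Mellanox libraries (libibverbs, libmlx5) are already installed.

```shell
#!/bin/sh
# Write the two mlx5 settings from this comment into Makefile.local.
# $1 is the TAS source directory (defaults to the current directory).
# Note: this overwrites any existing Makefile.local.
write_mlx5_makefile_local() {
    dir="${1:-.}"
    cat > "$dir/Makefile.local" <<'EOF'
DPDK_PMDS= mlx5
EXTRA_LIBS_DPDK= -libverbs -lmlx5
EOF
}

# One-off alternative without Makefile.local:
#   make DPDK_PMDS=mlx5 EXTRA_LIBS_DPDK="-libverbs -lmlx5"
```

After calling `write_mlx5_makefile_local` in the source tree, a plain `make` should pick the settings up.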

vsag96 commented 4 years ago

Now it works, @FreakyPenguin. If you want, I can make a PR for this. I'll be closing the issue. If we have to implement extra socket options and other POSIX socket API primitives, what would be the right place to start? I'll try reading through the code provided, and if I have further doubts about implementation details on extra socket options, I'll post here.

FreakyPenguin commented 4 years ago

If you want to make a PR for the readme or the documentation (in docs subfolder, goes up on readthedocs automatically) that would be great. The mlx5 pmd is not included by default on purpose, because otherwise it won't compile on systems without the mellanox libraries anymore.

For extending the POSIX api, all you have to look at is the sockets library, in lib/sockets. If you look at the git log there you should see examples for calls or flags that were added recently. The sockets library emulates sockets on top of the TAS low-level interface in lib/tas. There should not be any other sockets-specific things elsewhere in the code.