Open eloydegen opened 5 years ago
Eloy notifications@github.com writes:
I'm trying to get the Advanced03_AF_XDP running in Fedora, which runs kernel 5.3.0.rc6. CONFIG_XDP_SOCKETS=y is correctly configured. I'm running the following commands as root:
cd advanced03-AF_XDP
make
t setup --name veth-adv03
The last command results in 100% packet loss when running ping. The same then happens for t ping, of course.
./af_xdp_user -d veth-adv03
This prints the following error:
ERROR: Can't create umem "Invalid argument"
Any clue how I can fix this?
Hmm, this seems different from the other permission errors we've seen. @chaudron, any idea what's up with this? :)
I've not seen this before. I assume you use the libbpf from the tutorial, if so can you try the one from your kernel/distribution?
Also, can you try to debug what is failing, as the libbpf API has several failure points, xsk_page_aligned()/mmap()/setsockopt(), etc. etc.
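One of the failure points listed above, the page-alignment check, can be ruled out in isolation. A minimal sketch, assuming the umem buffer is allocated with posix_memalign(); is_page_aligned() and alloc_umem_buffer() are hypothetical helper names for illustration, not libbpf API:

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: true if addr sits on a page boundary. */
static bool is_page_aligned(const void *addr)
{
	uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);

	return ((uintptr_t)addr & (page_size - 1)) == 0;
}

/* Hypothetical helper: allocate a page-aligned buffer of `size` bytes.
 * Returns NULL and prints the error code if the allocation fails. */
static void *alloc_umem_buffer(size_t size)
{
	void *buf = NULL;
	int err = posix_memalign(&buf, (size_t)sysconf(_SC_PAGESIZE), size);

	if (err) {
		fprintf(stderr, "posix_memalign failed: %d\n", err);
		return NULL;
	}
	/* posix_memalign guarantees this; the assert documents the intent. */
	assert(is_page_aligned(buf));
	return buf;
}
```

If a buffer obtained this way passes the check, the alignment test is probably not what fails, and the mmap() and setsockopt() calls are the next things to look at, for example with strace.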
Oh I should note that this is the beta version of Fedora, but I would argue this is better than the current stable release (kernel 5.0) combined with the mainline kernel.
I installed a new VM; the ping now works, but the second error persists.
I have installed libbpf-devel and pointed the LIBBPF_DIR variable in the /advanced03-AF_XDP/Makefile to /usr/include/bpf, but then it can't build. It does build fine in the default setup.
Adding a printf statement at the top of the main function in af_xdp_user.c doesn't produce any output, so I'm not sure how to debug this further.
The libbpf-devel package in Fedora does not include all the files that are currently in the /libbpf/src folder; they come from the Linux kernel source. I pointed the Makefile variable to the folder in the Linux source and compiling works. Running t ping again results in 100% packet loss.
The libbpf-devel package is supposed to contain everything. If it doesn't, please file a bug (although I think there may be a new version of the libbpf package coming soon, so it may fix itself at that point).
Are you seeing any output from the af_xdp_user command? You're not actually supposed to get any ping replies while running the initial example...
Oh. The first time I ran it on Ubuntu, the ping worked. Interesting.
The invalid argument is coming from a munmap syscall, but I'm still clueless what the actual problem is. I have attached the strace log.
No, I think it's coming from the preceding mmap:
mmap(NULL, 8374384, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, 3, 0x180000000) = -1 EINVAL (Invalid argument)
The munmap is libbpf's attempt at cleaning up in the error path (which also fails for some reason).
Looking at the kernel code, I guess it's either failing this check:
if (size > (PAGE_SIZE << compound_order(qpg)))
return -EINVAL;
or remap_pfn_range() returns -EINVAL. Which can also happen, I guess, but not sure if it is in this case.
Hmm, maybe try making the buffer smaller? Just decrease NUM_FRAMES at the top of af_xdp_user.c and recompile...
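To read the failing mmap line from the strace log, it can help to decode the page offset and to compare the requested length with what a ring mapping would plausibly need. A rough sketch: the offset constants are copied from the kernel UAPI header linux/if_xdp.h, while ring_from_pgoff() and ring_map_size() are hypothetical helpers, and the descriptor offset of 128 bytes and the 2048 entries used in the note after the block are assumptions, not values taken from the log:

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the kernel UAPI header linux/if_xdp.h
 * (XDP_PGOFF_RX_RING, XDP_PGOFF_TX_RING, XDP_UMEM_PGOFF_FILL_RING,
 * XDP_UMEM_PGOFF_COMPLETION_RING). */
#define PGOFF_RX_RING         0x0ULL
#define PGOFF_TX_RING         0x80000000ULL
#define PGOFF_FILL_RING       0x100000000ULL
#define PGOFF_COMPLETION_RING 0x180000000ULL

enum ring_kind { RING_RX, RING_TX, RING_FILL, RING_COMPLETION, RING_UNKNOWN };

/* Hypothetical helper: map the last argument of mmap() to a ring. */
static enum ring_kind ring_from_pgoff(uint64_t pgoff)
{
	switch (pgoff) {
	case PGOFF_RX_RING:         return RING_RX;
	case PGOFF_TX_RING:         return RING_TX;
	case PGOFF_FILL_RING:       return RING_FILL;
	case PGOFF_COMPLETION_RING: return RING_COMPLETION;
	default:                    return RING_UNKNOWN;
	}
}

/* Hypothetical helper: bytes a ring mapping needs when the descriptors
 * start at desc_off and there are `entries` slots of entry_size bytes. */
static uint64_t ring_map_size(uint64_t desc_off, uint64_t entries,
			      uint64_t entry_size)
{
	assert(entry_size != 0);
	return desc_off + entries * entry_size;
}
```

With these helpers, ring_from_pgoff(0x180000000ULL) identifies the completion ring, and ring_map_size(128, 2048, 8) comes to 16512 bytes under the stated assumptions. That is far below the 8374384 bytes requested in the log, which would suggest the descriptor offset libbpf used was bogus rather than the buffer being too large. NUM_FRAMES only sizes the packet buffer, not the rings, so if this reading is right, lowering it would not be expected to change the result.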
Missed that one; the mmap is indeed the earliest error.
I decreased NUM_FRAMES to 64 from the original 4096, and I still get the same error.
I compiled it with the library from the Linux mainline source as well as with the library code in this repository; that doesn't make a difference.
Hmm, right, that's odd. No idea what's failing now. I'll try to ping some of the upstream AF_XDP devs and point them here, let's see if they have any ideas...
Thanks! I just subscribed to xdp-newbies and bpf on the Linux Kernel mailing list, so I hope you're sending it there.
You are likely running a libbpf that is too new for your kernel. In 5.4-rcX, there is a new feature that changes the size of the offset struct. An old libbpf or app can run on any kernel, but a new libbpf cannot run on an old kernel. Is that something that should be supported? In the meantime, just use an older libbpf (from 5.3), or a newer kernel :-).
Actually, this should be fixed in libbpf. Will submit a patch. Thanks for detecting this.
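For illustration only, and not the actual patch: one way a library could cope with both struct sizes is to ask for the larger layout and, if the kernel reports having filled in only the smaller one, spread the old fields into the new positions. The sketch below assumes the size change is one extra flags field per ring; the struct and function names are hypothetical, and optlen stands for the length that getsockopt() would hand back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout on kernels up to 5.3: three offsets per ring. */
struct ring_offset_v1 { uint64_t producer, consumer, desc; };
struct mmap_offsets_v1 { struct ring_offset_v1 rx, tx, fr, cr; };

/* Assumed layout from 5.4 on: an extra flags offset per ring. */
struct ring_offset { uint64_t producer, consumer, desc, flags; };
struct mmap_offsets { struct ring_offset rx, tx, fr, cr; };

static void widen_ring(struct ring_offset *dst,
		       const struct ring_offset_v1 *src)
{
	dst->producer = src->producer;
	dst->consumer = src->consumer;
	dst->desc = src->desc;
	dst->flags = 0; /* placeholder: an old kernel has no flags word */
}

/* Hypothetical fix-up step. optlen is the number of bytes the kernel
 * reported writing into *off. Returns 0 on success, -1 if the size
 * matches neither layout. */
static int fixup_mmap_offsets(struct mmap_offsets *off, size_t optlen)
{
	struct mmap_offsets_v1 old;

	assert(off != NULL);
	if (optlen == sizeof(*off))
		return 0; /* kernel already uses the new layout */
	if (optlen != sizeof(old))
		return -1;

	/* The old kernel packed its fields at the start of the buffer. */
	memcpy(&old, off, sizeof(old));
	widen_ring(&off->rx, &old.rx);
	widen_ring(&off->tx, &old.tx);
	widen_ring(&off->fr, &old.fr);
	widen_ring(&off->cr, &old.cr);
	return 0;
}
```

Without a step like this, a library built against the larger struct would read the old kernel's packed fields at the wrong positions and leave the last ring's offsets uninitialized, which would be consistent with the oversized mmap length seen in the strace log.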
Wait, isn't libbpf supposed to be backwards-compatible with older kernels as well?
Great, thanks!
I do not know whether libbpf is supposed to be backwards-compatible with older kernels. I just thought about all the support tickets I would get if I did not fix this right now :-).
Hehe, right. Well, we're just going to keep reporting any compatibility issues to you so you also have to deal with those, then ;)
Has the patch been submitted already, so I can try to build it again? Or does it need more time?
It needs more time since I am travelling to Kernel Recipes this week. I will let you know as soon as it is finished.
/Magnus
Thanks for the quick response, I will await it.
Eloy,
Could you please provide me with your full name and mail address? I would like to give you credit on the patch with a Reported-by tag as you found this issue.
Yes, that is Eloy Degen degeneloy@gmail.com
Thanks for the fix and attribution.
I sent you a patch; it would be great if you could try it out. Note that samples/bpf does not build at the moment in bpf/master, so I applied the patch to an old need_wakeup development branch, then launched a standard Linux 5.3 that does not have need_wakeup support. The sample/libbpf compiled with need_wakeup runs as expected on that kernel without the support.
I have also tried compiling the example on the released VM, but that version did not include Advanced 3, and I'm not able to compile the new code I pulled. I would appreciate any pointer! :)