Xilinx-CNS / onload

OpenOnload high performance user-level network stack

ERROR: failed to find syscall table on Fedora 39 (kernel 6.8) #216

Closed w-kudla closed 5 months ago

w-kudla commented 5 months ago

The whole build passes (master, 88527b4743a01eae14be721da6b252a53c304de2), but the onload module then fails to load because sfc_resource cannot initialize:

[152798.664550] [sfc efrm] init_sfc_resource: ERROR: failed to find syscall table

I checked how the code gets the syscall table and we should be in the simplest case:

void** find_syscall_table(void)
{
  unsigned char *p = NULL;
  unsigned long result;
  unsigned char *pend;

  /* First see if it is in kallsyms */
#ifdef EFRM_HAVE_NEW_KALLSYMS
  /* It works with CONFIG_KALLSYMS_ALL=y only. */
  p = efrm_find_ksym("sys_call_table");
#endif
  if( p != NULL ) {
    TRAMP_DEBUG("syscall table ksym at %px", (unsigned long*)p);
    return (void**)p;
  }

because this kernel (6.8.6-200.fc39.x86_64) is built with all symbols exported to kallsyms:

# grep KALLSYMS /boot/config-$(uname -r)
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_SELFTEST is not set
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
CONFIG_KALLSYMS_BASE_RELATIVE=y

The symbol is present in kallsyms:

# grep sys_call_table /proc/kallsyms 
ffffffff82401800 D sys_call_table
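
For reference, the same check can be done programmatically from user space. The following is a minimal sketch, not part of Onload: the function names `match_kallsyms_line` and `find_in_kallsyms` are hypothetical, and it assumes the usual `<address> <type> <name>` line layout shown in the grep output above. Note that without root privileges the kernel may report the address as all zeros, and that a symbol being listed in /proc/kallsyms does not by itself mean a kernel module can look it up.

```c
#include <stdio.h>
#include <string.h>

/* Parse one /proc/kallsyms-style line: "<hex addr> <type> <name> [module]".
 * Returns 1 and stores the address if the line names `wanted`, else 0.
 * (Hypothetical helper for illustration only.) */
int match_kallsyms_line(const char *line, const char *wanted,
                        unsigned long long *addr_out)
{
  unsigned long long addr;
  char type;
  char name[256];

  if( sscanf(line, "%llx %c %255s", &addr, &type, name) != 3 )
    return 0;
  if( strcmp(name, wanted) != 0 )
    return 0;
  *addr_out = addr;
  return 1;
}

/* Scan a kallsyms-format file for `wanted`.
 * Returns 1 if found, 0 if not found or the file cannot be opened. */
int find_in_kallsyms(const char *path, const char *wanted,
                     unsigned long long *addr_out)
{
  char line[512];
  int found = 0;
  FILE *f = fopen(path, "r");

  if( f == NULL )
    return 0;
  while( !found && fgets(line, sizeof(line), f) != NULL )
    found = match_kallsyms_line(line, wanted, addr_out);
  fclose(f);
  return found;
}
```

Calling `find_in_kallsyms("/proc/kallsyms", "sys_call_table", &addr)` as root should locate the same entry that grep prints.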

What am I missing?

ol-alexandra commented 5 months ago

The "It works with CONFIG_KALLSYMS_ALL=y only." comment is misleading. All the Onload kallsyms machinery works only for Linux <= 5.6, where the kallsyms_on_each_symbol() symbol is exported.
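
To make the version boundary concrete, here is a rough sketch that classifies a kernel release string against the "<= 5.6" cut-off stated above. It is only an illustration: the helper names are hypothetical, Onload decides this at build time (via EFRM_HAVE_NEW_KALLSYMS) rather than by comparing version strings, and distribution kernels with backports may not follow the upstream numbering.

```c
#include <stdio.h>

/* Parse "major.minor" from a kernel release string such as
 * "6.8.6-200.fc39.x86_64". Returns 1 on success, 0 otherwise. */
int parse_kernel_release(const char *release, int *major, int *minor)
{
  return sscanf(release, "%d.%d", major, minor) == 2;
}

/* Hypothetical helper: returns 1 if the release is <= 5.6 (the range in
 * which the thread says kallsyms_on_each_symbol() is exported), 0 if it
 * is newer, and -1 if the string cannot be parsed. */
int kallsyms_lookup_expected(const char *release)
{
  int major, minor;

  if( !parse_kernel_release(release, &major, &minor) )
    return -1;
  return major < 5 || (major == 5 && minor <= 6);
}
```

With this check, the reporter's 6.8.6-200.fc39.x86_64 kernel falls on the "newer" side, so the kallsyms path is not expected to apply to it.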

w-kudla commented 5 months ago

@ol-alexandra Not sure if I understand. Do you mean that kernels 6.x are not supported at all?

ol-alexandra commented 5 months ago

I mean that linux-6.x kernels are supported in a more complicated way, not via kallsyms. EFRM_HAVE_NEW_KALLSYMS is undefined for contemporary kernels, and the "more complicated way" is very fragile. Some variants of linux-6.8 probably work, see #211 from @okt-sergeyn. Your kernel is probably not supported, or your kernel has Onload issues when running on your hardware (yes, such things happen). The model name & flags lines from /proc/cpuinfo may help to understand the issue.

ivatet-amd commented 5 months ago

I think @tcrawley-xilinx has a plan to fix it. A similar issue was reported as the latest comment in https://github.com/Xilinx-CNS/onload/issues/164.

okt-sergeyn commented 5 months ago

JFYI: it works well with Fedora 39 6.8.4-200.fc39.x86_64

w-kudla commented 5 months ago

> model name & flags lines from /proc/cpuinfo may help to understand the issue.

model name  : Intel(R) Xeon(R) CPU Max 9468
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl smx est tm2 ssse3 sdbg cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cat_l2 cdp_l3 invpcid_single intel_ppin cdp_l2 ssbd mba ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 hle smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap clflushopt clwb intel_pt sha_ni cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local split_lock_detect avx_vnni wbnoinvd dtherm ida arat pln pts hfi umip waitpkg tme la57 rdpid bus_lock_detect cldemote movdiri movdir64b fsrm md_clear serialize tsxldtrk pconfig arch_lbr amx_bf16 amx_int8 flush_l1d arch_capabilities

@ol-alexandra We don't have any issues with this HW on RHEL 8.x and RHEL 9.x but those run kernels 3.10 and 4.18 respectively

w-kudla commented 5 months ago

> JFYI: it works well with Fedora 39 6.8.4-200.fc39.x86_64

Thanks, downgrading to 6.5.6-300.fc39.x86_64 also helped. Can you please explain why the elegant kallsyms solution stopped working for the newer kernels? I saw a convoluted way of determining the syscall table offset in the sources that works by examining instructions/text directly, which I agree is dodgy.
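
To illustrate what "examining instructions/text directly" can look like in general, here is a user-space sketch that scans a byte buffer for an indirect call through a table indexed by %rax. This is not Onload's code. It assumes, purely for illustration, that the dispatch is encoded as `call *disp32(,%rax,8)` (bytes `ff 14 c5` followed by a 32-bit displacement); real kernels may encode the dispatch differently (for example with retpolines, or via the x64_sys_call function named in the commits later in this thread), which is why such heuristics are fragile. The function name `scan_for_table_call` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Scan `len` bytes of x86-64 machine code for "call *disp32(,%rax,8)":
 * opcode ff, ModRM 14, SIB c5, then a little-endian 32-bit displacement.
 * Returns the displacement sign-extended to 64 bits (the candidate table
 * address), or 0 if the pattern is not found. Illustrative only. */
uint64_t scan_for_table_call(const unsigned char *code, size_t len)
{
  size_t i;

  for( i = 0; i + 7 <= len; i++ ) {
    if( code[i] == 0xff && code[i + 1] == 0x14 && code[i + 2] == 0xc5 ) {
      uint32_t raw = (uint32_t)code[i + 3] |
                     ((uint32_t)code[i + 4] << 8) |
                     ((uint32_t)code[i + 5] << 16) |
                     ((uint32_t)code[i + 6] << 24);
      uint64_t addr = raw;

      /* Kernel addresses live in the top 2 GB, so sign-extend disp32. */
      if( raw & 0x80000000u )
        addr |= 0xffffffff00000000ULL;
      return addr;
    }
  }
  return 0;
}
```

A scan like this depends entirely on the compiler emitting one specific instruction form, so any change in kernel code generation or syscall dispatch can silently break it.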

ol-alexandra commented 5 months ago

The Linux kernel stopped exporting kallsyms_on_each_symbol() and similar functions. These functions were used to pull GPL-only symbols into proprietary modules, so the Linux authors had a strong reason. You can see part of this story in the Onload history under ON-12093, starting from 62e19f02628957be523229b0d454d440849c2c29.

w-kudla commented 5 months ago

> These functions were used to pull GPL-only symbols into proprietary modules, so the Linux authors had a strong reason.

I don't think this is a good enough reason, and as disappointing as it is, sadly it's not unexpected from the Linux authors. There has been a widespread trend of pulling the rug from under out-of-tree code or userspace without offering any workarounds. The same is happening with wrmsr and the disabling of cli/sti instructions. What is the plan for Onload? To rely on the nasty heuristics in find_syscall_table?

jfeather-amd commented 5 months ago

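Hi @w-kudla, we've done some work to improve our ability to find the syscall table for v6.9 kernels in the following commits:

* [6abf274](https://github.com/Xilinx-CNS/onload/commit/6abf27413b23a8f6e6f3b985271eb1712f5cc562) ("ON-15692: Use x64_sys_call function when syscall_table isn't available")

* [d04f55b](https://github.com/Xilinx-CNS/onload/commit/d04f55ba3460130b00d08e41902656c33603cf7c) ("ON-15742: match the new CONFIG_RETPOLINE option name for v6.9+ kernels")

* [a9d697b](https://github.com/Xilinx-CNS/onload/commit/a9d697b42d3563ce53787159f420c0ae1138a062) ("ON-15692: Catch more calls to x64_sys_call")

These have been tested mainly on Debian (12) and Ubuntu (22.04/23.10). We have yet to test them on Fedora 39, but these changes are available on the master branch, so you can try them out early if you'd like!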

okt-sergeyn commented 5 months ago

> Hi @w-kudla, we've done some work to improve our ability to find the syscall table for v6.9 kernels in the following commits:
>
> * [6abf274](https://github.com/Xilinx-CNS/onload/commit/6abf27413b23a8f6e6f3b985271eb1712f5cc562) ("ON-15692: Use x64_sys_call function when syscall_table isn't available")
>
> * [d04f55b](https://github.com/Xilinx-CNS/onload/commit/d04f55ba3460130b00d08e41902656c33603cf7c) ("ON-15742: match the new CONFIG_RETPOLINE option name for v6.9+ kernels")
>
> * [a9d697b](https://github.com/Xilinx-CNS/onload/commit/a9d697b42d3563ce53787159f420c0ae1138a062) ("ON-15692: Catch more calls to x64_sys_call")
>
> These have been tested mainly on Debian (12) and Ubuntu (22.04/23.10). We are yet to test it on Fedora 39, but these changes are available on the master branch so you can try them out early if you'd like!

Thank you guys for the fix.

I've tried the latest master with the commits above on CentOS 9 with kernel 6.8.7-1.el9.elrepo.x86_64, and it still fails with the error: [sfc efrm] init_sfc_resource: ERROR: failed to find syscall table

ech68 commented 5 months ago

The same problem has cropped up with 6.1.85 and newer 6.1-series kernel.org kernels; 6.1.84 was fine.

abower-amd commented 5 months ago

@ech68 you should find adae58dd79cd5c47bed0958a664f6cf9f333e77e fixes additional cases, hopefully yours included.

ech68 commented 5 months ago

That did indeed work: I applied this change along with the other 3 patches mentioned earlier in this thread against the latest 8.1.2.26 openonload package and built it for the 6.1.85+ kernels.

However, it fails to compile when built against the current stock EL8 kernel-devel, breaking at line 271 in 6abf274.

abower-amd commented 5 months ago

@ech68 Did you try using the tip of the master branch? If that works then you may be missing other compatibility changes required for the given kernel.

ech68 commented 5 months ago

If I apply the first part of commit deaaa8d, then it compiles. (Using the tip of master also works.)

abower-amd commented 5 months ago

Closing as I believe there are no known issues with this now.