aabc / ipt-netflow

Netflow iptables module for Linux kernel (official)
https://github.com/aabc/ipt-netflow
503 stars 129 forks source link

ipt_NETFLOW no more compiles with Linux Kernel 4.19.x > 4.19.191 (as just updated in Debian Stable) #177

Closed xtaran closed 3 years ago

xtaran commented 3 years ago

(Note: This might sound similar to #176, but the reason for compile failures seem completely different.)

Debian has updated their kernel in Debian 10 Buster (currently "stable") from 4.19.181 to 4.19.194 and since then, both ipt_NETFLOW 2.3 (as shipped with Debian 10 Buster) as well as ipt_NETFLOW 2.5.1 (as shipped with the upcoming Debian 11) both no more compile against the most recent 4.19.x kernel in Debian 10.

It errors out as follows with ipt_NETFLOW 2.5.1 on x86_64 aka amd64:

[some unrelated warnings]
In file included from /var/lib/dkms/ipt-netflow/2.5.1/build/ipt_NETFLOW.c:76:
/var/lib/dkms/ipt-netflow/2.5.1/build/ipt_NETFLOW.c: In function ‘register_ct_events’:
/var/lib/dkms/ipt-netflow/2.5.1/build/compat.h:173:21: error: implicit declaration of function ‘ref_module’; did you mean ‘use_module’? [-Werror=implicit-function-declaration]
 # define use_module ref_module
                     ^~~~~~~~~~
/var/lib/dkms/ipt-netflow/2.5.1/build/ipt_NETFLOW.c:5480:3: note: in expansion of macro ‘use_module’
   use_module(THIS_MODULE, netlink_m);
   ^~~~~~~~~~
cc1: some warnings being treated as errors
make[4]: *** [/usr/src/linux-headers-4.19.0-17-common/scripts/Makefile.build:315: /var/lib/dkms/ipt-netflow/2.5.1/build/ipt_NETFLOW.o] Error 1
make[3]: *** [/usr/src/linux-headers-4.19.0-17-common/Makefile:1562: _module_/var/lib/dkms/ipt-netflow/2.5.1/build] Error 2
make[2]: *** [Makefile:146: sub-make] Error 2
make[1]: *** [Makefile:8: all] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-4.19.0-17-amd64'

And with ipt_NETFLOW 2.3 on i686 (aka i386 on Debian for historic reasons; initially detected here):

In file included from /usr/src/linux-headers-4.19.0-17-common/include/linux/printk.h:7,
                 from /usr/src/linux-headers-4.19.0-17-common/include/linux/kernel.h:14,
                 from /usr/src/linux-headers-4.19.0-17-common/include/linux/list.h:9,
                 from /usr/src/linux-headers-4.19.0-17-common/include/linux/module.h:9,
                 from /var/lib/dkms/ipt-netflow/2.3/build/ipt_NETFLOW.c:21:
/var/lib/dkms/ipt-netflow/2.3/build/ipt_NETFLOW.c: In function ‘ipt_netflow_init’:
/usr/src/linux-headers-4.19.0-17-common/include/linux/kern_levels.h:5:18: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘unsigned int’ [-Wformat=]
 #define KERN_SOH "\001"  /* ASCII Start Of Header */
                  ^~~~~~
/usr/src/linux-headers-4.19.0-17-common/include/linux/kern_levels.h:14:19: note: in expansion of macro ‘KERN_SOH’
 #define KERN_INFO KERN_SOH "6" /* informational */
                   ^~~~~~~~
/var/lib/dkms/ipt-netflow/2.3/build/ipt_NETFLOW.c:5518:9: note: in expansion of macro ‘KERN_INFO’
  printk(KERN_INFO "ipt_NETFLOW: hashsize %u (%luK)\n", hashsize,
         ^~~~~~~~~
/var/lib/dkms/ipt-netflow/2.3/build/ipt_NETFLOW.c:5518:48: note: format string is defined here
  printk(KERN_INFO "ipt_NETFLOW: hashsize %u (%luK)\n", hashsize,
                                              ~~^
                                              %u
cc1: some warnings being treated as errors
make[4]: *** [/usr/src/linux-headers-4.19.0-17-common/scripts/Makefile.build:315: /var/lib/dkms/ipt-netflow/2.3/build/ipt_NETFLOW.o] Error 1
make[3]: *** [/usr/src/linux-headers-4.19.0-17-common/Makefile:1562: _module_/var/lib/dkms/ipt-netflow/2.3/build] Error 2
make[2]: *** [Makefile:146: sub-make] Error 2
make[1]: *** [Makefile:8: all] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-4.19.0-17-686-pae'
make: *** [Makefile:25: ipt_NETFLOW.ko] Error 2

Reason for this seems to be these entries in the upstream Linux changelog for kernel 4.19.191:

    - modules: mark ref_module static
    - modules: mark find_symbol static
    - modules: mark each_symbol_section static
    - modules: unexport __module_text_address
    - modules: unexport __module_address
    - modules: rename the licence field in struct symsearch to license
    - modules: return licensing information from find_symbol
    - modules: inherit TAINT_PROPRIETARY_MODULE

Cause is probably the first line of it which comes from commit torvalds/linux@8745aa4e018e4159f8c49d9689251f51fd0a15e0:

commit 8745aa4e018e4159f8c49d9689251f51fd0a15e0
Author: Christoph Hellwig <hch@lst.de>
Date:   Thu Jul 30 08:10:20 2020 +0200

    modules: mark ref_module static

    commit 7ef5264de773279b9f23b6cc8afb5addb30e970b upstream.

    ref_module isn't used anywhere outside of module.c.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jessica Yu <jeyu@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Reading the reason for this change sounds as if this could be a bug in either ipt_NETFLOW or the Linux kernel. Not sure.

Update: This issue is also tracked in the Debian Bug Tracking System as #990123.

xtaran commented 3 years ago

That ref_module issue might be a compile-time option related issue or missing #include statements or such. I though suspect that the fix in 5aae3791922bd3df878605b15e83ea48a4bd096c and ipt_NETFLOW is the right way to fix this, at least in short terms, just with some more version checks for long-term support kernels where this has crept in, too. Not sure if that suffices, though. Will dig deeper later today.

aabc commented 3 years ago

Thanks. Yes, seems it needs some auto-detection checks.

xtaran commented 3 years ago

That ref_module issue might be a compile-time option related issue or missing #include statements or such. I though suspect that the fix in 5aae379 and ipt_NETFLOW is the right way to fix this

Nope, I was wrong here. As Salvatore Bonaccorso (@carnil) of the Debian Kernel Team pointed out (Thanks!) the actual fix is adfc631816ea690cbf53c03a9f40b6c4c5be0a21 for kernel 5.9. So it's actually the resurfacing of #153, just with an older kernel version.

And I forgot that the Debian package of 2.5.1 already has that fix, but it so far is not enabled for kernels below 5.9, so that's the reason why the 2.5.1 Debian package didn't work with that kernel, despite I cherry-picked that patch.

For the Debian package, I will try to backport that fix to ipt_NETFLOW 2.3 and update the conditional. It though might be a good idea to expand that conditional generally for those kernel versions where https://github.com/torvalds/linux/commit/7ef5264de773279b9f23b6cc8afb5addb30e970b got backported. I'm though so far only aware of the 4.19 line of stable kernels.

aabc commented 3 years ago

So, I don't change anything?

xtaran commented 3 years ago

So, I don't change anything?

The following sentence might have been hidden inmidst a paragraph of mine. I think this should change for ipt_NETFLOW generally, not just in the Debian package:

It though might be a good idea to expand that conditional generally for those kernel versions where torvalds/linux@7ef5264 got backported.

Here I refered to this #if:

#if LINUX_VERSION_CODE >= KERNEL_VERSION(5,9,0)

I imagine something like this (syntax might not be correct, i.e. untested):

#if LINUX_VERSION_CODE >= KERNEL_VERSION(5,9,0) || ( LINUX_VERSION_CODE >= KERNEL_VERSION(4,19,191) && LINUX_VERSION_CODE < KERNEL_VERSION(4,10,0) 

I will do that for the Debian package in Debian stable, too, but I'll focus on those kernel versions provided by Debian. So my patch will catch some kernel versions where this got backported to, but maybe not all of them. I'm also not sure how to figure out to which kernel lines this got or gets backported.

carnil commented 3 years ago

Hi

So, I don't change anything?

The following sentence might have been hidden inmidst a paragraph of mine. I think this should change for ipt_NETFLOW generally, not just in the Debian package:

It though might be a good idea to expand that conditional generally for those kernel versions where torvalds/linux@7ef5264 got backported.

Here I refered to this #if:

#if LINUX_VERSION_CODE >= KERNEL_VERSION(5,9,0)

I imagine something like this (syntax might not be correct, i.e. untested):

#if LINUX_VERSION_CODE >= KERNEL_VERSION(5,9,0) || ( LINUX_VERSION_CODE >= KERNEL_VERSION(4,19,191) && LINUX_VERSION_CODE < KERNEL_VERSION(4,10,0) 

I will do that for the Debian package in Debian stable, too, but I'll focus on those kernel versions provided by Debian. So my patch will catch some kernel versions where this got backported to, but maybe not all of them. I'm also not sure how to figure out to which kernel lines this got or gets backported.

There where (so far) the following stable series which contained the change (plus mailine in 5.9-rc1):

v4.14.233: 52d03d9947ad805651b5fa5289f75a387756f36b modules: mark ref_module static v4.19.191: 8745aa4e018e4159f8c49d9689251f51fd0a15e0 modules: mark ref_module static v5.4.118: 6e38daf2e5db63611fa52ba57b9089d3be1a345d modules: mark ref_module static v5.9-rc1: 7ef5264de773279b9f23b6cc8afb5addb30e970b modules: mark ref_module static

aabc commented 3 years ago

Thanks, I'm going to fix this ASAP!

aabc commented 3 years ago

I just pushed commit that should fix this issue. Would be grateful for testing it. Thanks.

xtaran commented 3 years ago

Thanks @carnil and @aabc!

Was just starting to extend adfc631 with a more complex #if, but @aabc's solution in 352cdb2 seems to be a much simpler fix. Will test that one and otherwise propose a different one. (Will probably need to merge adfc631 and 352cdb2 into a single patch for patching 2.3 in Debian 10 Buster (current stable), but will also try to test git HEAD.)

xtaran commented 3 years ago

Was just starting to extend adfc631 with a more complex #if, but @aabc's solution in 352cdb2 seems to be a much simpler fix. Will test that one and otherwise propose a different one.

JFTR: I tested these two patches against 2.3 in Debian Stable and they seem to work fine for more than a week now.

aabc commented 3 years ago

Thanks!