ClangBuiltLinux / linux

Linux kernel source tree
Other
241 stars 14 forks source link

CFI: Function type mismatches reported by -Wincompatible-function-pointer-types-strict #1750

Open samitolvanen opened 1 year ago

samitolvanen commented 1 year ago

My work-in-progress -Wincompatible-function-pointer-types-strict Clang warning found ~30 more type mismatches in function pointer assignments, which we should look into fixing.

There are roughly the same number of warnings for both x86_64 and arm64 allmodconfigs, mostly duplicates, except for an arch-specific driver or two. All the warnings seem to be automatic integer <-> enum conversions, similar to issue #1703, each of which could potentially trip CFI checking.

cc @kees @nathanchance @GustavoARSilva

nathanchance commented 1 year ago

That does not seem too bad. I can probably take a look at all of those and send patches for the ones that do not have any outstanding patches, as I think some of those should have been caught in #1703.

nathanchance commented 1 year ago

I have written fixes for all the warnings that I could find with my build testing framework. I just need to add commit messages and fully vet the counter changes, as that one was particularly unfortunate for how large the fix was and how much type information was stripped from the callback functions:

https://gist.github.com/nathanchance/69c392675051e992f8a54c1feaa259ed

Everything else was not bad in my opinion. I pinged a couple patches that appeared to be left behind from Huck's initial blast of patches for #1703 that would be relevant for this warning so that I do not duplicate existing work. I should be able to get everything sent out today or tomorrow.

I would like to get this warning enabled for 6.2 if possible, as it would make catching potential CFI failures more reliable (unfortunately, it would not catch a lot of the sysfs ones that I have found but that is going to be nearly impossible at compile time) and the Intel folks upgrade the tip of tree toolchain weekly so we would get quick access to build testing various patches. In order to guarantee that in a reliable manner, it would make most sense for someone like @kees or Andrew Morton to carry the fixes and the enablement patch in the same tree but it might be early enough in the cycle to get the patches all picked up before the merge window then enable the warning before 6.2-rc1.

kees commented 1 year ago

Yeah, I'm happy to carry whatever. The tradition has been "if someone else does a review/ack but the maintainer ignores it, either Gustavo or I will take it via our tree."

nathanchance commented 1 year ago
nathanchance commented 1 year ago

Good thing I decided to start testing this against -next...

drivers/net/ethernet/renesas/rswitch.c:1533:20: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)')
with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
        .ndo_start_xmit = rswitch_start_xmit,
                          ^~~~~~~~~~~~~~~~~~
1 error generated.

I have added it to the list above.

samitolvanen commented 1 year ago

This issue has become a bit difficult to follow with the ~6k references. Is the drivers/counter issue the only one where a fix hasn't been accepted?

nathanchance commented 1 year ago

This issue has become a bit difficult to follow with the ~6k references.

Ugh, I forgot a lot of these patches got backported to the stable kernels :(

Is the drivers/counter issue the only one where a fix hasn't been accepted?

Yes, I sent an initial revision that the maintainer did not like, same with Kees's follow up suggestion:

https://lore.kernel.org/Y2LhXqZgOAxL47AT@fedora/ https://lore.kernel.org/Y2M3sGLdrL3uHU8X@fedora/

I was unable to come up with a better fix then got distracted with other fires, so those instances are still outstanding but I have not seen anything new come down in -next. I suppose we could disable the counter subsystem with CONFIG_CFI_CLANG for the time being but that feels like a heavy hammer that is unlikely to be accepted.

samitolvanen commented 1 year ago

I was unable to come up with a better fix then got distracted with other fires, so those instances are still outstanding but I have not seen anything new come down in -next.

Hmm, I wonder if we could use a union as the callback argument...?

I suppose we could disable the counter subsystem with CONFIG_CFI_CLANG for the time being but that feels like a heavy hammer that is unlikely to be accepted.

If we can't come up with a reasonable solution, it would probably be better to just add __nocfi to the functions that perform these calls instead.

nickdesaulniers commented 1 year ago

I'm seeing a new instance of this reported by kernelci:

    3    drivers/block/zram/zram_drv.c:1234:23: error: passing argument 1 of ‘atomic64_read’ from incompatible pointer type [-Werror=incompatible-pointer-types]
    2    drivers/block/zram/zram_drv.c:1234:23: error: incompatible pointer types passing 'atomic_long_t *' (aka 'atomic_t *') to parameter of type 'const atomic64_t *' [-Werror,-Wincompatible-pointer-types]
nathanchance commented 1 year ago

Not quite the same as this warning (not function pointer related nor has -strict), there is a fix on the list since that warning happens with GCC: https://lore.kernel.org/20230225121523.3288544-1-geert+renesas@glider.be/

nathanchance commented 11 months ago

I've sent a patch to at least enable this in W=1, as I am still finding instances crop up every so often:

https://lore.kernel.org/20231002-enable-wincompatible-function-pointer-types-strict-w-1-v1-1-808ab955d42d@kernel.org/

I have tried a couple times to understand what William is looking for from his feedback on my series (https://lore.kernel.org/all/Y2LhXqZgOAxL47AT@fedora/, https://lore.kernel.org/all/Y2M3sGLdrL3uHU8X@fedora/) to fix the instances in drivers/counter but I just cannot really wrap my head around it.