Open marcosps opened 2 months ago
hummm, so the function is defined using a macro:
DEFINE_IDTENTRY_RAW(int80_emulation)
{
...
}
Which triggers the "not reliable" source text function definition in clang-extract and then it outputs
__attribute__((noinline)) __attribute__((no_instrument_function)) __attribute__((section(".noinstr.text"))) __attribute__((no_profile_instrument_function)) void int80_emulation(struct pt_regs *regs)
but it had defined:
#define noinline __attribute__((__noinline__))
which now expands __attribute__((noinline))
to __attribute__((__attribute__((noinline))))
, which is nonsense.
We could either improve PrettyPrint to correctly detect such functions and output DEFINE_IDTENTRY_RAW(int80_emulation)
, or change clang AST dump of __attribute__((inline))
to __attribute__((__inline__))
in an attempt to avoid symbol clashing.
Or add an extra pass to PrettyPrint to do a search'n'replace on the text generated by clang's AST dumper to also catch future cases like this.
hummm, so the function is defined using a macro:
DEFINE_IDTENTRY_RAW(int80_emulation) { ... }
Which triggers the "not reliable" source text function definition in clang-extract and then it outputs
__attribute__((noinline)) __attribute__((no_instrument_function)) __attribute__((section(".noinstr.text"))) __attribute__((no_profile_instrument_function)) void int80_emulation(struct pt_regs *regs)
but it had defined:
#define noinline __attribute__((__noinline__))
which now expands
__attribute__((noinline))
to__attribute__((__attribute__((noinline))))
, which is nonsense.
But, in this case, isn't it a problem to have noinline
defined either way? I mean, isn't clang-extract correct to expand to macro in this case, but the code is somehow misleading? On gcc -E
, I get the following output:
__attribute__((__externally_visible__)) __attribute__((__noinline__)) __attribute__((no_instrument_function)) __attribute((__section__(".noinstr.text"))) __attribute__((__no_sanitize_address__)) __attribute__((__no_profile_instrument_function__)) void int80_emulation(struct pt_regs *regs);
And by looking at DEFINE_IDTENTRY_RAW(int80_emulation)
, the expansion ends up being something like the below on upstream kernel:
__visible noinstr void func(struct pt_regs *regs)
And noinstr
is resolved to:
#define noinstr __noinstr_section(".noinstr.text")
Which turns __noinstr_section
into:
#define __noinstr_section(section) \
noinline notrace __attribute((__section__(section))) \
__no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage \
__no_sanitize_memory
And finally noinline
is transformed into:
#define noinline __attribute__((__noinline__))
But I'm not able to find any definition to __attribute__((__noinline__))
or __noinline__
alone. So, if only noinline
is when expanding the macro, where is __attribute__((noinline))
coming from?
We could either improve PrettyPrint to correctly detect such functions and output
DEFINE_IDTENTRY_RAW(int80_emulation)
, or change clang AST dump of__attribute__((inline))
to__attribute__((__inline__))
in an attempt to avoid symbol clashing.Or add an extra pass to PrettyPrint to do a search'n'replace on the text generated by clang's AST dumper to also catch future cases like this.
But I'm not able to find any definition to
__attribute__((__noinline__))
or__noinline__
alone. So, if onlynoinline
is when expanding the macro, where is__attribute__((noinline))
coming from?It comes from clang's AST dump.
__attribute__((__noinline__))
, __noinline__
and noinline
are not macros by default, but rather compiler keywords. The semantic effect of __attribute__((__noinline__))
is the same of __attribute__((noinline))
. Because of that, clang AST dump will dump __attribute__((__noinline__))
as __attribute__((noinline))
.
The problem here is that the kernel source defines a macro #define noinline __attribute__((__noinline__))
, which then will change the semantics of the inline keyword, once it will be replaced with __attribute__((__noinline__))
.
If the kernel had #define __noinline__ __attribute__((noinline))
the problem would be similar, but in this case the output would be correct because clang AST dump always output attribute((noinline)).
That is why I suggested adding an extra pass on PrettyPrint.cpp to detect those cases and change all instances of __attribute__((noinline))
to __attribute__((__noinline__))
when detected. And possibly other cases too.
@giulianobelinassi thanks for the explanation! It seems a good idea to make PrettPrint.ccp do the work then. If we have more cases like this in the future we can work then around, as you suggested.
To reproduce it:
klp-build setup --kdir --data-dir /home/mpdesouza/git/linux/ --name test_int80 --file-funcs arch/x86/entry/common.c int80_emulation --conf CONFIG_X86_64
klp-build extract --name test_int80
And now, an assert is being hit:
clang-extract: ../libcextract/SymbolExternalizer.cpp:934: bool SymbolExternalizer::_Externalize_Symbol(const std::string &, ExternalizationType): Assertion `ids.size() > 0 && "Decl name do not match required identifier?"' failed.
For now we can't extract this function (is this really a function?)
When trying to extract symbols int80_emulation from arch/x86/entry/common.c: