Lambosaurus / py-c-preprocessor

Tools for preprocessing C files in python
MIT License
15 stars 3 forks source link

Added support for variadic parameters + fixed unwanted macro expansion in struct-like field/variable dereferences (with -> or .) with the same name as macro #12

Closed NotYourFox closed 1 month ago

NotYourFox commented 1 month ago

Variadic macros, such as

#define var(...) __VA_ARGS__
var(1, 2 3, "abc") // expands to: 1, 2 3, "abc"

should now be supported.

Macros should now have the following behaviour with struct-like dereferences:

#define test 1
test->field; // not expanded
struc->test; // not expanded
test // 1
(test)->field; // (1)->field;

The code seems to pass old and new tests.

Lambosaurus commented 1 month ago

Hey, thanks for contributing. I'll have a look through this tonight sometime.

I'm not certain about the behavior here regarding expansion of struct like dereferences. I would have expected these values to be replaced as they originally were. I definitely depend on this behavior in my own C libraries. Perhaps you could share the context of why you are making this change?

NotYourFox commented 1 month ago

I was using the tool for my purposes and I had the problem of a struct field conflicting with macro name. This occasion is very specific, but I still thought it would be nice to just fix it, as normal preprocessors shouldn't expand names in such cases (unless in parentheses, like test case 4 shows).

Honestly, I would even advocate for checking if a variable declaration exists with some name to not expand macro in those cases, but that would require parsing the whole C AST, which neither of us are a big fan of, I guess...

Lambosaurus commented 1 month ago

Sorry NotYourFox, This sounds like a problem that should just be resolved by changing your own macros.

The behavior of this python module should be to mimic the C preprocessor as closely as possible - and so the existing behavior remains the correct behavior. Changing it to suit your use case here would just break it for anyone who is using it for C inspection.

I do want to see the varargs change implemented - so I may cherrypick those changes.

EDIT: Just to support my assertion that the existing behavior is correct, try the following code in onlinegdb or similar. You'll find that it prints test.test indicating both replacements occur.

#include <stdio.h>

struct {
    const char * field;
    const char * test;
} struc = {
    .field = "struct.field",
    .test = "struc.test",
};

struct {
    const char * field;
    const char * test;
} test = {
    .field = "test.field",
    .test = "test.test",
};

#define struc test
#define field test

int main()
{
    printf("%s\r\n", struc.field);
    return 0;
}
NotYourFox commented 1 month ago

Well, it seems like you are totally right about standard macros. I have figured that this expansion problem arises with function-like macros, so

#include <stdio.h>

struct {
    const char * field;
    const char * test;
} struc = {
    .field = "struc.field",
    .test = "struc.test",
};

struct {
    const char * field;
    const char * test;
} test = {
    .field = "test.field",
    .test = "test.test",
};

#define struc(x) test

int main()
{
    printf("%s\r\n", struc.field);
    return 0;
}

prints out "struc.field" in the compiler, but the preprocessor tries to expand "struc(x)" macro and throws an exception "Macro struc expects arguments". This shouldn't happen. I guess it is viable and possible to fix it only for this kind of macros, maybe introduce name mangling (in a sort of way which would be illegal for C syntax to mimic) in the expansion time, after checking for duplicate macros?

I will soon create a merge request with only the variadic macros thing involved, but I am looking forward to your response to the above.

Lambosaurus commented 1 month ago

Ah... I see. This clarifies it a bit. Your right, macros with arguments need to be handled differently.

I don't think a complicated solution is required. The issue is with the lines below: https://github.com/Lambosaurus/py-c-preprocessor/blob/29e9a2d0d30b257d85a08dd778337ccc7526bbaf/preprocessor.py#L461-L463

If arg_start cannot be located, then currently we throw (we have found the token, but no argument list). Instead we should just ignore the token, treating it as no match.

If you'd like to take a look at this, please do - otherwise I may take a look at it over the weekend.

Regarding the varargs, please ensure you've got a test cases that include varargs mixed with regular args. Go ahead with this as a separate PR - I'd prefer multiple smaller PR's than one big one.

p.define("MACRO_B", "a, __VA_ARGS__, b", ["a", "b" "..."])
test_assert(p.evaluate('MACRO_B(1, 2, 3, 4)', "1, 3, 4, 2")

Thanks again for contributing.

NotYourFox commented 1 month ago

On point, I've noticed an issue with mixing varargs with regular ones, now it should be fixed.