jethrogb / rust-cexpr

A C expression parser and evaluator
Apache License 2.0

Provide support for C functional macros #3

Open Slabity opened 7 years ago

Slabity commented 7 years ago

Currently, cexpr cannot evaluate values whose definitions rely on functional macros. It would be very useful if cexpr supported them.

jethrogb commented 7 years ago

cexpr support is almost trivial, assuming these functional macros evaluate into proper expressions. If this is useful I can implement it. Note however that you must indicate to cexpr whether you'd be parsing a functional macro or a regular macro, because there is no way to distinguish between the two at the token level.

Slabity commented 7 years ago

> there is no way to distinguish between the two at the token level.

How so?

jethrogb commented 7 years ago

Both of

m(a) -1
m (a) -1

are

[
  Identifier("m"),
  Punctuation("("),
  Identifier("a"),
  Punctuation(")"),
  Punctuation("-"),
  Literal("1")
]

Slabity commented 7 years ago

Could there be a Whitespace token to differentiate the two?

jethrogb commented 7 years ago

Possibly. But you need a lexer that generates those tokens. Clang does not.
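
To make the ambiguity concrete, here is a minimal sketch of a whitespace-discarding lexer. It is a toy written for this illustration (the `Token` enum and `lex` function are hypothetical, not Clang's or cexpr's API), but it shows why `m(a) -1` and `m (a) -1` become indistinguishable once whitespace is gone, even though C treats only the first as a function-like macro definition.

```rust
/// Token kinds mirroring the listing above (illustrative names only).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Identifier(String),
    Punctuation(char),
    Literal(String),
}

/// A toy lexer that drops whitespace, as the token stream discussed here does.
fn lex(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            // Whitespace is discarded, so its position is lost for good.
            chars.next();
        } else if c.is_ascii_alphabetic() || c == '_' {
            let mut ident = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_ascii_alphanumeric() || d == '_' {
                    ident.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Identifier(ident));
        } else if c.is_ascii_digit() {
            let mut lit = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_ascii_alphanumeric() {
                    lit.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Literal(lit));
        } else {
            // Everything else is treated as single-character punctuation.
            tokens.push(Token::Punctuation(c));
            chars.next();
        }
    }
    tokens
}
```

With this lexer, `lex("m(a) -1") == lex("m (a) -1")` holds, which is exactly the problem: the information that separates the two definitions never reaches the parser.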

emilio commented 7 years ago

Hi, @jethrogb, would it be fine to give you a boolean that tells whether the current macro definition is function-like?

Clang has an API for that (clang_Cursor_isMacroFunctionLike), and it'd solve some bindgen issues like https://github.com/servo/rust-bindgen/issues/753

jethrogb commented 7 years ago

Yes that would work. cexpr just needs to know whether it's parsing a functional or regular macro. I can probably look at this on Sunday.

Note: clang_Cursor_isMacroFunctionLike was introduced relatively recently and is e.g. not available on Ubuntu 16.04.

Also notably, clang_Cursor_Evaluate was introduced in the same commit, but I think it's not as powerful as cexpr.
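
As a rough sketch of how such a flag could be consumed: the function below splits a `#define` token list into name, parameters and replacement list, using a caller-supplied boolean to decide whether a leading parenthesised group is a parameter list. All names here (`Token`, `MacroDef`, `parse_define`) are made up for the example and are not cexpr's actual API. Variadic macros are not handled and comma placement is not validated.

```rust
/// Simplified token type; illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Identifier(String),
    Punctuation(char),
    Literal(String),
}

/// A parsed `#define`: name, parameters (if function-like), replacement list.
#[derive(Debug, PartialEq)]
struct MacroDef {
    name: String,
    params: Option<Vec<String>>,
    body: Vec<Token>,
}

/// Splits a `#define` token list according to the caller-supplied flag.
/// Returns `None` when the tokens do not have the expected shape.
fn parse_define(tokens: &[Token], is_function_like: bool) -> Option<MacroDef> {
    let (name, rest) = match tokens.split_first()? {
        (Token::Identifier(n), rest) => (n.clone(), rest),
        _ => return None,
    };
    if !is_function_like {
        // Object-like: everything after the name is the replacement list.
        return Some(MacroDef { name, params: None, body: rest.to_vec() });
    }
    // Function-like: expect `(`, identifiers separated by commas, then `)`.
    if rest.first() != Some(&Token::Punctuation('(')) {
        return None;
    }
    let close = rest.iter().position(|t| *t == Token::Punctuation(')'))?;
    let mut params = Vec::new();
    for t in &rest[1..close] {
        match t {
            Token::Identifier(p) => params.push(p.clone()),
            Token::Punctuation(',') => {}
            _ => return None,
        }
    }
    Some(MacroDef {
        name,
        params: Some(params),
        body: rest[close + 1..].to_vec(),
    })
}
```

Fed the six tokens from the earlier listing, the flag alone decides the outcome: with `true` the result has one parameter `a` and a two-token body, with `false` it has no parameters and a five-token body.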

emilio commented 7 years ago

Right, both are libclang 3.9+. But nicely enough bindgen loads libclang at runtime, so it can detect which functions are available, and it'd be nice to start using them when they are.

nicokoch commented 7 years ago

Any update on this? :)

jethrogb commented 7 years ago

It was too hot to do any work ;) Next weekend... Or feel free to pick it up yourself before then.

Slabity commented 7 years ago

@jethrogb Any update?

jethrogb commented 7 years ago

Was going to do this this morning, but got distracted by 0ea1367 and 4fdd26b

jethrogb commented 7 years ago

Parsing functional macro declarations is now supported in fe05507.

I suppose you'd also like to evaluate them if they're used as part of another macro definition? Because of token-pasting, stringizing and non-reentrant evaluation, that's not super easy. References:

https://www.mirbsd.org/htman/i386/manINFO/cppinternals.html
http://port70.net/~nsz/c/c89/c89-draft.html#3.8.3
https://superb-dca2.dl.sourceforge.net/project/mcpp/mcpp/V.2.7.2/mcpp-summary-272.pdf
http://c0x.coding-guidelines.com/6.10.3.pdf
http://c0x.coding-guidelines.com/6.10.3.1.pdf
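
To illustrate the non-reentrant part only, here is a small sketch under simplifying assumptions: tokens are plain strings, only object-like macros exist, and a macro that is currently being expanded is left alone when its name reappears. The `expand` function and its `active` set are hypothetical helpers for this example. A conforming preprocessor is subtler (rescanning together with the following tokens, function-like macros, `#` and `##`), so treat this as a picture of the idea rather than an implementation.

```rust
use std::collections::{HashMap, HashSet};

/// Expands object-like macros in `tokens`. A macro whose expansion is already
/// in progress (its name is in `active`) is emitted unchanged, which is what
/// keeps mutually recursive definitions from looping forever.
fn expand(
    tokens: &[String],
    macros: &HashMap<String, Vec<String>>,
    active: &mut HashSet<String>,
) -> Vec<String> {
    let mut out = Vec::new();
    for tok in tokens {
        match macros.get(tok) {
            Some(body) if !active.contains(tok) => {
                // Mark the macro as in progress while its body is rescanned.
                active.insert(tok.clone());
                out.extend(expand(body, macros, active));
                active.remove(tok);
            }
            // Not a macro, or a macro that is already being expanded.
            _ => out.push(tok.clone()),
        }
    }
    out
}
```

With `#define A B + 1` and `#define B A * 2`, expanding `A` this way yields the tokens `A * 2 + 1`: the inner `A` is left as-is because `A` is still being expanded.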

emilio commented 7 years ago

Awesome!

Yeah, I think having support to resolve a macro inside another macro is what can make it useful for Bindgen... Otherwise there isn't much utility in it... I think the most usual request is for simple macros that resolve to numeric types, for which there shouldn't be any token-pasting issue I guess? Something like:

#define MY_FLAG(i) (1 << (i))

#define MY_VAR_1 MY_FLAG(1)
#define MY_VAR_2 MY_FLAG(2)
#define MY_VAR_3 MY_FLAG(3)

Supporting that would be quite nice IMO.
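
For this simple case, the expansion step amounts to parameter substitution. The sketch below assumes string tokens and a hypothetical `FnMacro` type, and it skips what a real preprocessor does before substitution (fully expanding each argument, handling `#` and `##`). Its output is the kind of token list that would then be handed to an expression evaluator such as cexpr.

```rust
/// A function-like macro: parameter names plus replacement list, as strings.
/// Hypothetical type for this example.
struct FnMacro {
    params: Vec<&'static str>,
    body: Vec<&'static str>,
}

/// Replaces each parameter in the replacement list with the tokens of the
/// matching argument. Returns `None` if the argument count does not match.
fn substitute(mac: &FnMacro, args: &[&[&str]]) -> Option<Vec<String>> {
    if args.len() != mac.params.len() {
        return None;
    }
    let mut out = Vec::new();
    for tok in &mac.body {
        match mac.params.iter().position(|p| p == tok) {
            // Parameter: splice in the argument's tokens.
            Some(i) => out.extend(args[i].iter().map(|s| s.to_string())),
            // Anything else is copied through unchanged.
            None => out.push(tok.to_string()),
        }
    }
    Some(out)
}
```

With `MY_FLAG` described as `params: vec!["i"]` and `body: vec!["(", "1", "<<", "(", "i", ")", ")"]`, substituting the single argument `2` gives the tokens `( 1 << ( 2 ) )`, which an evaluator can reduce to 4.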

jethrogb commented 7 years ago

I've investigated this further and I think best results would be obtained with a mostly-implemented mostly-compliant C preprocessor. The preprocessor is completely separate from the expression evaluation implemented by this crate. The preprocessor would basically take a macro replacement list and fully expand the list of tokens which could then be evaluated by cexpr. There are a couple of options:

  1. Figure out how to leverage Clang's preprocessor. I looked into this but the C API doesn't seem to have anything. There's clang_Cursor_Evaluate which could do both jobs but it has some limitations, for example it doesn't distinguish chars from ints. The C++ API might be useful according to this StackOverflow post but that's an unstable API with no dynamic libraries.
  2. Incorporate a stand-alone preprocessor such as mcpp.
  3. Write one in Rust.

photoszzt commented 7 years ago

Would this be useful?

https://stackoverflow.com/questions/13881506/retrieve-information-about-pre-processor-directives
https://stackoverflow.com/questions/10113586/how-can-i-parse-macros-in-c-code-using-clang-as-the-parser-and-python-as-the

jethrogb commented 7 years ago

No. We already do that.

photoszzt commented 7 years ago

Directly calling clang -E isn't an option?

Slabity commented 6 years ago

Any progress on this? It's still impossible to use the _IO family of macros.

I used to be able to hack around this by rebinding them as constants, but this no longer works for some reason.
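
For anyone needing a stopgap in the meantime, the `_IO` family can be re-implemented by hand on the Rust side. The sketch below assumes the generic Linux layout from asm-generic/ioctl.h (8 bits number, 8 bits type, 14 bits size, 2 bits direction); some architectures use different widths and direction values, so check the target's headers before relying on it. The function names are made up for this example, and the size is not checked against the 14-bit field.

```rust
// Field widths and shifts, assuming the asm-generic/ioctl.h layout.
const IOC_NRBITS: u32 = 8;
const IOC_TYPEBITS: u32 = 8;
const IOC_SIZEBITS: u32 = 14;

const IOC_NRSHIFT: u32 = 0;
const IOC_TYPESHIFT: u32 = IOC_NRSHIFT + IOC_NRBITS;
const IOC_SIZESHIFT: u32 = IOC_TYPESHIFT + IOC_TYPEBITS;
const IOC_DIRSHIFT: u32 = IOC_SIZESHIFT + IOC_SIZEBITS;

// Direction values, again for the generic layout.
const IOC_NONE: u32 = 0;
const IOC_WRITE: u32 = 1;
const IOC_READ: u32 = 2;

/// Equivalent of `_IOC(dir, type, nr, size)`.
const fn ioc(dir: u32, ty: u32, nr: u32, size: u32) -> u32 {
    (dir << IOC_DIRSHIFT) | (ty << IOC_TYPESHIFT) | (nr << IOC_NRSHIFT) | (size << IOC_SIZESHIFT)
}

/// Equivalent of `_IO(type, nr)`.
pub const fn io(ty: u8, nr: u8) -> u32 {
    ioc(IOC_NONE, ty as u32, nr as u32, 0)
}

/// Equivalent of `_IOR(type, nr, T)`.
pub const fn ior<T>(ty: u8, nr: u8) -> u32 {
    ioc(IOC_READ, ty as u32, nr as u32, std::mem::size_of::<T>() as u32)
}

/// Equivalent of `_IOW(type, nr, T)`.
pub const fn iow<T>(ty: u8, nr: u8) -> u32 {
    ioc(IOC_WRITE, ty as u32, nr as u32, std::mem::size_of::<T>() as u32)
}
```

Because these are `const fn`, the results can be bound as constants, for example `const REQ: u32 = io(b'T', 1);`, which is 0x5401 under the layout assumed above.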

Slabity commented 6 years ago

Pinging @jethrogb - Anything I can do to help fix this?

jethrogb commented 6 years ago

@Slabity Yes, the next steps are described in https://github.com/jethrogb/rust-cexpr/issues/3#issuecomment-315645257

gnzlbg commented 5 years ago

So an mpi-sys crate needs to expose constants that are implemented slightly differently by the different MPI implementations, for example, see MPI_REQUEST_NULL in:

#define MPI_REQUEST_NULL   ((MPI_Request)0x2c000000)
#define MPI_REQUEST_NULL OMPI_PREDEFINED_GLOBAL(MPI_Request, ompi_request_null)

it would be great if rust-bindgen could import these as consts, instead of as static mut.

pvdrz commented 1 year ago

> I've investigated this further and I think best results would be obtained with a mostly-implemented mostly-compliant C preprocessor. The preprocessor is completely separate from the expression evaluation implemented by this crate. The preprocessor would basically take a macro replacement list and fully expand the list of tokens which could then be evaluated by cexpr. There are a couple of options:
>
> 1. Figure out how to leverage Clang's preprocessor. I looked into this but the C API doesn't seem to have anything. There's clang_Cursor_Evaluate which could do both jobs but it has some limitations, for example it doesn't distinguish `char`s from `int`s. The C++ API might be useful according to [this StackOverflow post](https://stackoverflow.com/questions/39529480/is-there-a-way-to-get-source-code-with-macro-expanded-using-clang-api) but that's an unstable API with no dynamic libraries.
>
> 2. Incorporate a stand-alone preprocessor such as [mcpp](http://mcpp.sourceforge.net/).
>
> 3. Write one in Rust.

:wave: I'm writing a C preprocessor from scratch and I'd like to know what kind of API you have in mind for the expansion. Would you pass the preprocessing directive defining the macro and the tokens where the macro is supposed to be expanded?
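
To make the question concrete, one possible shape for such an API is sketched below, purely as a discussion aid. Every name in it (`Directive`, `MacroExpander`, `SinglePass`) is hypothetical, and the placeholder implementation only handles object-like macros in a single pass without rescanning. The point is the interface: register definitions, then ask for a fully expanded token list that an evaluator such as cexpr could consume.

```rust
use std::collections::HashMap;

/// Hypothetical description of one `#define` directive.
#[derive(Debug, Clone)]
pub struct Directive {
    pub name: String,
    /// `None` for object-like macros, `Some(params)` for function-like ones.
    pub params: Option<Vec<String>>,
    pub replacement: Vec<String>,
}

/// Hypothetical interface between a preprocessor and an expression evaluator.
pub trait MacroExpander {
    /// Registers a macro definition.
    fn define(&mut self, directive: Directive);
    /// Returns `tokens` with the known macros expanded.
    fn expand(&self, tokens: &[String]) -> Vec<String>;
}

/// Placeholder implementation: object-like macros only, one pass, no rescan.
#[derive(Default)]
pub struct SinglePass {
    macros: HashMap<String, Directive>,
}

impl MacroExpander for SinglePass {
    fn define(&mut self, directive: Directive) {
        self.macros.insert(directive.name.clone(), directive);
    }

    fn expand(&self, tokens: &[String]) -> Vec<String> {
        let mut out = Vec::new();
        for tok in tokens {
            match self.macros.get(tok) {
                // Object-like macro: splice in its replacement list.
                Some(d) if d.params.is_none() => out.extend(d.replacement.iter().cloned()),
                // Unknown identifiers and function-like macros pass through.
                _ => out.push(tok.clone()),
            }
        }
        out
    }
}
```

With `FOO` defined as `1 + 2`, expanding the tokens `FOO * 3` through this placeholder gives `1 + 2 * 3`. A real implementation would replace `SinglePass` while keeping a similar boundary.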

photoszzt commented 1 year ago

Before writing a new one, what's missing from mcpp? Is it just hard to use in Rust?

pvdrz commented 1 year ago

Well, you'd have to write Rust bindings for mcpp by hand (or use bindgen, but we would have to see whether bindgen can generate valid Rust bindings for all the items). The project also seems to be unmaintained: the last release was in 2008 and the last update to their SourceForge page was in 2013. It could be fine since it is C99 compliant, but who knows.