well-typed / hs-bindgen

Automatically generate Haskell bindings from C header files

Support macros: constants #44

Open edsko opened 3 months ago

edsko commented 3 months ago

One simple shape of macro that we might want to support is constants, i.e. something like

#define x 1

There are multiple types of constants we might consider

as well as more specialized literals such as compound literals (https://en.wikipedia.org/wiki/C_syntax#Compound_literals).

It's not entirely clear what the type of the generated binding should be, since this value can be used in lots of contexts. Perhaps it would be useful to introduce a Haskell class for interpreting such integer constants:

class FromCIntegerConstant a where
  fromCIntegerConstant :: Integer -> a

with similar classes for other constants. For integers specifically we would have to ensure that the way we parse the textual representation of the Integer matches what the C compiler would do (decimal, binary, octal, hexadecimal, possible size specifiers such as 12lu, ..). Similar remarks apply to string literals: the interpretation of all escape sequences would have to match the C compiler exactly. Ideally we'd therefore use libclang for this.
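
To make the parsing concern concrete, here is a rough sketch of the class together with a hand-written parser for C integer literals. This is only an illustration, not what hs-bindgen does: `parseCInteger`, `parseBody` and `digits` are hypothetical names, the instances shown are just examples, and a real implementation would preferably delegate to libclang as noted above. The sketch is more lenient than C about the case of suffixes (it accepts `lL`, which C rejects) and ignores C23 digit separators.

```haskell
import Data.Char (digitToInt, isDigit, isHexDigit, isOctDigit, toLower)
import Foreign.C.Types (CInt, CUInt, CULong)

-- | The class proposed above.
class FromCIntegerConstant a where
  fromCIntegerConstant :: Integer -> a

-- Example instances (an assumption; the issue does not list any).
-- Note that 'fromInteger' silently wraps out-of-range values.
instance FromCIntegerConstant Integer where fromCIntegerConstant = id
instance FromCIntegerConstant CInt    where fromCIntegerConstant = fromInteger
instance FromCIntegerConstant CUInt   where fromCIntegerConstant = fromInteger
instance FromCIntegerConstant CULong  where fromCIntegerConstant = fromInteger

-- | Hypothetical helper: parse a C integer literal, discarding any
-- size/signedness suffix. Returns 'Nothing' for malformed input.
parseCInteger :: String -> Maybe Integer
parseCInteger str
  | suffix `elem` validSuffixes = parseBody body
  | otherwise                   = Nothing
  where
    -- 'u' and 'l' are not digits in any supported base, so the first
    -- occurrence of either marks the start of the suffix.
    (body, suffix) = break (`elem` "ul") (map toLower str)
    validSuffixes  = ["", "u", "l", "ul", "lu", "ll", "ull", "llu"]

-- | Choose the base from the (lower-cased) prefix.
parseBody :: String -> Maybe Integer
parseBody ('0':'x':ds) = digits 16 isHexDigit ds
parseBody ('0':'b':ds) = digits 2 (`elem` "01") ds   -- C23 / GNU extension
parseBody ('0':ds)     = digits 8 isOctDigit ('0':ds) -- leading 0 means octal
parseBody ds           = digits 10 isDigit ds

-- | Fold a non-empty run of valid digits into a number in the given base.
digits :: Integer -> (Char -> Bool) -> String -> Maybe Integer
digits base ok ds
  | not (null ds) && all ok ds =
      Just (foldl (\acc d -> acc * base + toInteger (digitToInt d)) 0 ds)
  | otherwise = Nothing
```

With this sketch, `fromCIntegerConstant <$> parseCInteger "12lu" :: Maybe CULong` would interpret the literal at the type the use site asks for, while `parseCInteger "08"` is rejected because 8 is not an octal digit.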

edsko commented 3 months ago

For a simple example, see these definitions:

#define TALISE_ARM_ORX1_TX_SEL0_SIGNALID                0x00
#define TALISE_ARM_ORX1_TX_SEL1_SIGNALID                0x01
#define TALISE_ARM_ORX2_TX_SEL0_SIGNALID                0x02
#define TALISE_ARM_ORX2_TX_SEL1_SIGNALID                0x03
#define TALISE_ARM_TXCAL_ENA_SIGNALID                   0x04
#define TALISE_ARM_CAL_UPDATE0_SIGNALID                 0x05
#define TALISE_ARM_CAL_UPDATE1_SIGNALID                 0x06
#define TALISE_ARM_CAL_UPDATE2_SIGNALID                 0x07
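
For illustration only, bindings for constants like these could be generated as overloaded values using the class proposed earlier. The lower-cased naming scheme and the chosen instances below are assumptions made for this sketch (Haskell value names cannot start with an upper-case letter); nothing here is settled.

```haskell
import Foreign.C.Types (CInt, CUChar)

-- | The class proposed earlier in this issue, repeated so the sketch
-- stands on its own.
class FromCIntegerConstant a where
  fromCIntegerConstant :: Integer -> a

-- Example instances (an assumption).
instance FromCIntegerConstant CInt   where fromCIntegerConstant = fromInteger
instance FromCIntegerConstant CUChar where fromCIntegerConstant = fromInteger

-- Hypothetical generated bindings: each macro becomes a value that is
-- polymorphic in its result type, so the use site picks the C type.
talise_ARM_ORX1_TX_SEL0_SIGNALID :: FromCIntegerConstant a => a
talise_ARM_ORX1_TX_SEL0_SIGNALID = fromCIntegerConstant 0x00

talise_ARM_ORX1_TX_SEL1_SIGNALID :: FromCIntegerConstant a => a
talise_ARM_ORX1_TX_SEL1_SIGNALID = fromCIntegerConstant 0x01

talise_ARM_ORX2_TX_SEL0_SIGNALID :: FromCIntegerConstant a => a
talise_ARM_ORX2_TX_SEL0_SIGNALID = fromCIntegerConstant 0x02
```

A caller would then write, say, `talise_ARM_ORX1_TX_SEL1_SIGNALID :: CUChar` when passing the signal ID to a function that takes a `uint8_t`.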

phadej commented 2 months ago

See #135. libclang seems to expand all macros. Maybe we can see them somewhere, but certainly not as C declarations.

edsko commented 2 months ago

I'm looking at the libclang side of this now.

edsko commented 2 months ago

As of #152, we now have that

#define MYFOO 1
#define INCR(x) x + 1

get represented as

DeclMacro $ Macro ["MYFOO", "1"]
DeclMacro $ Macro ["INCR", "(", "x", ")", "x", "+", "1"]

respectively. It seems we cannot do an awful lot better with current libclang, but perhaps this is good enough. I haven't yet experimented with what happens if you have comments inside your macro. The post at https://discourse.llvm.org/t/extracting-macro-information-using-libclang-the-c-interface-to-clang/34476/4 takes the same approach we do (essentially just using clang_tokenize), but then uses clang_getTokenKind to filter out some tokens; perhaps we will need to do the same.
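
As a rough sketch of what could be done with such token lists, the code below drops comment tokens and then splits a macro into object-like and function-like forms. All types and names here (`TokenKind`, `Token`, `ParsedMacro`, `parseMacro`, ...) are hypothetical and are not the types from #152; `TokenKind` merely mirrors the constructors of libclang's `CXTokenKind`. One caveat: a bare token list loses whitespace, so `#define FOO (x)` (object-like) cannot be told apart from `#define FOO(x)` (function-like); the sketch assumes the latter. libclang's `clang_Cursor_isMacroFunctionLike` might help with that, but I have not checked how it behaves here.

```haskell
import Data.Char (isAlpha, isAlphaNum)

-- | Hypothetical mirror of libclang's CXTokenKind.
data TokenKind = Punctuation | Keyword | Identifier | Literal | Comment
  deriving (Eq, Show)

data Token = Token { tokenKind :: TokenKind, tokenText :: String }
  deriving (Eq, Show)

-- | Hypothetical, simplified macro shape.
data ParsedMacro
  = ObjectLike String [String]             -- ^ name, body tokens
  | FunctionLike String [String] [String]  -- ^ name, parameters, body tokens
  deriving (Eq, Show)

-- | Filter on token kind, as in the linked post.
dropComments :: [Token] -> [Token]
dropComments = filter ((/= Comment) . tokenKind)

-- | Split a macro's tokens into name, optional parameter list, and body.
-- Falls back to 'ObjectLike' when no well-formed parameter list follows
-- the name (e.g. @#define FOO (1 + 2)@).
parseMacro :: [Token] -> Maybe ParsedMacro
parseMacro toks =
  case map tokenText (dropComments toks) of
    name : "(" : rest
      | (params, ")" : body) <- break (== ")") rest
      , Just ps <- parseParams params
      -> Just (FunctionLike name ps body)
    name : body -> Just (ObjectLike name body)
    []          -> Nothing

-- | Parse a comma-separated list of identifiers (variadics not handled).
parseParams :: [String] -> Maybe [String]
parseParams [] = Just []
parseParams ps = go ps
  where
    go [p] | isIdent p              = Just [p]
    go (p : "," : rest) | isIdent p = (p :) <$> go rest
    go _                            = Nothing

isIdent :: String -> Bool
isIdent (c : cs) = (isAlpha c || c == '_') && all (\x -> isAlphaNum x || x == '_') cs
isIdent []       = False
```

On the two examples above, `MYFOO` would come out as `ObjectLike "MYFOO" ["1"]` and `INCR` as `FunctionLike "INCR" ["x"] ["x", "+", "1"]`.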

edsko commented 1 month ago

Self-assigning because I am working on the parser for macros.