haskell / c2hs

c2hs is a pre-processor for Haskell FFI bindings to C libraries
http://hackage.haskell.org/package/c2hs

Block syntax again #138

Closed acowley closed 9 years ago

acowley commented 9 years ago

I've been working around variations of this for a while now,

c2hs: C header contains errors:

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/dispatch/object.h:143: (column 15) [ERROR]  >>> Syntax error !
  The symbol `^' does not fit here.

It happens when building the OpenCL package from Hackage. It can be avoided by editing that file to remove the offending character while installing the package, then putting it back so as not to break anything. Is there anything we can do in either c2hs or OpenCL to avoid this?

ian-ross commented 9 years ago

@acowley This isn't something that's really fixable from within C2HS. The problem is just that the C language parser in the language-c package doesn't know about these constructs -- it's explicitly advertised as supporting "C99 with some GNU extensions". To deal with this properly, we'd need to switch over to using one of the other C parsing packages, probably Geoffrey Mainland's language-c-quote. There are two problems with doing that: first, the C AST is different between the language-c and language-c-quote packages, which would mean changes everywhere in C2HS, and second, the language-c package has a number of analysis capabilities that C2HS makes extensive use of, capabilities that aren't available in the other C parser packages and would somehow need to be ported to work with the other parsers (the best suggestion for doing this is to have some sort of thinnish AST conversion layer so that you could parse with language-c-quote, convert the AST to the form produced by language-c and run the analysis passes on that: kind of nasty).

So, in order to fix C2HS to work with "variant" C headers, we'd need to deal with this mess one way or another. I can't think of a quick or non-painful way to do it, and Manuel and I have kind of agreed to leave it for now in the hope that someone with a burning need to support C2HS-based libraries on OS X would deal with it all...

If you'd like to be that gullible lucky person, please go ahead and fix all our problems. Realistically though, it's a big and fairly thankless job. If you have any other ideas about how we might deal with this, I'd be interested to hear them. The only other idea I had (also pretty horrible) is to do a "cleanup" pass on C headers before the current parser used by C2HS gets to see them -- parse the headers using language-c-quote, excise any non-"C99 plus some GNU extensions" bits from the resulting AST, pretty-print the result and let C2HS think that that's the "real" header file. I don't know if that would really work, and it seems like it would become increasingly Rube Goldbergish over time as more and more divergences between language-c and language-c-quote had to be dealt with.

acowley commented 9 years ago

My "solution" was going to be a sed-like pass that would turn a block pointer type into a function pointer type, but yours is arguably more principled.

I appreciate the problem we have here, but it seems rather hopeless. I can look into how the OpenCL package is using c2hs to see if it would be easier to move it to a bespoke solution, or if some kind of hacky wrapper around c2hs makes more sense.

ian-ross commented 9 years ago

So, were you thinking that just dealing with the blocks thing would be enough? Are there any other significant syntactic extensions that have been introduced in the Apple compilers? (I am almost 100% ignorant about OS X development.) My "more principled" approach was based on the idea that there might be other extensions that we would need to deal with in the future, but if that's not the case, some simpler approach just to fix the blocks syntax does seem like the easiest route to take.

acowley commented 9 years ago

It's literally one line with a typedef for a block pointer that gums up the OpenCL package installation on OS X. Addressing it with a string substitution is a defensive maneuver that won't scale too far, but it's far and away the easiest thing to do in the short term.

The line is this,

typedef void (^dispatch_block_t)(void);

My usual hand edit is to just delete the caret, but I suppose replacing (^ with (* might be closer to the intent of the code if that matters.
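
For illustration, the (^ to (* replacement could be sketched in Haskell roughly as below. This is only a sketch under assumptions: the name blockToFunPtr is hypothetical and is not taken from c2hs or from any fork, and the function rewrites every occurrence of (^ in its input, with no attempt to understand the surrounding C.

```haskell
import Data.List (isPrefixOf)

-- | Replace every occurrence of "(^" with "(*" in the input text.
-- Hypothetical helper for illustration; not the code used in c2hs.
blockToFunPtr :: String -> String
blockToFunPtr s@(c:cs)
  | "(^" `isPrefixOf` s = "(*" ++ blockToFunPtr (drop 2 s)  -- swap the caret for a star
  | otherwise           = c : blockToFunPtr cs              -- copy everything else unchanged
blockToFunPtr []        = []
```

Applied to the line above, this would turn the block pointer typedef into an ordinary function pointer typedef, which a C99 parser should accept.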

ian-ross commented 9 years ago

@acowley OK, I've been looking at doing this "properly", and it's too much work for too little reward. I looked at adding block syntax to language-c, but it would require quite a bit of messing around with the parser to make it work (there's no easy way to distinguish function pointer declarators from other pointer declarators the way the grammar is currently written). If you're happy with the solution you came up with, I'm happy too...

acowley commented 9 years ago

I think you're right to be generally concerned about such a gross hack. This is currently so narrowly focused, I'm tempted to add in more logic to limit its impact. We could, for instance, only run the filter on OS X, and only do it on typedef lines. That would probably make me feel better in that it has less chance of messing things up for anyone else. If it crops up again on OS X, I can sort out another bandage to keep it going. That's not a good state of affairs, but it's better than nothing.

ian-ross commented 9 years ago

Yeah, it's just that fixing it for real is a huge can of worms. You basically need to add the blocks syntax to the language-c parser, which just isn't structured to allow that easily at the moment. So you'd need to refactor the language-c grammar to do it right. I did a very quick experiment where I just duplicated all the pointer declarations using ^ instead of * and that picked up blocks declarations fine, but it also picks up a whole load of stuff that just isn't valid syntax. Making it so that you have a bit of grammar that says "blocks declarations only for function pointers" would take a day or so of staring at the Happy grammar. So it's bandages for now. To be honest, C2HS is kind of hacky in lots of places (I've certainly contributed enough nastiness of my own) and this is just one more of those things. It doesn't have to be pretty, it just has to work.

acowley commented 9 years ago

Do you have any thoughts on how much to narrow the scope of this fix? On my fork, I further restricted the substitution to lines that begin with typedef, and all is well. We could probably also do a darwin #ifdef, but that might be a step too far if this syntax seeps out of the Apple ecosystem.

ian-ross commented 9 years ago

I don't really have any strong feelings about it, to be honest. The typedef idea does sound quite good though, since it will avoid issues with (^ appearing in #defined strings (unlikely, but it's the only real possibility for breakage I could think of). I wouldn't worry about making it Darwin-only.
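
A minimal sketch of the typedef-restricted version, in Haskell, might look like this. The names rewriteTypedefLine and filterHeader are hypothetical and not taken from c2hs or the fork mentioned above. The sketch assumes "begins with typedef" means a strict prefix with no leading whitespace, and that the header text is available as a String.

```haskell
import Data.List (isPrefixOf)

-- | Rewrite "(^" to "(*" only on lines that start with "typedef".
-- Hypothetical helper for illustration; all other lines pass through untouched.
rewriteTypedefLine :: String -> String
rewriteTypedefLine l
  | "typedef" `isPrefixOf` l = go l
  | otherwise                = l
  where
    go s@(c:cs)
      | "(^" `isPrefixOf` s = "(*" ++ go (drop 2 s)
      | otherwise           = c : go cs
    go [] = []

-- | Apply the line filter to a whole header.
-- Note: 'unlines' appends a final newline if the input lacked one.
filterHeader :: String -> String
filterHeader = unlines . map rewriteTypedefLine . lines
```

With this shape, a (^ inside a #defined string is left alone because the line does not start with typedef, which is the breakage case discussed above. A typedef that is indented or split across lines would not be caught, so this stays a narrow bandage rather than a general fix.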