ropensci / unconf15

rOpenSci's San Francisco hackathon/unconf 2015
http://unconf.ropensci.org

PEG implementation for R #14

Open sckott opened 9 years ago

sckott commented 9 years ago

There are parsing expression grammar (PEG) implementations in various languages, including Ruby's treetop.

A PEG implementation in R could be super useful, for me at least. My use case: right now in taxize we use a web API to parse scientific names, which are often very hard to parse. The web API is Ruby-based and uses the treetop gem (the API is GlobalNamesArchitecture/biodiversity). It does a very nice job, but it would be nice to have a native R PEG so that we don't have to depend on a web API.

I imagine there are other use cases.

If this is already done in R, awesome, where's it at?

richfitz commented 9 years ago

If it's not done, then porting boost spirit with Rcpp is one option that could be fun.

sckott commented 9 years ago

Thanks Rich. I'll take a look

hadley commented 9 years ago

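There's going to be a pretty strong mismatch between compile-time boost::spirit and run-time R: a Spirit grammar is fixed when the C++ code is compiled, whereas R users would expect to define grammars at run time. I suspect it would be easier to use boost::spirit to solve this for specific grammars.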

sckott commented 9 years ago

Thanks @hadley - makes sense.

richfitz commented 9 years ago

I guess the decision would then be: either build off the boost one and provide nice interfaces for people to add their own grammars (following the approach of some of the header-only Rcpp libraries, mostly focussing on things like nice utility functions for doing type coercion), or port a PEG from a dynamic language.
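
For a rough sense of what the second option (a PEG built at run time in a dynamic language) involves, here is a minimal sketch in Python rather than R. Everything in it is an assumption made for illustration: the combinator names (`lit`, `rx`, `seq`, `choice`, `opt`), the `parse_name` helper, and the toy grammar, which is far simpler than whatever the biodiversity gem actually uses. It only shows the core PEG ideas: grammars composed from small parser functions, and ordered choice where the first matching alternative wins.

```python
import re

# A parser is a function (text, pos) -> (value, new_pos) on success, None on failure.

def lit(s):
    """Match the literal string s."""
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def rx(pattern):
    """Match a regular expression anchored at the current position."""
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return (m.group(0), m.end()) if m else None
    return parse

def seq(*parsers):
    """Match every parser in order; fail if any one fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def choice(*parsers):
    """PEG ordered choice: the first alternative that matches wins."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def opt(parser):
    """Match the parser if possible; otherwise succeed without consuming input."""
    def parse(text, pos):
        result = parser(text, pos)
        return result if result is not None else (None, pos)
    return parse

# Toy grammar (hypothetical, not the real scientific-name grammar):
#   name <- genus " " epithet (" " author ((", " / " ") year)?)?
genus = rx(r"[A-Z][a-z]+")
epithet = rx(r"[a-z]+")
author = rx(r"[A-Z][A-Za-z.]*")
year = rx(r"\d{4}")
name = seq(
    genus, lit(" "), epithet,
    opt(seq(lit(" "), author, opt(seq(choice(lit(", "), lit(" ")), year)))),
)

def parse_name(text):
    """Parse a whole string with the toy grammar; return a dict or None."""
    result = name(text, 0)
    if result is None or result[1] != len(text):
        return None  # no match, or trailing input left unparsed
    g, _, e, rest = result[0]
    out = {"genus": g, "epithet": e, "author": None, "year": None}
    if rest is not None:
        _, out["author"], yr = rest
        if yr is not None:
            out["year"] = yr[1]
    return out
```

With this sketch, `parse_name("Homo sapiens Linnaeus, 1758")` yields a dict with the genus, epithet, author, and year filled in, while a string the toy grammar does not cover yields `None`. Because the grammar is ordinary data built at run time, users could add their own rules without recompiling anything, which is the trade-off against the boost::spirit route discussed above.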