getify closed 1 year ago
On Clojure, instaparse uses Java regexes. I'm less familiar with the ClojureScript port, but it wouldn't surprise me if it uses whatever regexes JavaScript supports, specifically whatever is supported by the Google Closure compiler that ClojureScript relies on.
Thank you. That should have been obvious to me, but I hadn't thought to check.
I just searched through the minified bundle of that web tool (the one built on the ClojureScript port of instaparse), and its usages of JS RegExp indeed do not seem to be passing the unicode flag. So I think that explains my issue.
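For anyone landing here later, a minimal sketch of the difference the flag makes in plain JavaScript/TypeScript. The helper name `matchesCodePoint` and the `\u{1F600}` pattern are only illustrative assumptions; they are not taken from instaparse or the web tool, and the pattern in the linked issue may differ.

```typescript
// Hypothetical helper: compile the same pattern source with or without
// the `u` (unicode) flag and test it against the input.
function matchesCodePoint(input: string, unicodeAware: boolean): boolean {
  // Pattern source is ^\u{1F600}$ (backslash escaped for the string literal).
  const source = "^\\u{1F600}$";
  const re = unicodeAware ? new RegExp(source, "u") : new RegExp(source);
  return re.test(input);
}

// U+1F600 written as its UTF-16 surrogate pair.
const grinningFace = "\uD83D\uDE00";

// With the `u` flag, \u{1F600} is a code point escape.
console.log(matchesCodePoint(grinningFace, true));
// Without it, \u is just the letter "u" and {1F600} is literal text,
// so the emoji does not match but the string "u{1F600}" does.
console.log(matchesCodePoint(grinningFace, false));
console.log(matchesCodePoint("u{1F600}", false));
```

So a pattern written with code point escapes can compile without error and still silently match the wrong thing when the flag is missing.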
Not sure what I'll be able to do to work around this. But appreciate the pointer.
Just curious: does Clojure or instaparse have any facility that could be used to force the underlying regex to be Unicode-aware? I suppose, as you mentioned, that it's entirely up to the compiler (Google Closure).
I can't think of anything beyond whatever ClojureScript does.
I worked around my issue by not using a regex and just including the string literal characters in my productions.
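A rough sketch of why that workaround sidesteps the flag problem, assuming string-literal terminals are matched by plain string comparison rather than through RegExp (I have not confirmed how the ClojureScript port does this internally). `matchLiteral` is a hypothetical stand-in, not an instaparse function.

```typescript
// Hypothetical stand-in for matching a string-literal terminal: compare
// UTF-16 code units at a given offset, with no RegExp involved.
function matchLiteral(input: string, offset: number, literal: string): boolean {
  return input.slice(offset, offset + literal.length) === literal;
}

// U+1F600 written as its UTF-16 surrogate pair (two code units).
const smiley = "\uD83D\uDE00";
const text = "a" + smiley + "b";

// The comparison works on code units, so no unicode flag is needed.
console.log(matchLiteral(text, 1, smiley));
console.log(matchLiteral(text, 0, smiley));
```

The trade-off is that every character has to be listed explicitly in the production instead of being covered by a range.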
I've been using a helpful web tool that uses a ClojureScript port of instaparse. The tool lets me author and test productions in a browser web page (which I've found very convenient, especially when collaborating with others).
Unfortunately, I'm having a problem with some regex needs. I was using a syntax I believe to be valid for Clojure/Java regexes, specifically for specifying Unicode characters by code point.
Details of my issue are here: https://github.com/mdkrajnak/ebnftest/issues/1
I wanted to cross-post here in case you could share some insight (or any links you may have) into what specific regex syntax I need to use. Thank you.