grain-lang / grain

The Grain compiler toolchain and CLI. Home of the modern web staple. 🌾
https://grain-lang.org/
GNU Lesser General Public License v3.0
3.25k stars 114 forks

Regex Syntax #1838

Open spotandjake opened 1 year ago

spotandjake commented 1 year ago

Currently, to use a regex you have to use the Regex library to compile your regular expression at runtime, which returns a result. It would be a nicer user experience if we had syntax for regex and moved that validation step to compile time. This would also help lower the possibility of a runtime bug.

renatoalencar commented 10 months ago

I'm thinking about building this one. Is the regex syntax compatible with OCaml's? I think that for validation we could start with a simpler syntax and implementation and iterate on that.

Here are my ideas:

ospencer commented 8 months ago

Hey @renatoalencar! Apologies for the delayed response. I'm not sure if the regex syntax is completely compatible with OCaml's; I'd need to look into it a little. We use Perl-style regexes.

Syntax-wise I like the idea of starting with r and using slashes. Another idea: we currently use the b prefix on strings for Bytes, like b"foo". It might work to use r in the same way, like r"foo". The downside of this syntax would be that it'd be confusing for people who have seen that exact syntax used for raw strings in other languages.
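To make the slash-delimited option concrete, here is a minimal sketch of how a lexer might read an r/.../ literal. It is written in Python purely for illustration and is not the Grain lexer; the function name `read_regex_literal`, the `\/` escape for a literal slash, and the rule that a literal cannot span lines are all assumptions.

```python
def read_regex_literal(src: str, start: int) -> tuple:
    """Read a hypothetical r/.../ literal beginning at src[start].

    Returns (pattern, index just past the closing slash).
    """
    if not src.startswith("r/", start):
        raise ValueError("expected a regex literal starting with r/")
    i = start + 2
    chars = []
    while i < len(src):
        ch = src[i]
        if ch == "\\" and i + 1 < len(src):
            nxt = src[i + 1]
            # Assumption: `\/` is an escaped delimiter. Every other escape
            # is kept intact so the regex engine still sees it (e.g. `\d`).
            chars.append("/" if nxt == "/" else ch + nxt)
            i += 2
        elif ch == "/":
            # Unescaped slash closes the literal.
            return "".join(chars), i + 1
        elif ch == "\n":
            # Assumption: a literal may not span lines.
            break
        else:
            chars.append(ch)
            i += 1
    raise ValueError("unterminated regex literal")
```

An r"foo" form would avoid the delimiter escape entirely, since the existing string lexing could be reused, at the cost of the raw-string confusion mentioned above.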

Ideally, the compiler would see the regex and then desugar it into the automaton directly, but for phase 1 I think it's fine to just introduce a dependency on the Regex module and make a call to Regex.make() like you mentioned. In reality, I'd move the core of that functionality to runtime so we don't need to make a ton of changes to the compiler.
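As a rough illustration of the phase 1 idea, the desugaring can be pictured as a small AST rewrite that turns each regex literal into a call to Regex.make. The sketch below uses Python dataclasses as a stand-in for the compiler's AST; the node types `RegexLiteral`, `StringLiteral`, and `Call` are hypothetical and do not correspond to the real Grain parsetree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegexLiteral:
    pattern: str


@dataclass(frozen=True)
class StringLiteral:
    value: str


@dataclass(frozen=True)
class Call:
    callee: str
    args: tuple


def desugar(node):
    """Rewrite regex literals into Regex.make calls; leave other nodes alone."""
    if isinstance(node, RegexLiteral):
        # The literal becomes an ordinary call, so later compiler passes
        # need no special knowledge of regex syntax.
        return Call("Regex.make", (StringLiteral(node.pattern),))
    if isinstance(node, Call):
        # Recurse so literals nested in arguments are rewritten too.
        return Call(node.callee, tuple(desugar(arg) for arg in node.args))
    return node
```

With this shape, the only compiler change is in parsing plus one rewrite pass; the resulting call still returns a result at runtime, exactly as a hand-written Regex.make call would.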

I'll also throw out that part of me was waiting until we finally implemented macros... then when you wanted to use a regex you would just import the macro and do re`foo` (or whatever our macro invocation syntax ended up being). I think that doing it the way you mentioned is a perfectly reasonable approach for now, though.

spotandjake commented 8 months ago

I was looking into this a bit earlier, and I think calling Regex.make isn't a good approach: with a static regex, it would be preferable not to have to unwrap the result. What we can do, though, is move the enum ParsedRegularExpression into the runtime and just generate the enum. The hard part will really be validation, but I don't think we can provide a good developer experience without prevalidating anyway. Another possible approach would be to add a makeUnsafeRegex, which would be Regex.make without the validation step.
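A minimal sketch of the "validate at compile time, skip validation at runtime" split, in Python for illustration only. Python's `re` module stands in for the validator here, and its dialect is not the same as Grain's; `RegexLiteralError`, `lower_regex_literal`, and the placement of makeUnsafeRegex under the Regex module are assumptions, not existing APIs.

```python
import re


class RegexLiteralError(Exception):
    """Stands in for a compile-time diagnostic."""


def lower_regex_literal(pattern: str) -> str:
    """Validate at 'compile time', then emit an unchecked runtime call."""
    try:
        # Stand-in validator: Python's `re`, not Grain's regex parser.
        re.compile(pattern)
    except re.error as err:
        raise RegexLiteralError(f"invalid regex literal: {err}") from err
    # Escape backslashes and double quotes so the pattern survives being
    # embedded in a double-quoted string literal in the emitted source.
    escaped = pattern.replace("\\", "\\\\").replace('"', '\\"')
    # Hypothetical unchecked constructor: no result to unwrap at the call site.
    return f'Regex.makeUnsafeRegex("{escaped}")'
```

Because a bad pattern is rejected before any code is emitted, the unchecked constructor only ever sees patterns that already passed validation, which is what makes dropping the result wrapper reasonable.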

renatoalencar commented 8 months ago

@ospencer I think I'm going to run a few experiments with different syntaxes and see how it goes. But probably r with slashes for now.

@spotandjake I like the idea of doing validation at compile time + makeUnsafeRegex at runtime. Then later move to doing the whole thing at compilation.