Closed by Drup 2 years ago
I looked at doing this some time back; if I remember correctly, the main stumbling block was the lack of a uu*-compatible Unicode regexp engine.
One possibility is to take the one inside camomile and retarget it to use uu*.
What kind of operation do we need exactly for zed? A very basic functorized regexp engine (that could even run on ropes directly) is not difficult, if we only need to support a very limited set of features. It would also have wider utility (and might grow over time).
The "difficult" feature (and the source of most of Re's insanity) is efficient handling of capture groups in the context of on-the-fly determinization. :3
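To make the "very basic functorized engine" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `TEXT` signature, the `Make` functor, and all function names are hypothetical and not taken from zed, Re, or camomile. It matches with Brzozowski derivatives, chosen only because they keep the code short, and it deliberately has no capture groups, no determinization, and no simplification of the derived expressions.

```ocaml
(* Hypothetical minimal text interface; a rope module could implement it too. *)
module type TEXT = sig
  type t
  type elt
  val length : t -> int
  val get : t -> int -> elt
  val equal : elt -> elt -> bool
end

module Make (T : TEXT) = struct
  type re =
    | Empty                 (* matches nothing *)
    | Eps                   (* matches the empty sequence *)
    | Char of T.elt         (* matches exactly one given element *)
    | Any                   (* matches any single element *)
    | Seq of re * re
    | Alt of re * re
    | Star of re

  (* Does [r] accept the empty sequence? *)
  let rec nullable = function
    | Empty | Char _ | Any -> false
    | Eps | Star _ -> true
    | Seq (a, b) -> nullable a && nullable b
    | Alt (a, b) -> nullable a || nullable b

  (* Derivative of a regexp with respect to the element [c]. *)
  let rec deriv c = function
    | Empty | Eps -> Empty
    | Char d -> if T.equal c d then Eps else Empty
    | Any -> Eps
    | Seq (a, b) ->
      let left = Seq (deriv c a, b) in
      if nullable a then Alt (left, deriv c b) else left
    | Alt (a, b) -> Alt (deriv c a, deriv c b)
    | Star a as s -> Seq (deriv c a, s)

  (* Length of the longest match starting at [pos], if there is one. *)
  let match_at r text pos =
    let n = T.length text in
    let rec go r i best =
      let best = if nullable r then Some (i - pos) else best in
      if i >= n then best
      else go (deriv (T.get text i) r) (i + 1) best
    in
    go r pos None

  (* Leftmost match as [(start, length)]. *)
  let search r text =
    let n = T.length text in
    let rec scan pos =
      if pos > n then None
      else
        match match_at r text pos with
        | Some len -> Some (pos, len)
        | None -> scan (pos + 1)
    in
    scan 0
end

(* Example instance over an array of Unicode scalar values. *)
module U = Make (struct
  type t = Uchar.t array
  type elt = Uchar.t
  let length = Array.length
  let get = Array.get
  let equal = Uchar.equal
end)

(* Helper for the example only: build a text from an ASCII string. *)
let of_ascii s = Array.init (String.length s) (fun i -> Uchar.of_char s.[i])

(* The regexp "ab*" *)
let ab_star = U.(Seq (Char (Uchar.of_char 'a'), Star (Char (Uchar.of_char 'b'))))
```

With these definitions, `U.search ab_star (of_ascii "xxabbbc")` finds the leftmost match and reports its start and length. The cost is quadratic in the worst case and the derived expressions grow without simplification, so this is a starting point rather than something to ship.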
> What kind of operation do we need exactly for zed?
I forget, but probably not much more than basic search and replace.
> A very basic functorized regexp engine (that could even run on ropes directly) is not difficult, if we only need to support a very limited set of features. It would also have wider utility (and might grow over time).
I agree. By the way, that's exactly the approach taken by camomile (see https://github.com/yoriyuki/Camomile/blob/master/Camomile/public/uRe.mli#L90). As far as I remember, it looked like the code could be taken out of the main library without much difficulty.
IIRC, the only use of regexp inside Zed is for finding word boundaries. The API could just take a general search function instead and it would be up to the user to use regexps if they want to.
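A rough sketch of what "take a general search function" could look like. The names here (`finder`, `default_word_end`, `next_word`) are hypothetical and are not zed's actual API; the text is modelled as a plain `Uchar.t array` rather than a rope, and the default word test is ASCII-only to keep the example short.

```ocaml
(* Hypothetical: a finder returns the index of the next boundary at or
   after the given position, or None if there is none. *)
type finder = Uchar.t array -> int -> int option

(* ASCII-only notion of a word character, for illustration. *)
let is_word_char u =
  let c = Uchar.to_int u in
  (c >= 0x30 && c <= 0x39)      (* 0-9 *)
  || (c >= 0x41 && c <= 0x5A)   (* A-Z *)
  || (c >= 0x61 && c <= 0x7A)   (* a-z *)
  || c = 0x5F                   (* _ *)

(* Default finder: end of the first word found at or after [pos]. *)
let default_word_end : finder = fun text pos ->
  let n = Array.length text in
  let rec skip_non_word i =
    if i < n && not (is_word_char text.(i)) then skip_non_word (i + 1) else i
  in
  let rec skip_word i =
    if i < n && is_word_char text.(i) then skip_word (i + 1) else i
  in
  let start = skip_non_word pos in
  if start >= n then None else Some (skip_word start)

(* The editing action only depends on the finder it is given; a user who
   wants regexps can pass a finder built on any regexp library. *)
let next_word ?(find = default_word_end) text cursor =
  match find text cursor with
  | Some i -> i
  | None -> Array.length text

(* Helper for the example only: build a text from an ASCII string. *)
let text_of_ascii s =
  Array.init (String.length s) (fun i -> Uchar.of_char s.[i])
```

A caller who needs different boundaries passes their own function, for example `next_word ~find:my_finder text cursor`, and the library itself never has to link a regexp engine.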
Closed by #46, I'll make a release with it in the coming days.
In the recent 2.0 release, a number of new modules were added that abstract away some of the utf8/uchar API used in zed. Following this, it would be nice to remove the dependency on camomile and use only the minimal set of uu* libraries needed for zed to work. This would help reduce the size of the dependencies, especially since camomile tends to bloat binaries quite a bit.