Closed mflatt closed 2 weeks ago
Updated:
- `^` and end-of-line `$`, added `bol` and `eol`
- Changed the `any` pattern to be any character, but `.` is still any except newline
- `enable_newline` and `disable_newline`
- Renamed `lookback` to `lookbehind` (which seems to be the more established term)
- `^` as charset complement, keeping `!` as complement

There's certainly more to do for improving documentation (maybe after splitting the Rhombus manual into guide and reference documents), but I'll merge, and we can take further improvements from there.
This regular expression sublanguage is based on @CooperCorad's BS thesis.
Rendered documentation: https://users.cs.utah.edu/~mflatt/tmp/rhombus-rx/rhombus/regexp.html
Different from Racket:
- The `rx` sublanguage is encoded in shrubbery notation, not a string. There's also a subsublanguage for character sets (i.e., the subsublanguage within `[]`).
- Matching is whole-input by default. To select partial-input matching, use `rx_in` to construct a regexp or use a method like `RX.match_in` instead of `RX.match`.
- The primary syntax for a capture group gives it a name, instead of just a position. These names are bound as variables when `rx` or `rx_in` is used as a binding form, and they serve as symbol keys in a map (within an object) produced by methods like `RX.match`.
- The `rx` form is not provided by just `#lang rhombus`. Instead, use `import rhombus/rx open`, which introduces only a few bindings in the expression and binding spaces but many bindings in the regexp-pattern and character-class spaces.

Inherited from Racket:
This draft includes some changes relative to the version shown at the August 22 meeting (#180), based on feedback in that meeting; thanks to Robby, Alex, and Ben.
The documentation has lots of examples, and you can find more in the test suite: https://github.com/mflatt/rhombus/blob/rx/rhombus/rhombus/tests/rx.rhm
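
For readers more familiar with string-based regexps, the whole-input versus partial-input distinction and the name-keyed capture groups described above can be illustrated by analogy. The sketch below uses Python's `re` module, not the Rhombus `rx` API. The pattern and the helper names `match_whole` and `match_in` are assumptions made up for illustration, loosely mirroring the roles of `RX.match` and `RX.match_in`.

```python
import re

# Hypothetical pattern for illustration only; it does not come from this PR.
# Each capture group has a name rather than just a position.
PATTERN = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")


def match_whole(text):
    """Whole-input matching: succeed only if the pattern covers all of text."""
    m = PATTERN.fullmatch(text)
    return m.groupdict() if m else None


def match_in(text):
    """Partial-input matching: succeed if the pattern occurs anywhere in text."""
    m = PATTERN.search(text)
    return m.groupdict() if m else None


if __name__ == "__main__":
    print(match_whole("2024-08"))
    print(match_whole("on 2024-08-22"))
    print(match_in("on 2024-08-22"))
```

In this analogy, the dictionary returned by `groupdict()` plays roughly the role of the name-keyed map mentioned above. How closely the Rhombus result object corresponds to it is best checked against the rendered documentation.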