Closed goodmami closed 3 years ago
The latest ERG makes use of the new 'mask' operator (=) for REPP, as described in the email thread starting here:
http://lists.delph-in.net/archives/developers/2020/003107.html
Essentially, substrings matching a mask pattern are protected from further modification. For example, the following rule masks email addresses so that later punctuation-splitting rules do not break them up:
=<?[\p{L}\p{N}._-]+@[\p{L}\p{N}_-]+(?:\.[\p{L}\p{N}_-]+)*\.[\p{L}\p{N}]+>?
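To illustrate the effect, here is a minimal Python sketch, not REPP itself and not any existing implementation. It assumes the standard-library `re` module, which has no `\p{L}`/`\p{N}` classes, so `\w` and `[^\W_]` are used as approximations. The `PUNCT` rule and the function name `split_punct_outside_masks` are hypothetical, made up for this example.

```python
import re

# Approximation of the mask pattern above for Python's stdlib `re`:
# \w stands in for [\p{L}\p{N}_] and [^\W_] for [\p{L}\p{N}].
EMAIL_MASK = re.compile(r"<?[\w.-]+@[\w-]+(?:\.[\w-]+)*\.[^\W_]+>?")

# Hypothetical later rule: put spaces around periods and commas.
PUNCT = re.compile(r"([.,])")


def split_punct_outside_masks(text):
    """Apply PUNCT only to the stretches of text not covered by a mask."""
    out = []
    pos = 0
    for m in EMAIL_MASK.finditer(text):
        out.append(PUNCT.sub(r" \1 ", text[pos:m.start()]))
        out.append(m.group())  # masked text is copied through unchanged
        pos = m.end()
    out.append(PUNCT.sub(r" \1 ", text[pos:]))
    return "".join(out)


if __name__ == "__main__":
    print(split_punct_outside_masks("Mail kim@example.com, please."))
```

Without the mask, the same `PUNCT` rule would split the address at its final period; with it, only the comma and the sentence-final period are separated.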
Masked sections can be tracked with a BIO (begin/inside/outside) sequence-tagging scheme, so that adjacent masks stay distinct even when content is inserted between them.
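A rough sketch of how such BIO tracking could work, assuming one tag per character kept in a list parallel to the string. The helper names (`apply_mask`, `masked_spans`, `insert_text`) are hypothetical and only serve the illustration; this is not taken from any existing implementation.

```python
import re


def apply_mask(text, tags, pattern):
    """Tag each unmasked match: 'B' on its first character, 'I' on the rest."""
    tags = list(tags)
    for m in re.finditer(pattern, text):
        start, end = m.span()
        if start == end or any(t != "O" for t in tags[start:end]):
            continue  # skip empty matches and text that is already masked
        tags[start] = "B"
        tags[start + 1:end] = ["I"] * (end - start - 1)
    return tags


def masked_spans(tags):
    """Recover (start, end) spans from the tags; every 'B' opens a new span."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag != "I" and start is not None:
            spans.append((start, i))
            start = None
        if tag == "B":
            start = i
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def insert_text(text, tags, pos, new):
    """Insert unmasked text at `pos`, keeping text and tags aligned."""
    return text[:pos] + new + text[pos:], tags[:pos] + ["O"] * len(new) + tags[pos:]


if __name__ == "__main__":
    text = "ab12"
    tags = apply_mask(text, ["O"] * len(text), r"[a-z]+")
    tags = apply_mask(text, tags, r"[0-9]+")
    print(tags, masked_spans(tags))
    text, tags = insert_text(text, tags, 2, " ")
    print(text, tags, masked_spans(tags))
```

A plain masked/unmasked flag per character would merge the two adjacent masks in `ab12` into one span; the `B` tag marks where the second mask begins. Inserting at the boundary between them leaves both spans intact. Inserting inside a mask is not handled here, since masked text is not supposed to be modified.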