Excel regular expression add-in using .NET regex engine and ExcelDNA integration.
input
text parameter: empty input strings do not trigger a #VALUE error anymore, null
still does, though ExcelDNA doesn't seem to pass any nulls.RegexGroupMatches()
RegexMatches()
RegexMatch()
return value when not matched (did not return #NA correctly)IsRegexMatch()
, RegexEscape()
, RegexMatchGroups()
RegexMatch()
and RegexReplace()
ExcelMAZRegez is a simple, fast .NET regular expression library for Excel. As of v1 it's only for Excel formulas inside a worksheet; a later version might implement it for use inside VBA.
It's several orders of magnitude faster than using the VBA scripting library, and it's less convoluted than the few other Excel .NET regex libraries I found out there.
See the .Net regular expression documentation for a full description of regexes, including search and replacement patterns, and how options work.
The formula use and syntax show up in the Excel Intellisense UI, and below for reference:
\*-packed.xll
and ExcelDna.IntelliSense.dll
files from the project repository Releases page \*.xll
add-in file (32 or 64 bit) to your Excel add-ins folder (%APPDATA%\Microsoft\AddIns
on Windows).ExcelDna.IntelliSense.dll
file into the add-ins folder.RegexEscape( text )
Return the input text with special characters escaped. Useful to construct regex patterns with arbitrary text that will not be interpreted for special pattern interpretations, such as when a string includes "[", "$", etc.
RegexMatch( input, pattern [, options [, replacement ] ] )
Finds and returns the text of the first instance of the regular expression pattern inside the input string, optionally modified with the option flags, and optionally with a replacement pattern.
The options
are bit flags (see below), and sould be added up to specify more than one optoin. E.g.: ignore case plus multilines is 3 (1 + 2).
if not specified, the replacement patterns defaults to "$0".
RegexMatches( input, pattern [, options [, replacement ] ] )
Finds all the occurrences of the pattern in the input. Returns delimiter-separated list of matches with optional replacement pattern.
RegexMatch()
, plus...delimiter
: Delimiter for the list of results, default ','IsRegexMatch( input, pattern [, options ] )
RegexMatch()
, except for replacement
, which is not used.TRUE if the pattern is found in the input, FALSE otherwise.
RegexMatchGroups( input, pattern [, options [, MaxMatches [, MaxGroups [, IncludeDuplicates ] ] ] ] )
Search the input for matches of the pattern, return a comma delimited list of matching capture group names/numbers in match order.
Useful to find out what chunks of a regular expression were matched against, without actually caring what the text that matched was. I personally use it a lot to label data, see the examples below.
input
, pattern
, and options
parameters are the same as for RegexMatch()
.MaxMatches
: Maximum number of matches to execute on the input (omit or 0 to return all matches)MaxGroups
: Maximum number of group names or numbers to return for each match (omit or 0 for all groups)IncludeDuplicates
: Default TRUE: Print group names every time they're found in a match. FALSE: Only return the first instance of each capture group.The formula:
=RegexMatchGroups('liliac,red,mauve,green','<?<primary>red|green|blue)(?<artsy>mauve|lilac|haze)'
)artsy,primary,artsy,primary
To only return the first match, you'd use:
=RegexMatchGroups('mauve,red,green','<?<primary>red|green|blue)(?<artsy>mauve|lilac|haze)'
,,1)artsy
To return all matches but not repeat group names, you set IncludeDuplicates=FALSE
:
=RegexMatchGroups('mauve,red,green','<?<primary>red|green|blue)(?<artsy>mauve|lilac|haze)'
,,,,FALSE)artsy,primary
The MaxGroups
sets the max number of groups per match. Handy, for example, when different subexpressions may match the same text, and you only care for the first group that does. For example, see the difference between not using MaxGroups
:
=RegexMatchGroups('mauve,red,green','(?<funky>green|lilac)<?<primary>red|green|blue)(?<artsy>mauve|lilac|haze)'
)artsy,primary,funky,primary
... and setting MaxGroups=1
:
=RegexMatchGroups('mauve,red,green','(?<funky>green|lilac)<?<primary>red|green|blue)(?<artsy>mauve|lilac|haze)'
,,,1)artsy,primary,funky
RegexMatchGroups( input, pattern [, options [, MaxMatches [, MaxGroups [, IncludeDuplicates [, GroupNamesTransformPattern [, GroupNamesTransformReplacement ] ] ] ] ] ] )
Search the input for matches of the pattern, return a comma delimited list of matching capture group names/numbers in capture group order within the pattern.
The difference between RegexMatchGroups()
and RegexGroupMatches()
is that the first reports the results in the order of matches in the input, while the latter reports the results in the order of the search pattern. Besides, as a convenience, it allows to do a match/replace on the returned group names.
input
, pattern
, options
, MaxMatches
, and IncludeDuplicates
parameters are the same as for RegexMatchGroups()
.MaxGroups
: Maximum total number of group names or numbers to return (omit or 0 for all groups).GroupNamesTransformPattern
: Transform group names on output list, regex pattern for searchGroupNamesTransformReplacement
: Transform group names on output list, regex replacement patternTo Do: Include examples from CalIns project showing rule precedence
RegexReplace( input, pattern [, options [, replacement ] ] )
Finds all the instances of the search pattern in the input text, optionally modified with the option flags, replaces them with the replacement pattern, and returns the modifed input. Similar to RegexMatch()
but searching and replacing the entire input, and the replacement
pattern default is an empty string ("").
The parameters, and options for RegexReplace() are the same as for RegexMatch(), except for the replacement
default.