Open vincentbernat opened 2 years ago
This is also really easy to do on the user side: see ConstExpr.
What about adding a regexp() builtin? It could be used something like this:
regexp("f.+").FindAllString(str, -1)
As an example, I have a function ClassifySite("something") and a variant ClassifySiteRegex(Exporter.Name, "^([^-]+)-", "$1"). To leverage ConstExpr, I would need the user to wrap the regex into a function, which would be inconvenient.
As for the builtin you propose, I suppose it returns an array. I was hoping for something a little more magic, as an array would mean writing something like this:
ClassifySite(regexp("^([^-]+)-").FindAllString(Exporter.Name)[0])
And what happens if there is no match? I would prefer something like:
Exporter.Name matches "^([^-]+)-" && ClassifySite($1)
But I would understand that you don't like such magic variables as they make the language non-pure.
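For comparison, here is a minimal Go sketch of what the matches ... && ClassifySite($1) form would have to do underneath: run the match once, stop when there is no match, and only then pass the first capture group on. It uses only the standard library; classifySite and classifyByPrefix are hypothetical stand-ins and not part of expr.

```go
package main

import (
	"fmt"
	"regexp"
)

// prefixRe is compiled once at program start, not on every evaluation.
var prefixRe = regexp.MustCompile(`^([^-]+)-`)

// classifySite is a hypothetical stand-in for the ClassifySite function
// mentioned above; here it only tags its input.
func classifySite(s string) string {
	return "site:" + s
}

// classifyByPrefix mimics `name matches "^([^-]+)-" && ClassifySite($1)`.
// It reports false when the pattern does not match, so nothing ever
// indexes into a missing result.
func classifyByPrefix(name string) (string, bool) {
	m := prefixRe.FindStringSubmatch(name)
	if m == nil {
		return "", false
	}
	return classifySite(m[1]), true // m[1] plays the role of $1
}

func main() {
	fmt.Println(classifyByPrefix("th2-edge1"))
	fmt.Println(classifyByPrefix("nodash"))
}
```

The no-match case is handled in one place, which is the main thing the array-returning form leaves to the caller.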
Exporter.Name matches "^([^-]+)-" && ClassifySite($1)
I actually like this idea! Neat! I think it’s understandable what is going on here.
Oh, great! It's how it is in Perl (I think, I don't remember exactly).
Sorry to bump the old thread. Here is a way to do this by extending expr.
import (
	"log"
	"regexp"
	"sync"

	"github.com/expr-lang/expr"
	"github.com/expr-lang/expr/vm"
)

// Declare a global cache for compiled regexes (optional).
// Protect it with a lock, as concurrent reads and writes may happen.
var (
	compiledRegex = make(map[string]*regexp.Regexp)
	mutex         sync.RWMutex
)

// This function may be called concurrently, but its local variables are safe.
// exprString (the expression source) is assumed to be defined elsewhere.
func myFunc() {
	// Build the env map.
	env := make(map[string]interface{})
	// reMatch holds the groups captured by the regex, if any.
	reMatch := make([]string, 0)
	// reFind is a closure with access to env. This access is needed, as you can see below.
	// reFind returns true if the pattern matched, false otherwise.
	reFind := func(input string, pattern string) bool {
		mutex.RLock()
		regex, exists := compiledRegex[pattern]
		mutex.RUnlock()
		if !exists {
			var err error
			regex, err = regexp.Compile(pattern)
			if err != nil {
				log.Println("Regex compile error:", err)
				return false
			}
			mutex.Lock()
			compiledRegex[pattern] = regex
			mutex.Unlock()
		}
		// Store the captured groups, if any.
		matches := regex.FindStringSubmatch(input)
		// Overwrite env["reMatch"] so that the matches are accessible as reMatch[0] etc. inside the expression.
		env["reMatch"] = &matches
		return matches != nil
	}
	// Set the initial empty slice as env["reMatch"]. reFind may overwrite it later.
	env["reMatch"] = &reMatch
	// Map the closure to env["reFind"] so that it is callable as reFind(input, 'regex_pattern') in the expression.
	env["reFind"] = reFind
	// Compile, cache it, and run. Or just run.
	compiled, err := expr.Compile(exprString, expr.Env(env))
	if err != nil {
		log.Println("Expr compile error:", err)
		return
	}
	result, err := vm.Run(compiled, env)
	if err != nil {
		log.Println("Expr run error:", err)
		return
	}
	log.Println("Result:", result)
}
Now the expression can be written like:
"reFind(input_string, '^(..)') ? reMatch[0] : 'unknown'"
reMatch is overwritten once reFind() is called. We may also call reFind() multiple times in the same expression. Access what you want from reMatch soon after each call.
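One detail worth keeping in mind when indexing reMatch: Go's FindStringSubmatch puts the whole match at index 0 and the capture groups from index 1 onward, and returns nil when nothing matches. A small standard-library sketch (the groups helper and the sample inputs are made up for illustration):

```go
package main

import (
	"fmt"
	"regexp"
)

// groups returns the whole match followed by the capture groups,
// or nil when the pattern does not match the input.
func groups(pattern, input string) []string {
	return regexp.MustCompile(pattern).FindStringSubmatch(input)
}

func main() {
	m := groups(`^(..)-(\d+)`, "fr-42-edge")
	fmt.Println(m[0]) // whole match
	fmt.Println(m[1]) // first capture group
	fmt.Println(m[2]) // second capture group

	// No match: the result is nil, so indexing it would panic.
	fmt.Println(groups(`^(..)-`, "nodash") == nil)
}
```

With the pattern '^(..)' used above, reMatch[0] and reMatch[1] happen to hold the same text, because the single group spans the entire match.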
Expr supports variables inside expressions now. They can also be used:
let matches = reFind("…"); matches[0]
TFS, @antonmedv. Where could I have learnt about this? Any non-obvious document or code sample that I should keep an eye on, to know of such updates? Thanks.
I post all changes to https://github.com/expr-lang/expr/releases. But I guess a dedicated blog post for release changes would be nice to have: https://expr-lang.org/blog
Hi @antonmedv, I think adding regexp to the builtins is a great idea. Is there a plan for this?
If the regexp("...") function returns a *Regexp object, it would allow users to use the variety of methods supported by the regexp package.
@amikai true. I'm thinking of adding this in the next release of expr.
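To illustrate the point about returning a *Regexp: in plain Go, one compiled pattern already gives matching, extraction and replacement. This is only a standard-library sketch; whether a regexp() builtin in expr would expose these methods in the same way is an assumption, and fWords and describe are illustrative names.

```go
package main

import (
	"fmt"
	"regexp"
)

// fWords is compiled once; every call below reuses the same *regexp.Regexp.
var fWords = regexp.MustCompile(`f[a-z]+`)

// describe exercises three different methods on the one compiled pattern.
func describe(s string) (matched bool, all []string, replaced string) {
	matched = fWords.MatchString(s)
	all = fWords.FindAllString(s, -1)
	replaced = fWords.ReplaceAllString(s, "X")
	return
}

func main() {
	fmt.Println(describe("foo bar fizz"))
}
```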
Hey!
It would be nice to be able to access the parts matching a regex when using the matches operator. The captured parts could be assigned to some variables ($1, $2, etc.) or to a special matched map indexed by the index and names of matched parts. All this could be done in a provided function, but putting it in the language allows one to get better performance, as the regular expressions can be compiled at compile time instead of at execution time.