Open gopherbot opened 11 years ago
CL https://golang.org/cl/106360043 mentions this issue.
I just hit this issue as well. Does "Unplanned" mean this is unlikely to get worked on?
I'm also including some information on my use-case, in case that helps.
I'm trying to transform log lines containing key-value pairs, to redact any string values. So for example:
{ name: "Joe", last_name: "Bloggs", age: 5, nickname: "Jogs" }
might become:
{ name: "SOME_HASH", last_name: "SOME_HASH", age: 5, $comment: "do not redact me", nickname: "SOME_HASH" }
I only want to target quoted strings that are followed by either , (comma) or } (closing curly brace), and I also want to ignore any $comment fields.
I know that Go's regexp doesn't have lookaheads/lookbehinds, which means I can't use those to check for the above. That restricts me somewhat. However, I figured I'd just capture everything using a regex like this:
quoted_string_regex, _ := regexp.Compile(`(\$comment: )?"([^"]*)"[,| }]`)
and then check the actual subgroups to see if $comment was there, and also grab out the comma or curly-brace, and put that back on at the end.
However, I'm using ReplaceAllStringFunc, which only gives you the entire match, so it seems like I either need to run a second regex inside my callback function, or I need to do a bunch of contains/splits/ends-with etc.
(Obviously, if I've missed something obvious that is available in Go, please feel free to correct the above).
Does "Unplanned" mean this is unlikely to get worked on?
Unplanned just means that this won't potentially block a release. I know that @michaelmatloob has been looking at regexp stuff recently; perhaps he is interested.
Just wanted to add that I hit the very same issue today. I was trying to implement a simple tag replacement, e.g.
Name: {name}
First name: {firstname}
becomes
Name: Doe
First name: Jon
I'm coming from a Perl background; my first intuition was to use a regexp like /{([^}]+)}/. Note the submatch in parentheses: in Perl, it would be possible to use replace (and call a function on the submatch) or use split (and get the submatches returned). In Go, split never returns the part that matches, and ReplaceAllStringFunc passes the complete match to the callback instead of just the submatch.
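For this particular tag format, a workaround that avoids a second match is possible because the delimiters are fixed single characters: the callback can slice the braces off the full match. The sketch below assumes exactly that shape; fillTags and tagPattern are illustrative names, and unknown tags are left untouched as a design choice that is not stated in the comment.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagPattern matches {name}-style tags, as in the Perl regexp above.
var tagPattern = regexp.MustCompile(`\{([^}]+)\}`)

// fillTags replaces each {key} with values[key]. The callback only sees
// the full match (e.g. "{name}"), so it strips the braces by slicing.
func fillTags(tmpl string, values map[string]string) string {
	return tagPattern.ReplaceAllStringFunc(tmpl, func(tag string) string {
		key := tag[1 : len(tag)-1] // drop leading "{" and trailing "}"
		if v, ok := values[key]; ok {
			return v
		}
		return tag // unknown tag: leave it unchanged
	})
}

func main() {
	values := map[string]string{"name": "Doe", "firstname": "Jon"}
	fmt.Println(fillTags("Name: {name}\nFirst name: {firstname}", values))
}
```

This trick stops working as soon as the text around the submatch has variable length, which is the general case the issue is about.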
I'm not planning on working on this. If you're interested in contributing this, feel free to do so, but note that the freeze will start in a few days.
Is this issue solved by Regexp.Expand and Regexp.ExpandString?
@AlekSi I guess not, at least not in a straightforward way. The number of variables in the expand template is limited, whereas the number of matches in a string isn't.
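To make the limitation concrete, here is a small sketch of what ExpandString can do when driven from FindAllStringSubmatchIndex. It can rearrange submatches according to a fixed template, but it has no hook for running arbitrary code on them. The names pair and swapPairs are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
)

var pair = regexp.MustCompile(`(\w+): (\w+)`)

// swapPairs rewrites every "k: v" as "v: k" using a fixed template.
// The template can only reference submatches; it cannot call a function
// (for example strings.ToUpper) on them.
func swapPairs(src string) string {
	var dst []byte
	last := 0
	for _, m := range pair.FindAllStringSubmatchIndex(src, -1) {
		dst = append(dst, src[last:m[0]]...)           // text before this match
		dst = pair.ExpandString(dst, "$2: $1", src, m) // append expanded template
		last = m[1]
	}
	dst = append(dst, src[last:]...) // text after the last match
	return string(dst)
}

func main() {
	fmt.Println(swapPairs("a: b, c: d")) // b: a, d: c
}
```

For a static template like this one, ReplaceAllString(src, "$2: $1") gives the same result; the loop is written out only to show where a per-match function would have to be inserted.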
I came across this post by Elliot Chance; it solved a JavaScript-to-Go porting problem I was having (for consistency, it would be nice if it were incorporated as a new method in the Go regexp package):
http://elliot.land/post/go-replace-string-with-regular-expression-callback
Gist here: https://gist.github.com/elliotchance/d419395aa776d632d897
Thanks for the link @srackham - I hit exactly the same problem when trying to port something from JavaScript to Go. It would definitely be nice to see this functionality inside the standard regexp package.
I also found another project which appears to implement similar functionality in perhaps a cleaner way, because it replaces the default regexp: https://github.com/agext/regexp
This gives some idea of how the solution could look: https://github.com/agext/regexp/blob/master/agext.go#L105
Here is a snippet for anyone else looking for a way to replace submatches with a function using bytes (not strings) and without having to deal with intermediate (non-captured) data: https://gist.github.com/slimsag/14c66b88633bd52b7fa710349e4c6749
I have the same problem.
I would use ReplaceAllStringFunc, but I also need submatches, which leads to making an additional call to the same regexp within the repl function with FindAllStringSubmatch.
I ran into this issue today. I'm sure I've run into it before, but I probably used some tedious, bug-prone workaround.
Hi,
A solution I use for this problem does two regexp matches, one for Replace and another for Find, which is inefficient:
func main() {
	str := "a: b, c: d"
	re := regexp.MustCompile(`(\w+): (\w+)`)
	transformString := func(s string) string {
		m := re.FindStringSubmatch(s) // inefficiency: match again
		k, v := m[1], m[2]
		return fmt.Sprintf("%v: %v", strings.ToUpper(v), strings.ToUpper(k))
	}
	rpl := re.ReplaceAllStringFunc(str, transformString) // first match
	fmt.Println(rpl) // B: A, D: C
}
The function ReplaceAllStringSubmatchFunc() is missing from the regexp package. With this function the code would look like:
func main() {
	str := "a: b, c: d"
	re := regexp.MustCompile(`(\w+): (\w+)`)
	transformSubmatch := func(m []string) string {
		k, v := m[1], m[2]
		return fmt.Sprintf("%v: %v", strings.ToUpper(v), strings.ToUpper(k))
	}
	rpl := re.ReplaceAllStringSubmatchFunc(str, transformSubmatch) // new function
	fmt.Println(rpl) // B: A, D: C
}
I'm looking forward to ReplaceAllStringSubmatchFunc() being included in the regexp package, as this situation comes up quite often.
Thank you!
by denys.seguret: