dlclark / regexp2

A full-featured regex engine in pure Go based on the .NET engine
MIT License

The best way to get all named captured groups #25

Closed thangld322 closed 4 years ago

thangld322 commented 4 years ago

I'm trying to use this library to collect all the named capture groups into a map[string]string. This is my code:

caps := make(map[string]string)
re, err := regexp2.Compile(pattern, regexp2.RE2)
if err != nil {
    panic(err)
}
names := re.GetGroupNames()
mat, err := re.FindStringMatch(text)
if err != nil {
    panic(err)
}
if mat != nil {
    gps := mat.Groups()
    for i, value := range names {
        if value != strconv.Itoa(i) {
            if len(gps[i].Captures) > 0 {
                caps[value] = gps[i].Captures[0].String()
            }
        }
    }

    fmt.Println(caps)
}

Is this the best way to do it in terms of performance? First it calls FindStringMatch(), then it calls Groups(), and finally it runs a for loop. That seems like a lot of work for this. :D

dlclark commented 4 years ago

If you want to support multiple matches, you could filter the names once up front, before checking the captures, so you don't repeat the filtering for every match. Other than that, this appears to be the best current method to get a map[string]string of only the named groups, storing the first capture of each.
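As a rough sketch of that pre-filtering step: the helper below takes a slice shaped like the result of re.GetGroupNames() and keeps only the explicitly named groups, using the same "name is not just its own position" test as the loop in the question. The name namedGroupPositions is hypothetical and not part of regexp2, and the helper deliberately uses only the standard library.

```go
package main

import "strconv"

// namedGroupPositions is a hypothetical helper (not part of regexp2).
// It takes a slice shaped like the result of re.GetGroupNames() and returns
// a map from each explicitly named group to its position in that slice.
// A group is treated as unnamed when its name is just its own position
// written as a decimal number, which is the same check the loop in the
// question performs on every match.
func namedGroupPositions(names []string) map[string]int {
	named := make(map[string]int, len(names))
	for i, name := range names {
		if name != strconv.Itoa(i) {
			named[name] = i
		}
	}
	return named
}
```

The idea would be to call this once, right after compiling the pattern, and then for each match range over the returned map and read gps[pos].Captures[0].String() whenever len(gps[pos].Captures) > 0, where gps is mat.Groups() as in the question.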

If you need this in a super-hot code path, you could make an optimized fork that uses the internal Regexp.capnames map to skip a lot of the logic, since that map only contains named groups (no filtering needed). The keys would be exactly what you want, and the values could be passed to mat.GroupByNumber().

Sorry it took so long to reply. This just fell off my radar and I didn't notice it until today.