ijt / go-anytime

Parse natural and standardized dates/times and ranges in Go without knowing the format in advance
MIT License

`ReplaceTimesByFunc` takes 1.5ms on a longish string on an M1 Studio Mac #44

Closed ijt closed 1 year ago

ijt commented 1 year ago
func BenchmarkReplaceTimesByFunc(b *testing.B) {
    s := `'Twas brillig and the slithy toves did gyre and gimble in 50 wabe. All mimsy were the borogroves and the mome raths outgrabe. Beware the jabberwock my son, the jaws that bite January 2020 the claws that catch. Avoid the jubjub bird and shun the frumious bandersnatch. He took his vorpal sword in hand, longtime...`
    want := strings.ReplaceAll(s, "January 2020", "<time>")
    for i := 0; i < b.N; i++ {
        got, err := ReplaceTimesByFunc(s, now, func(t time.Time) string {
            return "<time>"
        })
        if err != nil {
            b.Fatal(err)
        }
        if got != want {
            b.Fatalf("\ngot  %q\nwant %q", got, want)
        }
    }
}

Output:

/private/var/folders/b7/h46kx5zx4m985db8s__9trdc0000gn/T/GoLand/___BenchmarkReplaceTimesByFunc_in_github_com_ijt_go_anytime.test -test.v -test.paniconexit0 -test.bench ^\QBenchmarkReplaceTimesByFunc\E$ -test.run ^$
goos: darwin
goarch: arm64
pkg: github.com/ijt/go-anytime
BenchmarkReplaceTimesByFunc
BenchmarkReplaceTimesByFunc-10           738       1495768 ns/op
PASS

At about 1.5 ms per call on a string of this length, this leaves a lot to be desired for heavy use.

ijt commented 1 year ago

I'm thinking about returning to a PEG-based parser. A quick way to find out whether it would be faster is to dig up the original PEG-based version of this project and run the same benchmark against it.

ijt commented 1 year ago

I tried a quick and dirty benchmark on go-naturaldate and it reported about 38 µs/op (38440 ns/op), roughly 39x faster than the 1495768 ns/op above.

func BenchmarkReplaceTimesByFunc(b *testing.B) {
    s := `'twas brillig and the slithy toves did gyre and gimble in 50 wabe. all mimsy were the borogroves and the mome raths outgrabe. beware the jabberwock my son, the jaws that bite january 2020 the claws that catch. avoid the jubjub bird and shun the frumious bandersnatch. he took his vorpal sword in hand, longtime...`
    // s is already lowercase here, so match the lowercase form of the date.
    want := strings.ReplaceAll(s, "january 2020", "<time>")
    for i := 0; i < b.N; i++ {
        got, err := ReplaceTimesByFunc(s, base, func(t time.Time) string {
            return "<time>"
        })
        if err != nil {
            b.Fatal(err)
        }
        if got != want {
            //b.Fatalf("\ngot  %q\nwant %q", got, want)
        }
    }
}

func ReplaceTimesByFunc(s string, ref time.Time, f func(t time.Time) string) (string, error) {
    p := &parser{
        Buffer:    strings.ToLower(s),
        direction: -1,
        t:         ref,
    }

    p.Init()

    if err := p.Parse(); err != nil {
        return "", err
    }

    p.Execute()

    // p.PrintSyntaxTree()

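    // Length 0 with capacity len(p.parts), so append fills the slice
    // instead of adding after len(p.parts) pre-allocated empty strings.
    strParts := make([]string, 0, len(p.parts))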
    for _, part := range p.parts {
        switch x := part.(type) {
        case string:
            strParts = append(strParts, x)
        case time.Time:
            strParts = append(strParts, f(x))
        }
    }

    return strings.Join(strParts, ""), nil
}

This is a huge improvement.
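
For reference, the ratio between the two ns/op figures quoted in this thread can be checked with a few lines of Go. This is only arithmetic on the numbers reported above; the helper name `speedup` and the constant names are made up for illustration. The ratio comes out just under 39.

```go
package main

import "fmt"

// speedup reports how many times faster the new timing is than the old one.
// Hypothetical helper, not part of go-anytime.
func speedup(oldNsPerOp, newNsPerOp float64) float64 {
	return oldNsPerOp / newNsPerOp
}

func main() {
	// Figures copied from the benchmark output in this thread.
	const currentNs = 1495768.0 // ReplaceTimesByFunc in go-anytime
	const pegNs = 38440.0       // quick benchmark on go-naturaldate

	fmt.Printf("speedup: %.1fx\n", speedup(currentNs, pegNs))
	// Calls per second that a single core could sustain at each rate.
	fmt.Printf("calls/sec: %.0f before, %.0f after\n", 1e9/currentNs, 1e9/pegNs)
}
```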

ijt commented 1 year ago

However, the PEG approach appears to require state mutations that are hard to work with compared to how we can bubble up expressions within goparsify.

Luckily, goyacc also allows an expression-based approach. How fast is it, roughly?