Closed moredure closed 3 years ago
In practice TrimRight using TrimRightFunc internally, just no allocation, replacer also should allocate less one time string building vs two. Also we can use EqualFold to reduce one more memory allocation of strings.ToLower in parseLine
function, I guess Disallow/Allow/Crawl-Delay tokens very frequently used in camelcase, so it will be exactly one allocation less in this cases in practice such tokens might appear more than once per robots.txt across the internet so -4-6 allocations per document.
BenchmarkFromBytes/One_Replacer
BenchmarkFromBytes/One_Replacer-4 7473553 149 ns/op 48 B/op 3 allocs/op
BenchmarkFromBytes/Two_strings.Replace
BenchmarkFromBytes/Two_strings.Replace-4 8631030 139 ns/op 24 B/op 4 allocs/op
Maybe replacer a little bit slower, but one allocation less. But with larger lines faster:
BenchmarkFromBytes/One_Replacer
BenchmarkFromBytes/One_Replacer-4 6257397 180 ns/op 96 B/op 3 allocs/op
BenchmarkFromBytes/Two_strings.Replace
BenchmarkFromBytes/Two_strings.Replace-4 6205563 196 ns/op 128 B/op 4 allocs/op
TrimRight vs TrimRightFunc:
BenchmarkTrimFunc/TrimRightFunc
BenchmarkTrimFunc/TrimRightFunc-4 90882796 12.4 ns/op 0 B/op 0 allocs/op
BenchmarkTrimFunc/TrimRight
BenchmarkTrimFunc/TrimRight-4 24250952 46.0 ns/op 32 B/op 1 allocs/op
Check for yourself:
package robotstxt
import (
"strings"
"testing"
)
var replacer = strings.NewReplacer(`\*`, `.*`, `\$`, `$`)
func star(r rune) bool {
return r == '*'
}
func BenchmarkReplacer(b *testing.B) {
t2 := `sdfsd \*\$\* sdfsdf`
b.Run("One Replacer", func(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i ++ {
replacer.Replace(t2)
}
})
b.Run("Two strings.Replace", func(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i ++ {
strings.Replace(strings.Replace(t2, `\*`, `.*`, -1), `\$`, `$`, -1)
}
})
}
func BenchmarkTrimFunc(b *testing.B) {
b.Run("TrimRightFunc", func(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i ++ {
strings.TrimRightFunc("sdfsdffsdfsdf*", star)
}
})
b.Run("TrimRight", func(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i ++ {
strings.TrimRight("sdfsdffsdfsdf*", "*")
}
})
}
func BenchmarkEqualFold(b *testing.B) {
t := `Disallow`
b.Run("EqualFold inside switch", func(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i ++ {
switch {
case strings.EqualFold(t, "disallow"):
case strings.EqualFold(t, "allow"):
case strings.EqualFold(t, "useragent"):
case strings.EqualFold(t, "user-agent"):
case strings.EqualFold(t, "crawl-delay"):
case strings.EqualFold(t, "crawldelay"):
}
}
})
b.Run("To Lower then comare in switch", func(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i ++ {
switch strings.ToLower(t) {
case "disallow":
case "allow":
case "useragent":
case "user-agent":
case "crawl-delay":
case "crawldelay":
}
}
})
}
In case of equal fold
BenchmarkEqualFold
BenchmarkEqualFold/EqualFold_inside_switch
BenchmarkEqualFold/EqualFold_inside_switch-4 56437368 19.2 ns/op 0 B/op 0 allocs/op
BenchmarkEqualFold/To_Lower_then_comare_in_switch
BenchmarkEqualFold/To_Lower_then_comare_in_switch-4 30694930 37.3 ns/op 8 B/op 1 allocs/op
One less allocation and 2 times faster.
Merging #27 (99dc19f) into master (87d22cd) will decrease coverage by
0.82%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #27 +/- ##
==========================================
- Coverage 92.07% 91.25% -0.83%
==========================================
Files 3 3
Lines 303 263 -40
==========================================
- Hits 279 240 -39
+ Misses 13 12 -1
Partials 11 11
Impacted Files | Coverage Δ | |
---|---|---|
robotstxt.go | 92.85% <100.00%> (-1.20%) |
:arrow_down: |
scanner.go | 90.78% <0.00%> (-0.70%) |
:arrow_down: |
parser.go | 90.59% <0.00%> (-0.61%) |
:arrow_down: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 87d22cd...99dc19f. Read the comment docs.
Looks like for replacer benchmark is worser when no symbols to replace (no asterisks, etc), I'll revert this changes
In worst case scenario (last switch case) equal-fold might be a little bit slower, let's take a look in benchmarks in travis.
Check for yourself:
package robotstxt func BenchmarkReplacer(b *testing.B) { func BenchmarkTrimFunc(b *testing.B) { func BenchmarkEqualFold(b *testing.B) {
This should be in patch.
Let's start from small changes)
@temoto
replacer/asterisk is very confusing change. If this is a draft work, you can mark pull request as draft with a button here.