temoto / robotstxt

The robots.txt exclusion protocol implementation for Go language
MIT License

Speeding up Parse function #17

Closed: nielsole closed this 7 years ago

nielsole commented 7 years ago

Thanks for this great software! I used it in a crawler I wrote and hit a performance bottleneck in the Parse function, which pegged the CPU.

You can see a comparison before: https://files.niels-ole.com/graph.svg and after: https://files.niels-ole.com/graph1.svg

If you look at the changed function, you can see a speedup of roughly 4x (100.10s -> 22.79s).

I am relatively new to golang so beware ;).

Source: http://stackoverflow.com/questions/1760757/how-to-efficiently-concatenate-strings-in-go
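
For readers who have not seen the linked question, here is a minimal sketch of the general idea, not the actual patch: building a string with += copies everything accumulated so far on each step, while a bytes.Buffer appends into a growing byte slice. The function names concatNaive and concatBuffer are made up for illustration and do not come from robotstxt.

```go
package main

import "bytes"

// concatNaive (hypothetical name) rebuilds the string on every rune.
// Each += allocates a new string and copies the old contents, so the
// total work grows quadratically with the input length.
func concatNaive(input string) string {
	tok := ""
	for _, r := range input {
		tok += string(r)
	}
	return tok
}

// concatBuffer (hypothetical name) appends into a bytes.Buffer, which
// grows its backing slice geometrically, so appends are amortized O(1).
func concatBuffer(input string) string {
	var tok bytes.Buffer
	for _, r := range input {
		tok.WriteString(string(r))
	}
	return tok.String()
}
```

Both functions return the same string for valid UTF-8 input; only the amount of allocation and copying differs.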

temoto commented 7 years ago

Thanks. Shame on me for such lame code. I've modified your patch with tok.WriteRune -- that gave even better performance, commit author is preserved.
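
As a rough illustration of the WriteRune approach (a sketch only, not the code from the commit): bytes.Buffer.WriteRune encodes the rune as UTF-8 directly into the buffer, which avoids creating a temporary one-rune string for every character. The helper firstToken and its whitespace rule are assumptions made for this example.

```go
package main

import "bytes"

// firstToken (hypothetical helper) collects runes up to the first space,
// tab, or newline and returns them as a string.
func firstToken(input string) string {
	var tok bytes.Buffer
	for _, r := range input {
		if r == ' ' || r == '\t' || r == '\n' {
			break
		}
		// WriteRune appends the UTF-8 encoding of r to the buffer.
		// For bytes.Buffer the returned error is always nil.
		tok.WriteRune(r)
	}
	return tok.String()
}
```

Calling firstToken("Disallow: /tmp") yields "Disallow:", and multi-byte characters are preserved because each rune is written in full.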

Did you know we have parsing speed benchmarks in the test suite? Run go test -bench=. to execute them.
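
For anyone unfamiliar with Go benchmarks, the sketch below shows the general shape of functions that go test -bench=. picks up. It is not taken from the robotstxt test suite: the names BenchmarkConcat and BenchmarkWriteRune, the sample input, and the package name are all assumptions. To be discovered by go test, such functions need to live in a file whose name ends in _test.go.

```go
package main

import (
	"bytes"
	"strings"
	"testing"
)

// sample is a made-up input of 200 identical robots.txt-style lines.
var sample = strings.Repeat("Disallow: /private/\n", 200)

// sink keeps the result alive so the compiler cannot discard the work.
var sink string

// BenchmarkConcat measures building the string with +=.
func BenchmarkConcat(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		tok := ""
		for _, r := range sample {
			tok += string(r)
		}
		sink = tok
	}
}

// BenchmarkWriteRune measures the same loop using bytes.Buffer.WriteRune.
func BenchmarkWriteRune(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		var tok bytes.Buffer
		for _, r := range sample {
			tok.WriteRune(r)
		}
		sink = tok.String()
	}
}
```

Running go test -bench=. -benchmem in the directory containing such a file reports time and allocations per operation for each benchmark, which makes the difference between the two approaches easy to compare.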

nielsole commented 7 years ago

Thanks for responding so quickly. I wasn't aware of WriteRune, good catch.