temoto / robotstxt

The robots.txt exclusion protocol implementation for Go language
MIT License
269 stars, 55 forks

Add support for Crawl-Delay #22

Closed moredure closed 5 years ago

moredure commented 5 years ago

Just a feature request

temoto commented 5 years ago

Do you mean adding documentation, or does the existing support not work as expected?

moredure commented 5 years ago

Never mind, I just found CrawlDelay as a struct field.

moredure commented 5 years ago

How about supporting the extended-format directives as well?
Request-rate: 1/5
Visit-time: 0600-0845
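
For illustration, here is a minimal sketch of how those two values might be parsed, using only the Go standard library. It is not part of robotstxt: the names RequestRate, VisitWindow, parseRequestRate and parseVisitTime are hypothetical, and it assumes that "1/5" means 1 request per 5 seconds and that "0600-0845" is an HHMM-HHMM window in UTC.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// RequestRate is a hypothetical type: at most Requests fetches per Interval.
type RequestRate struct {
	Requests int
	Interval time.Duration
}

// VisitWindow is a hypothetical type holding offsets from midnight (assumed UTC).
type VisitWindow struct {
	Start, End time.Duration
}

// parseRequestRate parses a value such as "1/5", assumed to mean
// 1 request per 5 seconds.
func parseRequestRate(value string) (RequestRate, error) {
	parts := strings.SplitN(strings.TrimSpace(value), "/", 2)
	if len(parts) != 2 {
		return RequestRate{}, fmt.Errorf("request-rate %q: missing '/'", value)
	}
	n, err := strconv.Atoi(parts[0])
	if err != nil || n <= 0 {
		return RequestRate{}, fmt.Errorf("request-rate %q: bad request count", value)
	}
	secs, err := strconv.Atoi(parts[1])
	if err != nil || secs <= 0 {
		return RequestRate{}, fmt.Errorf("request-rate %q: bad interval", value)
	}
	return RequestRate{Requests: n, Interval: time.Duration(secs) * time.Second}, nil
}

// parseClock converts "HHMM" into an offset from midnight.
func parseClock(hhmm string) (time.Duration, error) {
	if len(hhmm) != 4 {
		return 0, fmt.Errorf("time %q: want HHMM", hhmm)
	}
	h, errH := strconv.Atoi(hhmm[:2])
	m, errM := strconv.Atoi(hhmm[2:])
	if errH != nil || errM != nil || h < 0 || h > 23 || m < 0 || m > 59 {
		return 0, fmt.Errorf("time %q: out of range", hhmm)
	}
	return time.Duration(h)*time.Hour + time.Duration(m)*time.Minute, nil
}

// parseVisitTime parses a value such as "0600-0845".
func parseVisitTime(value string) (VisitWindow, error) {
	parts := strings.SplitN(strings.TrimSpace(value), "-", 2)
	if len(parts) != 2 {
		return VisitWindow{}, fmt.Errorf("visit-time %q: missing '-'", value)
	}
	start, err := parseClock(parts[0])
	if err != nil {
		return VisitWindow{}, err
	}
	end, err := parseClock(parts[1])
	if err != nil {
		return VisitWindow{}, err
	}
	return VisitWindow{Start: start, End: end}, nil
}

// ContainsOffset reports whether an offset from midnight falls inside the
// window; a window whose End is before its Start wraps past midnight.
func (w VisitWindow) ContainsOffset(offset time.Duration) bool {
	if w.Start <= w.End {
		return offset >= w.Start && offset <= w.End
	}
	return offset >= w.Start || offset <= w.End
}

// Contains applies ContainsOffset to the UTC time of day of t.
func (w VisitWindow) Contains(t time.Time) bool {
	u := t.UTC()
	return w.ContainsOffset(time.Duration(u.Hour())*time.Hour + time.Duration(u.Minute())*time.Minute)
}

func main() {
	rate, err := parseRequestRate("1/5")
	fmt.Println(rate.Requests, rate.Interval, err)
	window, err := parseVisitTime("0600-0845")
	fmt.Println(window.Start, window.End, err)
	fmt.Println("crawl allowed now:", window.Contains(time.Now()))
}
```

A crawler could then space its fetches by Interval divided by Requests and skip the host whenever Contains returns false.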

temoto commented 5 years ago

Yes, of course, will add those, if nobody beats me to it. (wink wink)

Does your crawler use request rate?

moredure commented 3 years ago

Yes, my crawler uses the Mercator scheme: it pauses for N times the duration of the last request or for the crawl-delay, whichever is larger, and never longer than some sensible maximum. I'll check some internet-wide statistics to see whether implementing these directives is worth considering.
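
The pause rule described above can be sketched roughly as follows. This is only an illustration, not code from the crawler in question: politeDelay, factor and maxDelay are hypothetical names, and the values in main are made up.

```go
package main

import (
	"fmt"
	"time"
)

// politeDelay returns how long to wait before the next request to a host:
// the larger of factor*lastFetch and crawlDelay, capped at maxDelay.
// factor and maxDelay are hypothetical tuning parameters.
func politeDelay(lastFetch, crawlDelay time.Duration, factor int, maxDelay time.Duration) time.Duration {
	delay := time.Duration(factor) * lastFetch
	if crawlDelay > delay {
		delay = crawlDelay
	}
	if delay > maxDelay {
		delay = maxDelay
	}
	return delay
}

func main() {
	// Illustrative values: the last fetch took 300ms, robots.txt asks for 2s.
	fmt.Println(politeDelay(300*time.Millisecond, 2*time.Second, 10, 30*time.Second))
}
```

The crawlDelay argument would come from the CrawlDelay value the library already exposes; the cap keeps a single slow response from stalling a host for too long.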