laurentprudhon/nlptextdoc
Suite of tools to extract and annotate language resources for NLP applications
Robots.txt directives aren't followed as they should be
#7 (Closed): laurentprudhon closed 5 years ago
laurentprudhon commented 5 years ago
Example of https://www.cbanque.com/robots.txt:
URLs starting with /forums/members/ are crawled and produce 403 errors.
Crawl-delay: 5 is not enforced, and all requests return 403 errors after the first 350 requests.
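
For reference, here is a minimal sketch of the expected behavior, written in Python with the standard `urllib.robotparser` module rather than in the crawler's own code. The `ROBOTS_TXT` excerpt is hypothetical: it is modeled on the two directives mentioned above and is not a copy of the real cbanque.com file. The helper names (`build_parser`, `is_allowed`, `wait_time`) and the `examplebot` user agent are also assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt excerpt based on the directives named in this issue.
# It is NOT the actual content served by cbanque.com.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /forums/members/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the URL is not excluded for this user agent."""
    return parser.can_fetch(user_agent, url)


def wait_time(parser: RobotFileParser, user_agent: str,
              last_request: float, now: float) -> float:
    """Seconds still to wait before the next request to honor Crawl-delay."""
    delay = parser.crawl_delay(user_agent) or 0  # None when no delay is set
    return max(0.0, last_request + delay - now)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    agent = "examplebot"  # hypothetical user agent name
    for url in ("https://www.cbanque.com/forums/members/123",
                "https://www.cbanque.com/forums/"):
        print(url, "allowed" if is_allowed(parser, agent, url) else "skipped")
```

A crawler following this sketch would skip any URL for which `is_allowed` returns `False`, and would sleep for `wait_time(...)` seconds before each request to the same host.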
laurentprudhon commented 5 years ago
Fixed