R-Sandor / FindFirst

Organizing the information that matters to you and your teams. The knowledge of your world.
https://findfirst.dev
Apache License 2.0

[Server] Before initiating the scrape, check robots.txt on the domain. #236

Closed R-Sandor closed 1 month ago

R-Sandor commented 1 month ago

Details

Resource: https://yoast.com/ultimate-guide-robots-txt/
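Since the issue asks the server to fetch the domain's robots.txt before initiating a scrape, here is a minimal sketch of that first step using Java's built-in `HttpClient`. The class and method names are illustrative only, not FindFirst's actual API; treating a missing or unreachable robots.txt as "no restrictions" is a common convention, but it is an assumption here.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;

// Illustrative sketch: fetch https://<domain>/robots.txt before scraping.
public class RobotsTxtFetcher {

    // Returns the robots.txt body if the domain serves one, empty otherwise.
    static Optional<String> fetchRobotsTxt(String domain) {
        try {
            HttpClient client = HttpClient.newBuilder()
                    .followRedirects(HttpClient.Redirect.NORMAL)
                    .build();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://" + domain + "/robots.txt"))
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200
                    ? Optional.of(response.body())
                    : Optional.empty();
        } catch (Exception e) {
            // Assumption: no reachable robots.txt means no restrictions apply.
            return Optional.empty();
        }
    }
}
```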

Robots.txt syntax

A robots.txt file consists of one or more blocks of directives, each starting with a user-agent line. The "user-agent" is the name of the specific spider the block addresses. You can have one block for all search engines, using a wildcard for the user-agent, or particular blocks for particular search engines. A search engine spider will always pick the block that best matches its name.

These blocks look like this (don’t be scared, we’ll explain below):

User-agent: * 
Disallow: / 

User-agent: Googlebot 
Disallow: 

User-agent: bingbot 
Disallow: /not-for-bing/ 

Directives like Allow and Disallow are not case-sensitive, so it's up to you whether you write them in lowercase or capitalize them. The values are case-sensitive, however: /photo/ is not the same as /Photo/. We like capitalizing directives because it makes the file easier for humans to read.
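The matching rules above (case-insensitive directives, case-sensitive values, a spider picking the block for its own name and falling back to the `*` block) can be sketched in Java. This is a deliberately minimal illustration, not FindFirst's implementation: it only handles `User-agent` and `Disallow` prefix rules, ignoring `Allow`, wildcards in paths, and longest-match precedence.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of robots.txt block matching (illustrative only).
public class RobotsTxtCheck {

    // Parse robots.txt text into user-agent -> disallowed path prefixes.
    static Map<String, List<String>> parse(String robotsTxt) {
        Map<String, List<String>> blocks = new LinkedHashMap<>();
        List<String> current = null;
        for (String line : robotsTxt.split("\n")) {
            line = line.trim();
            int colon = line.indexOf(':');
            if (line.startsWith("#") || colon < 0) continue;
            // Directives are case-insensitive; values are case-sensitive.
            String directive = line.substring(0, colon).trim().toLowerCase();
            String value = line.substring(colon + 1).trim();
            if (directive.equals("user-agent")) {
                current = blocks.computeIfAbsent(value.toLowerCase(),
                        k -> new ArrayList<>());
            } else if (directive.equals("disallow")
                    && current != null && !value.isEmpty()) {
                current.add(value);
            }
        }
        return blocks;
    }

    // A spider uses the block matching its name, falling back to "*".
    static boolean isAllowed(Map<String, List<String>> blocks,
                             String userAgent, String path) {
        List<String> rules = blocks.getOrDefault(userAgent.toLowerCase(),
                blocks.getOrDefault("*", List.of()));
        for (String prefix : rules) {
            if (path.startsWith(prefix)) return false;
        }
        return true;
    }
}
```

With the example file above, Googlebot (empty Disallow) may crawl anything, bingbot is blocked only under /not-for-bing/, and every other spider falls back to the `*` block and is blocked entirely.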

joelramilison commented 1 month ago

Please assign me to this, currently working on it :)