-
Example: the robots.txt for forbes.com
https://www.forbes.com/robots.txt
They have blocked all paths for `GPTBot`:
```
User-agent: GPTBot
Disallow: /
```
However for url `https://www.forbes.c…
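
As a point of comparison, the two rules quoted above can be checked with Python's standard-library `urllib.robotparser`. This is only a sketch: it parses the quoted snippet rather than the live file, and the path `/some/path` and the agent name `SomeOtherBot` are placeholders made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Only the two lines quoted in the issue, not the full forbes.com file.
rules = """\
User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Hypothetical URL path, used purely for illustration.
url = "https://www.forbes.com/some/path"

gptbot_allowed = parser.can_fetch("GPTBot", url)
other_allowed = parser.can_fetch("SomeOtherBot", url)

print("GPTBot allowed:", gptbot_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

With only this group present, a conforming parser should refuse every path for `GPTBot` and leave agents that match no group unrestricted.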
-
I have a robots.txt file containing two lines:
```
User-agent: *
Disallow: /
```
I then used the example shown in the documentation:
```
var robots = require('robots')
, parser = new robots.RobotsPa…
-
It would be nice to be able to parse robots.txt the same way as RSS feeds, e.g. `$web->robots`.
https://github.com/bopoda/robots-txt-parser is a library. Not sure if it is the one to use here but it seems to …
-
Hi, author of the [robots_txt](https://pub.dev/packages/robots_txt) package here. I thought that, since `linkcheck` currently features its own implementation of a parser for `robots.txt`, it would be …
-
When using `preg_match('@...@')`, `preg_quote($rule, '@')` should be used to escape the input.
Currently, one of the following warnings occurs when a path contains a regex metacharacter:
PHP Warning: p…
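
The same idea can be sketched in Python, where `re.escape` plays the role of `preg_quote`. This is an analogue rather than the library's code: the helper name `rule_matches` is hypothetical, and it assumes the rule is treated as a literal path prefix, ignoring the `*` and `$` wildcards that robots.txt rules may contain.

```python
import re


def rule_matches(rule: str, path: str) -> bool:
    """Return True if `path` starts with the literal text of `rule`.

    re.escape() neutralises regex metacharacters in the rule so that
    characters such as '(' or '.' are matched literally.
    """
    pattern = "^" + re.escape(rule)
    return re.search(pattern, path) is not None


# Without escaping, a rule containing '(' is not even a valid pattern.
try:
    re.compile("^" + "/a(b")
    unescaped_compiles = True
except re.error:
    unescaped_compiles = False

print(rule_matches("/a(b", "/a(b/c"))
print(rule_matches("/a.b", "/axb"))
print(unescaped_compiles)
```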
-
Hi,
Hope you are all well, and merry Christmas first of all!
I was playing with site-audit-seo today and I was missing some features like a robots.txt parser to find available sitemaps for …
ghost updated
3 years ago
-
Hi @samclarke,
I have a script that watches the `robots.txt` files of multiple websites, but in some cases a site has none and still displays fallback content. The issue is that your library will tell `isAllowed() -…
sneko updated
6 months ago
-
Currently we have different fetchers and (effectively) different parsers for robots.txt, sitemap, and regular URLs. This isn't very clean, and duplicates code. So an alternative approach is to have a …
-
Via a report from the web site owner, we have found that the crawler appears to be ignoring robots.txt. The instructions at https://www.ukmodelshops.co.uk/robots.txt disallow access to `/form/...` but…
-
**What is the current behavior?**
The `Crawl-Delay` is ignored.
**What is the expected behavior?**
The `Crawl-Delay` should be honored; it can be retrieved using `getCrawlDelay()` on the robo…
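
For illustration, here is how a crawl delay can be read with Python's standard-library `urllib.robotparser`, which exposes `crawl_delay()`. It is a different library from the one discussed in the issue, and the rules, the agent name `ExampleBot`, and the 5-second value are made up for the sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
rules = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# crawl_delay() returns the delay in seconds for the given agent,
# or None when no Crawl-delay directive applies to it.
delay = parser.crawl_delay("ExampleBot")

print("Delay between requests:", delay)
```

A crawler honoring the directive would wait at least `delay` seconds between consecutive requests to the same host, and fall back to its own default when the value is `None`.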