sybrew opened 7 years ago
How about multisite support? A different robots.txt for different domains.
Multisite support is fundamental and enabled by default for all extensions unless otherwise stated. That's why I noted that it will only work when no static file is present.
Also, all sites I own are on a Multisite network so you don't have to worry about that!
So I can generate a different robots.txt for each site in my network? How do I do it? Is there any manual?
All open issues are just drafts for now; there's no operational code yet. This includes this issue.
The idea is that it will be different for each site in the network, yes.
OK, thanks @sybrew
At the moment, it's not possible to add a Disallow in robots.txt via the plugin, right?
Hi @trainoasis
That's correct. You'd want to use a WordPress filter, at priority >10, instead:
add_filter( 'robots_txt', function( $robots ) {
	// Nowdoc holding the extra rules; the blank line before the closing
	// marker keeps a trailing newline so these rules don't glue onto the
	// generated output that follows.
	$my_robots = <<<'MYROBOTS'
User-agent: some-bot
Disallow: /

MYROBOTS;

	// Prepend the custom rules to the plugin's generated output.
	return $my_robots . $robots;
}, 11 );
Hey @sybrew, I need to manually add a disallow rule to the robots.txt, but I figured out that the plugin currently does not allow that, and there is no robots.txt in the root folder for me to edit. So can you tell me how to add it? If I have to use the above WordPress filter, then where do I add the filter? In functions.php, or somewhere else? Sorry, it may sound silly, but I googled and could not find anything reliable.
Hi @chandlerbing26
You can do either of the following:
- Add a robots.txt file to the root of your website anyway; then you'll have complete control over its contents. This is probably your best bet, but it does not translate well with WordPress Multisite's domain mapping (a corner case).
- Add to or overwrite our filters as you described. Yes, that can be added to the functions.php file. See https://tsf.fyi/docs/filters#where to learn about alternative methods (one such method is sketched below).
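For illustration, here's a minimal sketch of that filter route as a must-use plugin; the file name and the blocked bot below are hypothetical, and wp-content/mu-plugins/ is simply one of the standard locations WordPress auto-loads code from without touching a theme.

<?php
/**
 * Plugin Name: Custom robots.txt rules (illustrative sketch).
 * Place this file in wp-content/mu-plugins/ so it loads automatically.
 */

add_filter( 'robots_txt', function( $robots ) {
	// Hypothetical example rule; replace with whatever you need to block.
	$extra  = "User-agent: some-bot\n";
	$extra .= "Disallow: /\n\n";

	return $extra . $robots;
}, 11 );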
Hi @sybrew, thanks! I added a robots.txt in my root folder and that worked. Thanks for replying. Maybe add a robots.txt editor in upcoming versions?
Hi @sybrew, I recently added the Blackhole for Bad Bots plugin by Jeff Starr, and I must add some lines with a directive to the robots.txt.
I remember that with Yoast or others, I had my robots.txt in the main public_html directory, but now, with The SEO Framework, the robots.txt is added dynamically and I don't know how to edit it manually.
Any suggestions? How could I add a small directive like the following to send to the Bing crawler or Google spiders?
User-agent: *
Disallow: /?blackhole
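Assuming no static robots.txt file is present (so the plugin's dynamic output is what crawlers see), the robots_txt filter shown earlier in this thread can carry that directive as well; this is only a sketch, not the answer given below.

add_filter( 'robots_txt', function( $robots ) {
	// Blackhole for Bad Bots trap directive, prepended to the generated rules.
	// Note: the generated output contains its own User-agent: * group as well.
	$blackhole  = "User-agent: *\n";
	$blackhole .= "Disallow: /?blackhole\n\n";

	return $blackhole . $robots;
}, 11 );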
If we have the robots.txt dynamically created by The SEO Framework plugin and another one created manually by us, which of them should we add to Google/Bing Webmaster Tools?
Hi @vir-gomez,
When there's a static robots.txt file in the root folder of your website, the virtual "file" cannot be outputted. So, with a robots.txt file present, The SEO Framework's output won't work.
The virtual robots.txt "file" will look a bit like this.
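For illustration (the domain and sitemap line are placeholders, and the exact output varies per site and settings), a typical WordPress-generated robots.txt resembles:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/sitemap.xml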
Now, the robots.txt file may just as well be empty, because there are many other signals utilized to steer robots away from administrative and duplicated pages, like the X-Robots-Tag HTTP header and the <meta name=robots /> HTML tag. So, feel free to use a custom robots.txt file with the blackhole directive in place.
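For reference, those two signals look roughly like this; the noindex values here are illustrative, not necessarily what the plugin emits on any given page.

X-Robots-Tag: noindex, follow
<meta name="robots" content="noindex,follow" />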
P.S. Please send us future requests via our WordPress.org support forums. This issue is about a feature proposal, not a support topic.
From https://github.com/sybrew/the-seo-framework/issues/647: Add more directives for AI-blocking, including opt-out for "Google-Extended" -- see https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers#user-agents-in-robots.txt.
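Per the Google documentation linked above, Google-Extended is addressed in robots.txt like any other user agent, so a site-wide opt-out group would look like this:

User-agent: Google-Extended
Disallow: /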
Feature: Add an easy interface (like cPanel's DNS editor) to manage the robots.txt "file" output. It should only work when no static file is present, i.e., only by the use of filters.
I do not want to overwrite files nor leave permanent marks.
Planned as a free extension.
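One possible shape for such a filter-only approach, purely as a sketch (the option name and rule format are hypothetical, not the extension's actual design): store the rules managed by the interface in an option and flush them through the robots_txt filter, so nothing is ever written to disk.

add_filter( 'robots_txt', function( $robots ) {
	// Hypothetical option holding rows managed by the interface,
	// e.g. [ [ 'user-agent' => 'Google-Extended', 'disallow' => '/' ] ].
	$rules = get_option( 'my_robots_rules', [] );

	$output = '';
	foreach ( $rules as $rule ) {
		$output .= 'User-agent: ' . $rule['user-agent'] . "\n";
		$output .= 'Disallow: ' . $rule['disallow'] . "\n\n";
	}

	// No file is created or overwritten; the rules only exist in the virtual output.
	return $output . $robots;
}, 11 );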