piotrpog opened this issue 5 years ago
@piotrpog Hmm, that's new to me. Do you have a site that's currently exhibiting this behaviour you could link to? Is the header on all pages or just sitemap pages?
@alexjcollins It is on all pages, not only those in the sitemap. For the moment I can't link to the website because I disabled the plugin to make my website appear in Google again.
@alexjcollins We're also having this issue, you can see it on the header - https://jonesfoster.com
Our hosting provider came back with:
# grep -rnw /xxx/xxx -e 'X-Robots-Tag'
/xxx/vendor/ether/seo/src/services/SeoService.php:18: * Adds the `X-Robots-Tag` header to the request if needed.
So something along those lines is adding it. The SEO plugin is 3.4.4, but we can't update to the latest just yet.
I've updated to Craft 3.3.3 and SEO 3.6.2 and the problem is still there.
Just for clarification, our client pointed this out when they tried to put their site into Google Search Console and it wouldn't index any pages.
EDIT: It seems it only applies when dev mode is on! 🤦♂️ Case closed... But it might be worth mentioning this in the docs.
I had the same issue. We changed production to devMode true just to see quickly what the exact error was. We changed back to devMode false and cleared the cache; somehow, however, this did not remove the header. Unfortunately we did not notice this, and our client experienced a drop in ranking and notified us.
It may be safer to base this not on devMode but on the environment settings, for example anything other than 'production'.
It looks like I have the same problem: the site was in dev mode, then changed to production via the .env file. But robots.txt is still set to User-agent: * Disallow: /
How did you manage to apply the change to prod?
PS: the site is https://www.anis.ch
@puck3000 Is the site definitely in production mode?
Also, what does your system Robots setting look like? Here's the default for reference:
@alexjcollins Thank you for caring ;-) Yes, the site definitely is in production; the "ENVIRONMENT" variable in .env is set to "production", and the robots settings are untouched and look the same as in your screenshot.
@puck3000 Thanks for the reply.
Okay, that's really strange – if the robots settings are identical, you should have a sitemap reference at the top of your robots.txt file.
Is there any chance that you already have a physical robots.txt file in /web that could be overriding the plugin-generated version?
Hi Alex, I checked and no, there's no robots.txt file. Then I tried to add one, and strangely, even if I add a "physical" robots.txt to the web root, I still see User-agent: * Disallow: / on anis.ch/robots.txt... When I place another file, like text.txt, in the web root, it works as it should.
Is there any other place where this "wrong" robots.txt could be generated?
@puck3000 When in production mode, do you have devMode set to true in config/general.php?
no, it is only set in dev mode:
<?php
/**
 * General Configuration
 *
 * All of your system's general configuration settings go in here. You can see a
 * list of the available settings in vendor/craftcms/cms/src/config/GeneralConfig.php.
 *
 * @see \craft\config\GeneralConfig
 */
return [
    // Global settings
    '*' => [
        // Default Week Start Day (0 = Sunday, 1 = Monday...)
        'defaultWeekStartDay' => 1,
        // Whether generated URLs should omit "index.php"
        'omitScriptNameInUrls' => true,
        // Control Panel trigger word
        'cpTrigger' => 'admin',
        // The secure key Craft will use for hashing and encrypting data
        'securityKey' => getenv('SECURITY_KEY'),
        // Whether to save the project config out to config/project.yaml
        // (see https://docs.craftcms.com/v3/project-config.html)
        'useProjectConfigFile' => false,
    ],
    // Dev environment settings
    'dev' => [
        // Dev Mode (see https://craftcms.com/guides/what-dev-mode-does)
        'devMode' => true,
    ],
    // Staging environment settings
    'staging' => [
        // Set this to `false` to prevent administrative changes from being made on staging
        'allowAdminChanges' => true,
    ],
    // Production environment settings
    'production' => [
        // Set this to `false` to prevent administrative changes from being made on production
        'allowAdminChanges' => true,
    ],
];
Should I set it explicitly to false in production?
@puck3000 Might be worth giving it a go, although I'm pretty sure it'll be false by default.
@alexjcollins Sadly you were right, setting it explicitly didn't change anything...
@puck3000 It’s a big ask, but is there any possibility of sending over your site files and a database dump?
If you can, please could you send to alex@ethercreative.co.uk
Possibly having the same issue with X-Robots-Tag: none, noimageindex being applied, but I cannot find the source. Is there something I can look up/change to remove the tag?
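One possible workaround, until the cause is found: strip the header again just before the response is sent. This is only a sketch, assuming you have a custom Craft module whose init() runs on every request; it is not an official plugin setting.
use yii\base\Event;
use yii\web\Response;

// Hypothetical workaround: remove the x-robots-tag header (which the SEO plugin
// may have set earlier in the request) right before Yii sends the response.
Event::on(
    Response::class,
    Response::EVENT_BEFORE_SEND,
    function (Event $event) {
        /** @var Response $response */
        $response = $event->sender;
        $response->getHeaders()->remove('x-robots-tag');
    }
);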
I had the same issue: ENVIRONMENT="production", but if devMode is set to true in general.php it still activated the X-Robots-Tag!
I ran into this issue while migrating content: devMode was true in production while I was troubleshooting, and I left it on in case something came up.
This resulted in about 60 important pages being unindexed over a couple of days.
I tested with https://search.google.com/search-console in both settings and found this to be the culprit.
I'm now wary, but it would be nice to have the option to ignore the X-Robots-Tag based on the general.php config settings. Was this a holdover from Craft CMS 2?
I've also discovered this bug; for me it was because I had ENVIRONMENT=live instead of ENVIRONMENT=production. That's a pretty severe bug for an SEO plugin to have.
Facing this issue as well. I used to have dev, staging and prod instead. This plugin literally checks for production. Not good.
src/services/SeoService.php :: 26
if (CRAFT_ENVIRONMENT !== 'production')
{
    $headers->set('x-robots-tag', 'none, noimageindex');
    return;
}
Neither a plugin nor Craft itself should hard-code this.
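For context, an assumption based on the standard Craft 3 project template rather than on this plugin: web/index.php typically defines CRAFT_ENVIRONMENT from the ENVIRONMENT variable, so any value other than 'production' (for example 'live' or 'prod', as reported above) makes that check fail and the blocking header gets sent.
// web/index.php in the default craftcms/craft project template (may differ per project):
define('CRAFT_ENVIRONMENT', getenv('ENVIRONMENT') ?: 'production');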
Thank you @bertoost !
We were trying to find a solution, as Google was reporting that the x-robots-tag was stopping a client site from being crawled. Our env was set to prod.
Even removing
if (CRAFT_ENVIRONMENT !== 'production')
{
    $headers->set('x-robots-tag', 'none, noimageindex');
    return;
}
does not solve the issue; x-robots-tag is then set to none.
Still getting this issue with the Craft CMS 4 version.
// services/SeoService.php
$env = getenv('ENVIRONMENT') ?? getenv('CRAFT_ENVIRONMENT');
If I use CRAFT_ENVIRONMENT and not ENVIRONMENT, $env returns false from the line above, so the header is set to block robots.
If I include both (which is ridiculous), as above, it works:
ENVIRONMENT=production
CRAFT_ENVIRONMENT=production
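The likely cause, assuming standard PHP behaviour rather than anything plugin-specific: getenv() returns false, not null, for a missing variable, so the null coalescing operator never falls back to the second lookup. A sketch of a more forgiving lookup using the short ternary instead:
// `??` only falls back on null; getenv() returns false when a variable is unset.
// `?:` falls back on any falsy value (sketch, not the plugin's actual code):
$env = getenv('ENVIRONMENT') ?: getenv('CRAFT_ENVIRONMENT') ?: 'production';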
Why isn't that condition based on the devMode or disallowRobots Craft config settings? That way you wouldn't have to include specific environment variables that may differ from one dev to another.
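For illustration, a sketch of what such a check could look like using Craft's own general config instead of comparing environment names. The $headers variable is assumed to be the response header collection, as in the snippet quoted earlier; this is not the plugin's actual code.
use Craft;

$general = Craft::$app->getConfig()->getGeneral();

// Hypothetical: block robots when devMode is on or disallowRobots is set,
// regardless of what the environment happens to be called.
if ($general->devMode || $general->disallowRobots) {
    $headers->set('x-robots-tag', 'none, noimageindex');
    return;
}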
You just saved my life. Adding ENVIRONMENT=production on top of CRAFT_ENVIRONMENT=production fixed it for me.
@pascalminator you're welcome! Still, I would like a follow up from the creators 😆
I'm still having this issue. The pages are not being found. Pretty big issue! These are my configs:
.env
ENVIRONMENT=production
CRAFT_ENVIRONMENT=production
config/general.php
Robots settings inside SEO plugin:
Running all of this on Craft CMS v4.2.3 and ether/seo v4.0.3.
When surfing to my domain.com/robots.txt I still get this:
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
@SkermBE
The URL next to "Referring Page" in your page indexing screenshot is indeed blocking Googlebot:
User-agent: Googlebot
Disallow: /?*
User-agent: Baiduspider
Disallow: /?*
User-agent: YandexBot
Disallow: /?*
User-agent: ichiro
Disallow: /?*
User-agent: sogou spider
Disallow: /?*
User-agent: Sosospider
Disallow: /?*
User-agent: YoudaoBot
Disallow: /?*
User-agent: YetiBot
Disallow: /?*
User-agent: bingbot
Crawl-delay: 2
Disallow: /?*
User-Agent: Yahoo! Slurp
Crawl-delay: 2
Disallow: /?*
User-agent: rdfbot
Disallow: /?*
User-agent: Seznambot
Request-rate: 1/2s
Disallow: /?*
User-agent: ia_archiver
Disallow:
User-agent: Mediapartners-Google
Disallow:
Is this the correct domain?
When surfing to my domain.com/robots.txt I still get this: User-agent: * Disallow: /cpresources/ Disallow: /vendor/ Disallow: /.env
The SEO settings screenshot shows this is correct, as you're in production mode.
@jamiematrix I know nothing about that referring page. When I surf to my own domain (which is not 4rank.bid or some weird thing) I get the robots.txt as mentioned:
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
But still Google search console is saying it's being blocked. When doing a live test now, this is the result:
Seems to be fixed with: https://github.com/ethercreative/seo/issues/432
Just got mega burned by this. Thanks to those who found and submitted PRs.
Here is a hotfix for the template; it does not solve the bug, just suppresses the symptoms:
{% if craft.app.config.env == 'production' %}
    {% header "X-Robots-Tag: all" %}
{% else %}
    {% header "X-Robots-Tag: noindex, nofollow, none" %}
{% endif %}
Got hit with this too the other day, again.
The bug is still there. Thank you @jesuismaxime!
Sorry Ether, but the level of support for this plugin is starting to get ridiculous. We also have big sites hit by this issue, again. Many developers have offered you money, support, or PRs, but it seems you're leaving us in the dark here.
Same issue here using Craft 4 (latest) and SEO plugin 4.0.3 (latest); Google is not indexing any page because of x-robots-tag: none, noimageindex.
CRAFT_ENVIRONMENT=production
DISALLOW_ROBOTS=false
DEV_MODE=false
This could ruin any website's SEO strategy. We're using the latest version; is there no solution for this yet?
In this case it was fixed by adding the "ENVIRONMENT" variable (see https://github.com/ethercreative/seo/issues/432):
CRAFT_ENVIRONMENT=production
ENVIRONMENT=production
The plugin is now labeled as no longer maintained https://plugins.craftcms.com/seo
I'd recommend SEOMate.
Seems like they are resuming work on the plugin: https://github.com/ethercreative/seo/issues/447#issuecomment-1498974519
I installed this plugin only to have sitemap functionality, but I recently noticed that it attaches the X-Robots-Tag: none, noimageindex HTTP header to every page by default. Why is that? Can I fix it somehow?