nystudio107 / craft-seomatic

SEOmatic facilitates modern SEO best practices & implementation for Craft CMS 3. It is a turnkey SEO system that is comprehensive, powerful, and flexible.
https://nystudio107.com/plugins/seomatic

URLs that contain `?token=` don't follow `X-Robots-Tag` behavior set by Craft and can become indexable #1394

Closed aaronbushnell closed 6 months ago

aaronbushnell commented 6 months ago

Describe the bug

In craftcms/cms#5698 Brandon implemented a rule that all tokenized requests have a header of X-Robots-Tag: none. I've had a couple sites where a client accidentally does what Brandon describes in that issue:

...Google wouldn’t ever see those URLs by default. My guess is that someone posted a tokenized URL someplace where Google crawled, and then it discovered the tokenized URLs from there.

To combat that, he implemented https://github.com/craftcms/cms/commit/a274b9127c64f9a11e5ea3f9b94f1b1691df58c7, which ensures that any page requested through one of these ?token= URLs has X-Robots-Tag: none set.

However, when using SEOmatic, that behavior is overridden. If I disable the following line, the Craft behavior takes effect: https://github.com/nystudio107/craft-seomatic/blob/45b2e15596d969e0883778b073b4fa4c2ae11af7/src/helpers/DynamicMeta.php#L169

Could SEOmatic be modified to ensure this treatment of tokenized URLs is consistent with how Craft handles them? I can't think of a scenario where that would be undesirable.

To reproduce

Steps to reproduce the behaviour:

  1. Create a tokenized entry URL on a vanilla Craft install
  2. Visit the URL and click on another link that strips out the x-craft-preview query param (leaving token)
  3. View headers to see X-Robots-Tag: none
  4. Install SEOmatic
  5. Refresh the URL without the x-craft-preview, but with token
  6. See the headers default to the section/entry robot settings instead of forcing none
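Steps 3 and 6 come down to reading one response header, which can also be done from a script. The following is a minimal Python sketch, not part of Craft or SEOmatic: the helper names `fetch_robots_tag` and `blocks_indexing` are hypothetical, and the parsing assumes a plain comma-separated directive list (it ignores user-agent-scoped values such as `googlebot: noindex`).

```python
from urllib.request import urlopen


def fetch_robots_tag(url, timeout=10.0):
    """Return the X-Robots-Tag response header for `url`, or None if absent.

    Hypothetical helper for illustration; it performs a plain GET request.
    """
    with urlopen(url, timeout=timeout) as response:
        return response.headers.get("X-Robots-Tag")


def blocks_indexing(robots_tag):
    """Report whether a header value tells crawlers not to index the page.

    Assumes a simple comma-separated list of directives. "none" is
    treated as equivalent to "noindex, nofollow".
    """
    if robots_tag is None:
        return False
    directives = {part.strip().lower() for part in robots_tag.split(",")}
    return "none" in directives or "noindex" in directives
```

Calling `blocks_indexing(fetch_robots_tag(url))` on the tokenized URL before and after installing SEOmatic should show the difference described in steps 3 and 6.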

Expected behaviour

SEOmatic should treat URLs with ?token= the same way Craft does—by applying X-Robots-Tag: none (to prevent client whoopsies 🙃)
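The rule being asked for can be sketched in a few lines. This is an illustrative Python model only, not SEOmatic's or Craft's actual code: the function name `robots_tag_for` is hypothetical, `configured_tag` stands in for whatever the section/entry robots setting would produce, and the query parameter name is assumed to be `token` as in the URLs above.

```python
from urllib.parse import parse_qs, urlsplit


def robots_tag_for(url, configured_tag, token_param="token"):
    """Pick the X-Robots-Tag value for a request URL.

    Hypothetical sketch: a non-empty token query parameter forces "none";
    otherwise the configured value is used unchanged.
    """
    # parse_qs drops blank values by default, so "?token=" counts as no token.
    query = parse_qs(urlsplit(url).query)
    has_token = any(value for value in query.get(token_param, []))
    return "none" if has_token else configured_tag
```

With this sketch, `robots_tag_for("https://example.com/news/post?token=abc123", "all")` returns `"none"`, while the same URL without the query string returns `"all"`.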

Versions

khalwat commented 6 months ago

Yeah, it sounds reasonable to me. I just need to see when ::getToken() was added to the Request object in Craft CMS.

Looks like it's Craft 3.2... easy enough :)

khalwat commented 6 months ago

Addressed in: https://github.com/nystudio107/craft-seomatic/commit/4ce3cb88ab1a358656e0420e3cc9d581b5c7423b & https://github.com/nystudio107/craft-seomatic/commit/c34b6d0b585abde52efaeac5e8ec7045c8f5772b

Craft CMS 3:

You can try it now by setting your semver in your composer.json to look like this:

    "nystudio107/craft-seomatic": "dev-develop as 3.4.69",

Then run `composer clear-cache && composer update`

…..

Craft CMS 4:

You can try it now by setting your semver in your composer.json to look like this:

    "nystudio107/craft-seomatic": "dev-develop-v4 as 4.0.38",

Then run `composer clear-cache && composer update`

aaronbushnell commented 6 months ago

Thanks, @khalwat!