Open RyanDool opened 5 months ago
Possible duplicate of https://github.com/az-digital/az_quickstart/issues/2357
Thank you, @trackleft. I'll touch base with @bberndt-uaz, since I see #2357 went into the 2.10.0 minor release.
Looking at the HTML for both example pages linked in the description, I can see that the head of both pages already includes a canonical link element pointing to the news.arizona.edu URL:
<link rel="canonical" href="https://news.arizona.edu/story/uarizona-leadership-presents-next-steps-financial-action-plan-focus-collaboration">
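For anyone who wants to repeat this check on other imported stories, here is a minimal sketch using only the Python standard library. It assumes you already have the page HTML as a string (fetching is left out), and the names `CanonicalFinder` and `find_canonicals` are made up for illustration; they are not part of Quickstart.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect href values of <link rel="canonical"> elements."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html_text):
    """Return every canonical URL declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals


# Sample markup built around the canonical element quoted above.
sample = (
    "<html><head>"
    '<link rel="canonical" href="https://news.arizona.edu/story/'
    "uarizona-leadership-presents-next-steps-financial-action-plan-focus-collaboration"
    '">'
    "</head><body></body></html>"
)
print(find_canonicals(sample))
```

An empty list would mean the page declares no canonical URL; more than one entry would be worth a closer look.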
This Google documentation says:
While we encourage you to use these methods [of specifying a canonical preference], none of them are required; your site will likely do just fine without specifying a canonical preference. That's because if you don't specify a canonical URL, Google will identify which version of the URL is objectively the best version to show to users in Search.
Another method that documentation recommends is including the canonical URL in a sitemap. The example news story is already included in news.arizona.edu's sitemap, and the imported news story on arizona.edu is NOT included in arizona.edu's sitemap. I believe flexible pages are the only content type included in the sitemap by default in Quickstart.
Finally, I searched for the story on Google and I only see the news.arizona.edu link in the first ten results.
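The sitemap check can be scripted in a similar way. This is a rough sketch that assumes a plain `urlset` sitemap in the standard sitemaps.org namespace, already downloaded as a string; the helper name `sitemap_urls` and the URLs in the sample are placeholders, not the real sitemap contents.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


# Placeholder sitemap; the URL is hypothetical.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://news.example.edu/story/example-story</loc></url>
</urlset>
"""

urls = sitemap_urls(SAMPLE_SITEMAP)
print("https://news.example.edu/story/example-story" in urls)
print("https://www.example.edu/news/example-story" in urls)
```

A paginated sitemap or a sitemap index would need one extra step to follow each child sitemap before checking membership.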
It seems to me that the imported news nodes are not causing an SEO issue, but I'm curious what others think.
Problem/Motivation
This issue was discovered on arizona.edu, where the news importer creates a publicly accessible node and URL that could be seen as duplicate content, since it exactly matches what is posted on the news site.
Describe the bug
When a news story is imported, a node is created on the subdomain with the same title, subtitle, image, and page URL as the story on news.arizona.edu. An example can be seen here: Original Story posted on news.arizona.edu vs Imported Story
Proposed resolution
Possible resolutions include adding a canonical tag to the node upon creation that points to the source of truth (news.arizona.edu/story/[article title]), or making the node inaccessible to users and bots.