adobe / aem-core-wcm-components

Standardized components to build websites with AEM.
https://docs.adobe.com/content/help/en/experience-manager-core-components/using/introduction.html
Apache License 2.0

Support sitemaps for WCM Core Components Pages #1697

Closed bpauli closed 3 years ago

bpauli commented 3 years ago

As a business user, I want to enable a sitemap for a page so that I can add this sitemap to the Google Search Console (or similar) and/or to my site's robots.txt, in order to notify a crawler about all pages of my site.

Deliverables

Acceptance criteria

 

bpauli commented 3 years ago

fixed in #1647

davidjgonzalez commented 2 years ago

@bpauli This is listed as being released in 2.17.6, but I'm not seeing (many?) of these changes in the master branch anymore. Were these removed? Never released in 2.17.6? I'm confused as to what the status of this feature is in the latest Core Components.

kwin commented 2 years ago

I think most of the low-level implementation is in https://github.com/apache/sling-org-apache-sling-sitemap. In addition, there is the higher-level implementation in the bundle com.adobe.aem.wcm.seo. Unfortunately the latter is not really documented; maybe @Buuhuu has more insights. Also, I am not sure whether those bundles ship with AEM 6.5.12 or only with AEMaaCS...

buuhuu commented 2 years ago

@kwin is right. The AEM-specific implementation was side-ported to 6.5 with SP11.

@kwin some documentation can be found here: https://experienceleague.adobe.com/docs/experience-manager-cloud-service/content/overview/seo-and-url-management.html?lang=en#building-an-xml-sitemap-on-aem. What exactly do you think is missing?

kwin commented 2 years ago

Core WCM Components claim to be compatible with 6.5.10, and I can't find any links or hints in https://github.com/adobe/aem-core-wcm-components/tree/main/content/src/content/jcr_root/apps/core/wcm/components/page/v3/page#page-v3 about the sitemap support or any limitations. It would be good to extend the README in that regard.

davidjgonzalez commented 2 years ago

So Core Components are not required at all to support any facet of the Sitemaps that were added in 2.17.6, in AEM 6.5.11+ or AEM CS? (I see there are still some methods from the Sitemaps PR in Core Components master, but I'm not sure if they're required.) In what version of Core Components was this code removed? It didn't jump out in a quick review of the history.

I'd echo @kwin's concern - unless I'm missing something - it seems like this is a breaking change in CC for users of 6.5.10.

buuhuu commented 2 years ago

It is not a breaking change for users of 6.5.10, as the com.adobe.aem.wcm.seo bundle is wired optionally and, if it is not available, the Page Component falls back to the original implementation where applicable, for example for the canonical link.
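The fallback described above can be sketched roughly as follows. This is only an illustration of the "use the optional service if present, otherwise fall back" pattern, not the actual Page Component code: `SeoCanonicalProvider`, `canonicalLink` and the fallback function are hypothetical names, and the real wiring to com.adobe.aem.wcm.seo may look quite different.

```java
import java.util.Optional;
import java.util.function.Function;

final class CanonicalLinkSketch {

    // Hypothetical stand-in for a service offered by the optional
    // com.adobe.aem.wcm.seo bundle; not its real API.
    interface SeoCanonicalProvider {
        String canonicalUrl(String pagePath);
    }

    // Uses the SEO service when it is wired (non-null) and returns a value;
    // otherwise falls back to the component's own logic, passed in here as
    // a plain function for illustration.
    static String canonicalLink(String pagePath,
                                SeoCanonicalProvider seo,
                                Function<String, String> fallback) {
        return Optional.ofNullable(seo)
                .map(service -> service.canonicalUrl(pagePath)) // empty if the service returns null
                .orElseGet(() -> fallback.apply(pagePath));
    }
}
```

With `seo` set to `null`, which is how a missing bundle on 6.5.10 is modelled in this sketch, the result comes entirely from the fallback function, so the rendered canonical link would not change.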

Yes, technically speaking, the core components are not required for Sitemaps to work. However, there are some parts of the WCM Core Components Page that must be in line with the data in the Sitemap, in particular canonical links, alternate language links and the noindex behaviour. After all, the rendered pages should not send different signals to crawlers than the sitemap does.

AFAIK there has never been anything "removed" from the Core Components related to this area.

davidjgonzalez commented 2 years ago

@Buuhuu so a 6.5.10 customer on CC 2.17.6 will see no change in behavior (regardless of how they configured CC 2.17.6 sitemaps) when they upgrade to <whatever CC version removes this>?

buuhuu commented 2 years ago

As said earlier, nothing has been removed from the CC with respect to Sitemaps. Let me summarise the compatibility:

| AEM | CC | Sitemap | Canonical Link | Robots Tags | Alternative Language Links |
| --- | --- | --- | --- | --- | --- |
| any | <2.17.6 | custom / ACS | custom | custom | custom |
| <= 6.5 SP10 | 2.17.6+ | custom / ACS | using Link#getExternalizedURL() | custom | custom |
| >= 6.5 SP11 | 2.17.6+ | Apache Sling Sitemap / Sites SEO | Sites SEO | Sites SEO | Sites SEO |

Technically the CC are not necessary to use Sites SEO Sitemaps with 6.5 SP11 and newer for most cases.

The only thing that the CC provide to the Sitemaps is part of the logic used to find the alternative language links. Using the LanguageNavigationSiteRootSelectionStrategy guarantees that the sitemap contains the same language alternatives as the language navigation component does. However, this is opt-in functionality, and in the majority of cases the default implementation of the strategy provided by the Sites SEO implementation bundle will work. It is preconfigured to use the 2nd-level node as the "site root" (e.g. /content/wknd) and to search up to 2 levels deep for language roots (e.g. /content/wknd/us/en).
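To illustrate the default described above, here is a minimal sketch that works on plain path strings. It is not the Sites SEO implementation: the class, constants and method names are hypothetical, and it only checks depth, whereas a real strategy would also have to decide whether a node actually is a language root.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class SiteRootSketch {

    // Assumed defaults, taken from the description above.
    static final int SITE_ROOT_LEVEL = 2;    // /content = level 1, /content/wknd = level 2
    static final int MAX_LANGUAGE_DEPTH = 2; // levels below the site root to search

    // Splits "/content/wknd/us/en" into [content, wknd, us, en].
    static List<String> segments(String path) {
        return Arrays.stream(path.split("/"))
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    // Returns the 2nd-level ancestor of the page, or null if the path is too short.
    static String siteRoot(String pagePath) {
        List<String> segs = segments(pagePath);
        if (segs.size() < SITE_ROOT_LEVEL) {
            return null;
        }
        return "/" + String.join("/", segs.subList(0, SITE_ROOT_LEVEL));
    }

    // True if the page sits 1 or 2 levels below the site root, i.e. within
    // the range in which language roots would be searched.
    static boolean isLanguageRootCandidate(String pagePath) {
        int depth = segments(pagePath).size() - SITE_ROOT_LEVEL;
        return depth >= 1 && depth <= MAX_LANGUAGE_DEPTH;
    }
}
```

Under these assumptions, `siteRoot("/content/wknd/us/en")` yields `/content/wknd`, and both `/content/wknd/us` and `/content/wknd/us/en` fall within the search range, while `/content/wknd/us/en/adventures` does not.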

I agree that the 2.17.6 release notes should probably make it clear that "Support for canonical links, robots tags and alternative language links" was added, not Sitemaps. @bpauli can you update that?