nystudio107 / craft-seomatic

SEOmatic facilitates modern SEO best practices & implementation for Craft CMS 3. It is a turnkey SEO system that is comprehensive, powerful, and flexible.
https://nystudio107.com/plugins/seomatic

Is it possible to have separate xml and page paths in sitemaps? #1427

Closed by benfeather 4 months ago

benfeather commented 4 months ago

Question

I'm using Craft in headless mode and have the CMS and the frontend set up on different domains, like so:

The CMS is located at: cms.example.com
The website is located at: www.example.com

When I use SEOmatic to generate sitemaps, it generates this output:

Index:

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap>
        <loc>https://www.example.com/sitemaps-1-section-blog-1-sitemap.xml</loc>
        <lastmod>2024-02-23T11:10:02+13:00</lastmod>
    </sitemap>
</sitemapindex>

Blog:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
    xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">
    <url>
        <loc>https://www.example.com/blog</loc>
        <lastmod>2024-02-23T11:10:02+13:00</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.5</priority>
    </url>
</urlset>

In the sitemap index file, the location of the XML files should be cms.example.com, not www.example.com.

I see there is a Site URL Override option, but if I change that to the CMS's domain, the output becomes:

Index:

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap>
        <loc>http://cms.example.com/sitemaps-1-section-blog-1-sitemap.xml</loc>
        <lastmod>2024-02-23T11:10:02+13:00</lastmod>
    </sitemap>
</sitemapindex>

Blog:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
    xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">
    <url>
        <loc>http://cms.example.com/blog</loc>
        <lastmod>2024-02-23T11:10:02+13:00</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.5</priority>
    </url>
</urlset>

The index paths are now correct (the sitemap paths use cms.example.com), but this also changes the page URLs.

Is there a way to set this up so that all XML files use the CMS URL (cms.example.com) and all of the page paths use the site URL (www.example.com)?

Any help would be appreciated.

khalwat commented 4 months ago

According to the spec, the sitemap files need to be served from the same domain as the URLs they point to:

https://www.sitemaps.org/protocol.html

Note that this means that all URLs listed in the Sitemap must use the same protocol (http, in this example) and reside on the same host as the Sitemap. For instance, if the Sitemap is located at http://www.example.com/sitemap.xml, it can't include URLs from http://subdomain.example.com.
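
As a rough illustration of the rule quoted above, here is a minimal Python sketch that lists the `<loc>` entries of a sitemap whose protocol or host differs from the URL the sitemap itself is served from. It is not part of SEOmatic; the function name `mismatched_locs` and the sample data are hypothetical, and the sketch checks only scheme and host, not any other requirement of the protocol.

```python
from urllib.parse import urlsplit
import xml.etree.ElementTree as ET

# Namespace used by <urlset>, <url>, and <loc> in the sitemap protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Sample data modelled on the blog sitemap from the question (hypothetical).
BLOG_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>https://www.example.com/blog</loc>
        <lastmod>2024-02-23T11:10:02+13:00</lastmod>
    </url>
</urlset>
"""


def mismatched_locs(sitemap_url: str, sitemap_xml: str) -> list[str]:
    """Return <loc> values whose scheme or host differs from the sitemap's own URL."""
    base = urlsplit(sitemap_url)
    root = ET.fromstring(sitemap_xml.strip())
    bad = []
    # Only sitemap-namespace <loc> elements are inspected; image or video
    # extension elements live in other namespaces and are skipped.
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        text = (loc.text or "").strip()
        target = urlsplit(text)
        if (target.scheme, target.netloc.lower()) != (base.scheme, base.netloc.lower()):
            bad.append(text)
    return bad


# Sitemap served from the CMS domain while listing frontend URLs: flagged.
print(mismatched_locs(
    "https://cms.example.com/sitemaps-1-section-blog-1-sitemap.xml", BLOG_SITEMAP
))  # ['https://www.example.com/blog']

# Sitemap served from the same host as the pages it lists: nothing flagged.
print(mismatched_locs(
    "https://www.example.com/sitemaps-1-section-blog-1-sitemap.xml", BLOG_SITEMAP
))  # []
```

Under this check, the setup asked about in the question (XML files on cms.example.com, page URLs on www.example.com) would be flagged, while the default SEOmatic output would pass.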

benfeather commented 4 months ago

Well, I learned something today! Thank you very much.