kjvarga / sitemap_generator

SitemapGenerator is a framework-agnostic XML Sitemap generator written in Ruby with automatic Rails integration. It supports Video, News, Image, Mobile, PageMap and Alternate Links sitemap extensions and includes Rake tasks for managing your sitemaps, as well as many other great features.
MIT License

Feature request: Make `default_host` optional / Allow full URL instead of path #414

Closed (lcoq closed 2 years ago)

lcoq commented 2 years ago

Goal

I'm trying to reproduce this example from the Google documentation, which uses multiple hosts:

 <url>
    <loc>https://www.example.com/english/page.html</loc>
    <xhtml:link
               rel="alternate"
               hreflang="de"
               href="https://www.example.de/deutsch/page.html"/>
    <xhtml:link
               rel="alternate"
               hreflang="de-ch"
               href="https://www.example.de/schweiz-deutsch/page.html"/>
    <xhtml:link
               rel="alternate"
               hreflang="en"
               href="https://www.example.com/english/page.html"/>
  </url>
  <url>
    <loc>https://www.example.de/deutsch/page.html</loc>
    <xhtml:link
               rel="alternate"
               hreflang="de"
               href="https://www.example.de/deutsch/page.html"/>
    <xhtml:link
               rel="alternate"
               hreflang="de-ch"
               href="https://www.example.de/schweiz-deutsch/page.html"/>
    <xhtml:link
               rel="alternate"
               hreflang="en"
               href="https://www.example.com/english/page.html"/>
  </url>
  <url>
    <loc>https://www.example.de/schweiz-deutsch/page.html</loc>
    <xhtml:link
               rel="alternate"
               hreflang="de"
               href="https://www.example.de/deutsch/page.html"/>
    <xhtml:link
               rel="alternate"
               hreflang="de-ch"
               href="https://www.example.de/schweiz-deutsch/page.html"/>
    <xhtml:link
               rel="alternate"
               hreflang="en"
               href="https://www.example.com/english/page.html"/>
  </url>

Problem

SitemapGenerator::Sitemap.default_host is required and is always prepended to the URL passed as the first argument to #add, even when that URL already includes a protocol and a host.

Code

SitemapGenerator::Sitemap.default_host = "https://dont-want-to-use.it"
SitemapGenerator::Sitemap.create do
  each_locale do |locale|
    each_page do |page|
      # Full URL (protocol + host) for the current locale.
      main_url = page.url(only_path: false, host: host_for_locale(locale))
      # One alternate per locale, each with its own host and hreflang.
      alternates = each_locale.map do |alt_locale|
        { href: page.url(only_path: false, host: host_for_locale(alt_locale)), lang: alt_locale.to_s }
      end
      add main_url, alternates: alternates
    end
  end
end

Just in case, here are the helper definitions:

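def each_locale(&block)
  return to_enum(:each_locale) unless block_given?
  I18n.available_locales.each do |locale|
    # with_locale restores the previous locale after the block, so nested calls are safe.
    I18n.with_locale(locale) { yield locale }
  end
end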

def each_page(&block)
  Page.with_locale(I18n.locale).find_each(&block)
end

def host_for_locale(locale)
  locale == :fr ? "myhost.fr" : "myhost.com"
end

But the result looks like this:

<url>
  <loc>https://dont-want-to-use.it/https://myhost.com/pages/1</loc>
  <xhtml:link rel="alternate" href="https://myhost.com/pages/1" hreflang="en"/>
  <xhtml:link rel="alternate" href="https://myhost.fr/pages/1" hreflang="fr"/>
</url>

Did I miss something? Is there another way to produce the sitemap above?

Thank you!

kjvarga commented 2 years ago

Just pass the `host` option to `add`. See https://github.com/kjvarga/sitemap_generator#supported-options-to-add
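
For readers landing here, a rough sketch of what that could look like for the two hosts in the question. The `HOSTS` mapping and the `sitemap_entry` helper are made up for illustration and are not part of the gem. The sketch assumes `add` receives a host-less path plus a `host` option that includes the protocol, and an `alternates` array of hashes with `href` and `lang` keys. The gem-specific part only runs if sitemap_generator is already loaded.

```ruby
# Hypothetical locale-to-host mapping (not part of sitemap_generator).
# The hosts include the scheme because they become the base of each <loc> URL.
HOSTS = { en: "https://myhost.com", fr: "https://myhost.fr" }.freeze

# Hypothetical helper: builds the [path, options] pair for one page and locale.
# `path` is a host-less path such as "/pages/1".
def sitemap_entry(path, locale, hosts = HOSTS)
  # One alternate link per locale, each pointing at that locale's host.
  alternates = hosts.map do |lang, host|
    { href: "#{host}#{path}", lang: lang.to_s }
  end
  [path, { host: hosts.fetch(locale), alternates: alternates }]
end

# Only executed when the sitemap_generator gem has been required.
if defined?(SitemapGenerator)
  # default_host still has to be set; the per-link `host` option overrides it.
  SitemapGenerator::Sitemap.default_host = "https://myhost.com"
  SitemapGenerator::Sitemap.create do
    HOSTS.each_key do |locale|
      path, options = sitemap_entry("/pages/1", locale)
      add path, options
    end
  end
end
```

The point is that the first argument to `add` stays a plain path and the host varies per link, so `default_host` is no longer prepended to a URL that already has one.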

lcoq commented 2 years ago

Oh sorry, I didn't see it! Thank you!