jdillard / sphinx-sitemap

Sphinx extension to generate a multi-lingual, multi-version sitemap for HTML builds
https://sphinx-sitemap.readthedocs.io/en/latest/index.html
MIT License

sitemap_locales option #25

Closed liborjelinek closed 4 years ago

liborjelinek commented 4 years ago

I propose adding a sitemap_locales configuration option to manually set the locales that appear in the sitemap as alternates to a page.

I've started using sphinx-sitemap for my blog at https://blog.documatt.com. It is single-language and will likely stay that way. It's Sphinx with the Ablog extension for blogging. Ablog is a great extension that ships with many supported locales, such as Chinese, Korean, and Spanish.

Unfortunately, this means that when another extension such as sphinx-sitemap reads app.builder.config.locale_dirs, it gets a long list of locale directories. The previous get_locales() then produces the same long list of locales, even though they come from a third-party extension!

So I added the sitemap_locales option to manually override which locales are listed in the sitemap. By default, nothing changes (autodetection still takes place).

But I can limit the alternate URLs (e.g. sitemap_locales = ['es', 'fr']) or turn them off completely (sitemap_locales = [None]).
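
To make the intended behaviour concrete, here is a minimal sketch of the override logic described above. It is not the actual patch: detect_locales and resolve_locales are hypothetical helper names, and the assumption that autodetection treats every subdirectory of a locale directory as a locale is mine.

```python
import os
import tempfile


def detect_locales(locale_dirs):
    """Hypothetical autodetection: each subdirectory of a locale dir counts as a locale."""
    found = []
    for locale_dir in locale_dirs:
        if not os.path.isdir(locale_dir):
            continue  # skip locale dirs that do not exist on disk
        for name in sorted(os.listdir(locale_dir)):
            if os.path.isdir(os.path.join(locale_dir, name)) and name not in found:
                found.append(name)
    return found


def resolve_locales(sitemap_locales, locale_dirs):
    """Hypothetical helper: a manual sitemap_locales setting wins over autodetection."""
    if not sitemap_locales:
        # Option unset (None or empty list): fall back to autodetection.
        return detect_locales(locale_dirs)
    # [None] means "no alternates at all", so the None placeholder is dropped.
    return [locale for locale in sitemap_locales if locale is not None]
```

With a locale directory containing es, fr and zh_CN, an unset option would yield all three, ['es', 'fr'] would yield just those two, and [None] would yield an empty list, so no alternate links are written.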

jdillard commented 4 years ago

This makes sense, thanks for the pull request! I'm going to try to test this out and merge this week.

liborjelinek commented 4 years ago

On multilingual sitemaps: I really dislike this style, where each entry gives the primary language, then lists every language including the primary one again, and the whole thing repeats for each language. But it looks like this is the official way to describe a multilingual website.

Do I understand correctly that if I have three pages (en/index.html and its alternatives es/index.html and fr/index.html), the following monster is the way to express that in a sitemap?

<url>
          <loc>https://my-site.com/docs/en/index.html</loc>
          <xhtml:link href="https://my-site.com/docs/es/index.html" hreflang="es" rel="alternate"/>
          <xhtml:link href="https://my-site.com/docs/fr/index.html" hreflang="fr" rel="alternate"/>
          <xhtml:link href="https://my-site.com/docs/en/index.html" hreflang="en" rel="alternate"/>
</url>
<url>
          <loc>https://my-site.com/docs/es/index.html</loc>
          <xhtml:link href="https://my-site.com/docs/fr/index.html" hreflang="fr" rel="alternate"/>
          <xhtml:link href="https://my-site.com/docs/en/index.html" hreflang="en" rel="alternate"/>
          <xhtml:link href="https://my-site.com/docs/es/index.html" hreflang="es" rel="alternate"/>
</url>
<url>
          <loc>https://my-site.com/docs/fr/index.html</loc>
          <xhtml:link href="https://my-site.com/docs/en/index.html" hreflang="en" rel="alternate"/>
          <xhtml:link href="https://my-site.com/docs/es/index.html" hreflang="es" rel="alternate"/>
          <xhtml:link href="https://my-site.com/docs/fr/index.html" hreflang="fr" rel="alternate"/>
</url>

jdillard commented 4 years ago

I agree, but I can only assume it is meant to make the sitemap easier for machines to digest. The page linked below, under "Methods for indicating your alternate pages", gives a sitemap example (in a dropdown) and states as its first guideline:

Each language version must list itself as well as all other language versions.

https://support.google.com/webmasters/answer/189077
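
As an illustration of that guideline, the following sketch builds one url entry per language, each listing itself and every other language as an alternate. It uses only the standard library; build_urlset and its parameters are hypothetical names for this example and do not reflect how sphinx-sitemap itself is written.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Register prefixes so serialized output uses <url> and <xhtml:link>.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_urlset(base_url, locales, page):
    """Hypothetical helper: one <url> per locale, each with all locales as alternates."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for primary in locales:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url, f"{{{SITEMAP_NS}}}loc")
        loc.text = f"{base_url}{primary}/{page}"
        for alternate in locales:
            # hreflang always names the language of the href it sits next to,
            # and the primary language is included among the alternates.
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                {
                    "rel": "alternate",
                    "hreflang": alternate,
                    "href": f"{base_url}{alternate}/{page}",
                },
            )
    return urlset
```

Calling build_urlset("https://my-site.com/docs/", ["en", "es", "fr"], "index.html") and passing the result to ET.tostring(..., encoding="unicode") should give three url entries with three alternate links each, matching the shape of the example earlier in the thread.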

jdillard commented 4 years ago

Thanks again for this addition!