RavanH / xml-sitemap-feed

XML Sitemap & Google News feeds
GNU General Public License v2.0

sitemap-home.xml contains only either URL <loc> /en/ or /de/ but never all startpages (Polylang) #15

Closed gerdneuman closed 6 years ago

gerdneuman commented 6 years ago

I think I've come across an incompatibility with Polylang. We've set up our startpage / to redirect to either /de/ or /en/. See here: https://polylang.pro/doc/url-modifications/#front-page-url

What happens:

Both pages should hence be in sitemap-home.xml, but only one of them is, and which one depends on the language of the incoming request. In our case the default language is de (German), with en (English) as the other language.

See these outputs:

Default request, only https://www.flyingroasters.de/en/ is output, but https://www.flyingroasters.de/de/ is missing.

$ curl --silent 'https://www.flyingroasters.de/sitemap-home.xml'
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="https://www.flyingroasters.de/wp-content/plugins/xml-sitemap-feed/includes/xsl/sitemap.xsl?ver=4.9.4"?>
<!-- generated-on="2018-08-03T09:15:02+00:00" -->
<!-- generator="XML & Google News Sitemap Feed plugin for WordPress" -->
<!-- generator-url="https://status301.net/wordpress-plugins/xml-sitemap-feed/" -->
<!-- generator-version="4.9.4" -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
        <url>
                <loc>https://www.flyingroasters.de/en/</loc>
                <lastmod>2018-08-02T05:28:16+00:00</lastmod>
                <priority>1.0</priority>
        </url>
</urlset>

Using the Polylang cookie to set the language to German, it is the same:

$ curl --silent 'https://www.flyingroasters.de/sitemap-home.xml' -H 'Cookie: STYXKEY_pll_language=de'
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="https://www.flyingroasters.de/wp-content/plugins/xml-sitemap-feed/includes/xsl/sitemap.xsl?ver=4.9.4"?>
<!-- generated-on="2018-08-03T09:15:37+00:00" -->
<!-- generator="XML & Google News Sitemap Feed plugin for WordPress" -->
<!-- generator-url="https://status301.net/wordpress-plugins/xml-sitemap-feed/" -->
<!-- generator-version="4.9.4" -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
        <url>
                <loc>https://www.flyingroasters.de/en/</loc>
                <lastmod>2018-08-02T05:28:16+00:00</lastmod>
                <priority>1.0</priority>
        </url>
</urlset>

But when the language is set to English via the cookie, one gets https://www.flyingroasters.de/de/, and now https://www.flyingroasters.de/en/ is missing from the output (I think this is how the Google crawler would see it, because it uses English as its default language):

$ curl --silent 'https://www.flyingroasters.de/sitemap-home.xml' -H 'Cookie: STYXKEY_pll_language=en'
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="https://www.flyingroasters.de/wp-content/plugins/xml-sitemap-feed/includes/xsl/sitemap.xsl?ver=4.9.4"?>
<!-- generated-on="2018-08-03T09:15:50+00:00" -->
<!-- generator="XML & Google News Sitemap Feed plugin for WordPress" -->
<!-- generator-url="https://status301.net/wordpress-plugins/xml-sitemap-feed/" -->
<!-- generator-version="4.9.4" -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
                http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
        <url>
                <loc>https://www.flyingroasters.de/de/</loc>
                <lastmod>2018-08-02T05:28:16+00:00</lastmod>
                <priority>1.0</priority>
        </url>
</urlset>

Above, the language is only set using the Polylang cookie, but I think this also happens when the language is set via the Accept-Language header.

Expected:

All (here only two) startpages should be part of the sitemap-home.xml.

RavanH commented 6 years ago

Hi, thank you for this extensive report. Indeed, not what is supposed to happen.

I cannot reproduce this issue. One of my own sites using Polylang with the same front page URL setting, https://phareo.eu/sitemap-home.xml, is showing all language root pages as intended. So I'm wondering if it's not related to another setting or another plugin...

Or your child theme. Are there any functions there related to request filtering or polylang?

gerdneuman commented 6 years ago

You're right. After some debugging I found this is indeed caused by the following filter in our child theme:

add_filter( 'pll_the_languages_args', 'fr_use_slug_for_desktop_lang_switcher' );
function fr_use_slug_for_desktop_lang_switcher( $args ) {
    // On mobile we want "English"; on desktop we want "EN".
    // This is done by configuring the desktop main menu with "Hide current language"
    // but leaving it disabled on the mobile menu, to tell the two menus apart.
    if ( !$args['hide_current'] ) {
        $args['hide_current'] = 1;
        $args['display_names_as'] = 'slug';
    }
    return $args;
}

I was surprised but this causes it.

(The reasoning why we use this filter is a bit complicated to explain: On Desktop we show the language switcher as EN or DE whereas on Mobile we have English or Deutsch. Both Mobile and Desktop have a different menu in our theme. As said it is complicated but it is working).

Well, maybe it is also just the $args['hide_current'] = 1; setting.

gerdneuman commented 6 years ago

Can confirm: Yes, it is just the $args['hide_current'] = 1; here.

I am not sure how to proceed here... Probably this is not an xml-sitemap-feed bug, but the above code should not be executed when the request is handled by xml-sitemap-feed. Not sure if there's a way to detect this?

gerdneuman commented 6 years ago

See https://polylang.pro/doc/function-reference/

pll_the_languages

Displays a language switcher.

[...]

‘hide_current’ => hides the current language if set to 1 (default: 0)

RavanH commented 6 years ago

Hmmm, there is the conditional is_sitemap() available.

You could do: if ( function_exists('is_sitemap') && is_sitemap() ) return $args; at the start of your filter there...
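A minimal sketch of how that guard could sit in the child theme filter quoted earlier in this thread. It assumes is_sitemap() is the conditional provided by the xml-sitemap-feed plugin, as mentioned above; the filter and function names are the ones from the earlier comment. The only other change is using empty() for the flag check, which is an assumption made here to avoid a notice when the key is not set.

```php
<?php
add_filter( 'pll_the_languages_args', 'fr_use_slug_for_desktop_lang_switcher' );
function fr_use_slug_for_desktop_lang_switcher( $args ) {
    // Leave the arguments untouched while a sitemap is being generated.
    // function_exists() keeps the theme working if the plugin is deactivated.
    if ( function_exists( 'is_sitemap' ) && is_sitemap() ) {
        return $args;
    }

    // Same logic as the original filter: a menu without "Hide current language"
    // gets the current language hidden and the names shown as slugs.
    if ( empty( $args['hide_current'] ) ) {
        $args['hide_current']     = 1;
        $args['display_names_as'] = 'slug';
    }
    return $args;
}
```

With this in place, the curl requests from the original report can be repeated to check whether both /de/ and /en/ now appear in sitemap-home.xml.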

RavanH commented 6 years ago

Or WordPress core conditional wp_is_mobile() could be used instead of using the hide_current argument as a flag for mobile use?
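A rough sketch of that alternative, assuming the desktop/mobile distinction may be made per request rather than per menu. wp_is_mobile() is a WordPress core function that inspects the request's user agent, so this version changes the switcher based on the visiting device and not on which menu is rendered, which may or may not suit the theme described above. The function name fr_lang_switcher_by_device is hypothetical. The is_sitemap() guard from the previous comment is kept, since according to the earlier comments it was setting hide_current during the sitemap request that dropped a start page.

```php
<?php
add_filter( 'pll_the_languages_args', 'fr_lang_switcher_by_device' );
// Hypothetical name; not part of the original theme code.
function fr_lang_switcher_by_device( $args ) {
    // Never alter the arguments while xml-sitemap-feed builds a sitemap.
    if ( function_exists( 'is_sitemap' ) && is_sitemap() ) {
        return $args;
    }

    // Desktop visitors get the compact switcher (EN / DE, current language hidden).
    // Mobile visitors keep Polylang's defaults (full language names).
    if ( ! wp_is_mobile() ) {
        $args['hide_current']     = 1;
        $args['display_names_as'] = 'slug';
    }
    return $args;
}
```

Because the decision depends on the user agent, a page cache that serves the same HTML to all devices could show the wrong variant; that is a general caveat of server-side device detection, not something reported in this thread.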

gerdneuman commented 6 years ago

The tip with using is_sitemap() works fine, thank you very much!