iamvishnusankar / next-sitemap

Sitemap generator for next.js. Generate sitemap(s) and robots.txt for all static/pre-rendered/dynamic/server-side pages.
https://next-sitemap.iamvishnusankar.com
MIT License

href values from the alternateRefs array are not used; the original URL path is used for all hreflang entries instead #796

Closed markojak closed 3 weeks ago

markojak commented 5 months ago

Describe the bug

When generating a sitemap for a Next.js application with multiple languages, next-sitemap generates incorrect hreflang entries in the sitemap XML. There seems to be a limitation or bug in how it handles the alternateRefs array: the package does not use the href values from the alternateRefs array and instead uses the original URL path for all hreflang entries.

To Reproduce

(code below)

Expected behavior

In the generated sitemap XML, the xhtml:link elements for alternate language URLs should have the correct hreflang and href attributes. The href attribute should contain the appropriate URL for each alternate language, matching the hreflang value.

For example, for a URL like https://example.com/fr/blog, the expected xhtml:link elements should be:

<xhtml:link rel="alternate" hreflang="en" href="https://example.com/en/blog"/>
<xhtml:link rel="alternate" hreflang="es" href="https://example.com/es/blog"/>
<xhtml:link rel="alternate" hreflang="de" href="https://example.com/de/blog"/>

Actual behavior

In the generated sitemap XML, the xhtml:link elements for alternate language URLs have incorrect href attributes. The href attribute contains the same URL as the main URL, instead of the corresponding alternate language URL.

For example, for a URL like https://example.com/fr/blog, the actual xhtml:link elements in the sitemap XML are:

<xhtml:link rel="alternate" hreflang="en" href="https://example.com/fr/blog"/>
<xhtml:link rel="alternate" hreflang="es" href="https://example.com/fr/blog"/>
<xhtml:link rel="alternate" hreflang="de" href="https://example.com/fr/blog"/>
<xhtml:link rel="alternate" hreflang="fr" href="https://example.com/fr/blog"/>

Additional context

Next.js version: 13.4.10-canary.1. next-sitemap version: latest.

Implementation

require('ts-node').register();
const { SUPPORTED_LOCALES } = require('./src/constants.ts');

const generateAlternateRefs = (path, locales) => {
    console.log('Path:', path);
    console.log('Locales:', locales);

    const alternateRefs = locales.map((locale) => {
        const localePath = path.replace(new RegExp(`^/(${locales.join('|')})`), `/${locale}`);
        return {
            hreflang: locale,
            href: `https://example.com${localePath}`,
        };
    });
    return alternateRefs;
};

/** @type {import('next-sitemap').IConfig} */
module.exports = {
    siteUrl: 'https://example.com',
    changefreq: 'daily',
    priority: 0.7,
    sitemapSize: 5000,
    generateRobotsTxt: true,
    transform: async (config, path) => {
        console.log('Transform Path:', path);

        const alternateRefs = generateAlternateRefs(path, Object.keys(SUPPORTED_LOCALES));
        console.log('Generated Alternate Refs:', alternateRefs);

        return {
            loc: `${config.siteUrl}${path}`,
            changefreq: config.changefreq,
            priority: config.priority,
            lastmod: config.autoLastmod ? new Date().toISOString() : undefined,
            alternateRefs,
        };
    },
    robotsTxtOptions: {
        policies: [
            {
                userAgent: '*',
                allow: '/',
            },
        ],
    },
};

luukkynda commented 5 months ago

I haven't tested this yet, but since hreflang is set correctly, have you considered that your regex might be incorrect or return something different than you'd expect? Since this is a custom transformation, I would assume that something is going wrong in the transformation itself and that it doesn't have much to do with the package.
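One way to check this suspicion is to run the helper on its own, outside next-sitemap. The following is a minimal sketch, assuming a hypothetical locale list (the original reads it from SUPPORTED_LOCALES in ./src/constants.ts, which is not shown in this thread):

```javascript
// Assumed locale list, for illustration only; the real one comes from SUPPORTED_LOCALES.
const locales = ['en', 'es', 'de', 'fr'];

// Same logic as the helper in the report, without the logging.
const generateAlternateRefs = (path, locales) =>
    locales.map((locale) => {
        // Swap a leading locale segment (e.g. "/fr") for the target locale.
        const localePath = path.replace(new RegExp(`^/(${locales.join('|')})`), `/${locale}`);
        return {
            hreflang: locale,
            href: `https://example.com${localePath}`,
        };
    });

const refs = generateAlternateRefs('/fr/blog', locales);
console.log(refs.map((ref) => ref.href));
```

If the printed hrefs already differ per locale, the regex is doing its job and the values are being changed later, during sitemap generation.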

arno-fukuda commented 4 months ago

Same issue here.

alternateRefs returned from transform doesn't work: every href ends up as the same URL as loc.

lt7 commented 3 months ago

+1 seeing the same, details below

Incorrect alternateRefs for Locale-Specific Pages with next-i18next

Description:

When using next-sitemap with next-i18next for internationalization, the alternateRefs in the sitemap are not generated correctly for locale-specific pages. The href values for the alternate language versions remain the same as the current page's URL, even when using the transform function to modify the paths.

Example:

next-i18next.config.js:

const { resolve } = require('path');

const DEFAULT_LOCALE = process.env.DEFAULT_LOCALE || 'en';

const config = {
  i18n: {
    defaultLocale: DEFAULT_LOCALE,
    locales: [DEFAULT_LOCALE, 'es', 'fr'], // Simplified locales for example
  },
  // ... other i18n configurations
};

module.exports = config;

next-sitemap.config.js:

const siteUrl = process.env.NEXT_PUBLIC_SITE_URL;
const { i18n } = require('./next-i18next.config'); 

module.exports = {
  siteUrl,
  generateRobotsTxt: true,
  i18n,
  transform: async (config, path) => {
    // Simplified transform function for example
    const loc = `${config.siteUrl}${path}`;
    const alternateRefs = [];
    config.i18n.locales.forEach(locale => {
      const href = locale === config.i18n.defaultLocale
        ? `${config.siteUrl}${path}` 
        : `${config.siteUrl}/${locale}${path.substring(3)}`; 

      alternateRefs.push({
        href: href,
        hreflang: locale,
      });
    });
    return {
      loc,
      lastmod: new Date().toISOString(),
      alternateRefs,
    };
  },
};

Expected Output:

<url>
  <loc>https://myapp.firebaseapp.com/es/about</loc>
  <xhtml:link rel="alternate" hreflang="en" href="https://myapp.firebaseapp.com/about"/>
  <xhtml:link rel="alternate" hreflang="es" href="https://myapp.firebaseapp.com/es/about"/>
  <xhtml:link rel="alternate" hreflang="fr" href="https://myapp.firebaseapp.com/fr/about"/>
</url>

Actual Output

<url>
  <loc>https://myapp.firebaseapp.com/es/about</loc>
  <xhtml:link rel="alternate" hreflang="en" href="https://myapp.firebaseapp.com/es/about"/>
  <xhtml:link rel="alternate" hreflang="es" href="https://myapp.firebaseapp.com/es/about"/>
  <xhtml:link rel="alternate" hreflang="fr" href="https://myapp.firebaseapp.com/es/about"/> 
</url>

As you can see, the href values for the alternate language versions are incorrect. They all point to the Spanish version of the page (/es/about) instead of the corresponding English and French versions. I have 20+ languages, so it's a bit of a problem.

Steps Taken:

Environment:

arno-fukuda commented 3 months ago

@lt7, it was difficult to spot in their documentation, but I was able to get it working with the following:

...
   return {
      loc: `${config.siteUrl}${url}`,
      lastmod: new Date().toISOString(),
      changefreq: "daily",
      priority: 0.7,
      alternateRefs: config.alternateRefs ?? [
        {
          href: `${config.siteUrl}${url}`,
          hreflang: "en",
          hrefIsAbsolute: true,
        },
        {
          href: `${config.siteUrl}/ja-JP${url}`,
          hreflang: "ja",
          hrefIsAbsolute: true,
        },
      ],
    }
  },
...

hrefIsAbsolute: true did the trick.

lt7 commented 3 months ago

@arno-fukuda I think you meant invisible, not difficult, but thank you very much, that seems to have helped!

Abdullah-J01 commented 3 months ago

In my case, I am using domain routing, so my en-PK URL looks like this: https://example.pk/about and my en-AE URL looks like this: https://example.com/about

I am having a similar issue: the base URL of loc is copied to both hreflang entries, i.e. https://example.pk/about

@arno-fukuda - hrefIsAbsolute did not work in this case

Any idea?

arno-fukuda commented 3 months ago

@Abdullah-J01, in the return statement, don't use config.siteUrl; try a custom variable instead.

module.exports = {
  siteUrl: "https://example.com",

  transform: async (config, path) => {
   ...
    const siteUrls = {
      'en': 'https://example.com',
      'pk': 'https://example.pk',
    };
   ...

    return {
      loc: `${siteUrl}${url}`,
      lastmod: new Date().toISOString(),
      changefreq: 'daily',
      priority: 0.7,
      alternateRefs: [
        {
          href: `${siteUrls['en']}${url}`,
          hreflang: 'en',
          hrefIsAbsolute: true,
        },
        {
          href: `${siteUrls['pk']}${url}`,
          hreflang: 'pk',
          hrefIsAbsolute: true,
        },
      ],
    };

...

Does it work?
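With domain routing, the per-domain lookup suggested above could also be written as a small helper. This is a rough sketch only: DOMAIN_BY_LOCALE and domainAlternateRefs are hypothetical names, the locale-to-domain pairs are taken from the URLs mentioned earlier in this thread, and it has not been tested against next-sitemap.

```javascript
// Hypothetical locale-to-domain map, based on the URLs mentioned earlier in this thread.
const DOMAIN_BY_LOCALE = {
  'en-AE': 'https://example.com',
  'en-PK': 'https://example.pk',
};

// Build one absolute alternate ref per domain for a given path.
const domainAlternateRefs = (path) =>
  Object.entries(DOMAIN_BY_LOCALE).map(([hreflang, origin]) => ({
    hreflang,
    href: `${origin}${path}`,
    hrefIsAbsolute: true,
  }));

console.log(domainAlternateRefs('/about'));
```

The returned array can be passed as alternateRefs in the object returned from transform.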

github-actions[bot] commented 1 month ago

Closing this issue due to inactivity.