nuxt-modules / sitemap

Powerfully flexible XML Sitemaps that integrate seamlessly, for Nuxt.
https://nuxtseo.com/sitemap
MIT License

help: i18n sitemap with more than 50000 urls #372

Closed: chichi13 closed this issue 1 month ago

chichi13 commented 1 month ago

📚 What are you trying to do?

I have an i18n website with a lot of URLs, and I'm about to pass the 50,000 URL mark. As you may know, Google limits each sitemap to 50,000 URLs, so I'd like to know how I can split my sitemap into several files. I looked at the documentation but couldn't get it to work. Here's the configuration I currently have:

sitemap: {
  cacheMaxAgeSeconds: 3600,
  gzip: true,
  exclude: ["/admin/**", "/auth/**"],
  urls: async () => {
    const baseUrl = process.env.API_BASE_URL || "http://localhost:8000";
    const languages = ["", "fr", "es"]; // Empty string for default locale (no prefix)
    const sources = [
      `${baseUrl}/api/v1/sitemap/events`,
      `${baseUrl}/api/v1/sitemap/streamers`,
      `${baseUrl}/api/v1/sitemap/games`,
    ];

    const fetchUrls = async (source: string) => {
      const response = await fetch(source);
      const urls = await response.json();
      return urls.flatMap((url) =>
        languages.map((lang) => ({
          loc: lang ? `/${lang}${url.loc}` : url.loc,
          lastmod: url.lastmod,
          priority: url.priority,
          image: url.image
            ? [
                {
                  loc: url.image.loc,
                  title: url.image.title,
                  caption: url.image.caption,
                },
              ]
            : undefined,
        }))
      );
    };

    const allUrls = await Promise.all(sources.map(fetchUrls));
    return allUrls.flat();
  },
},

My URLs are the same across languages: the frontend serves each one under the /fr and /es prefixes, and under / (no prefix) for English, the default locale.

Currently my sitemap looks like this:

http://localhost:3000/__sitemap__/en-US.xml
http://localhost:3000/__sitemap__/es-ES.xml
http://localhost:3000/__sitemap__/fr-FR.xml

How can I get this kind of sitemap:

http://localhost:3000/__sitemap__/en-US.xml
http://localhost:3000/__sitemap__/en-US-2.xml
http://localhost:3000/__sitemap__/es-ES.xml
http://localhost:3000/__sitemap__/es-ES-2.xml
http://localhost:3000/__sitemap__/fr-FR.xml
http://localhost:3000/__sitemap__/fr-FR-2.xml

Or another solution?

🔍 What have you tried?

I've tried sitemaps: true together with defaultSitemapsChunkSize. I also tried chunking manually, but I couldn't get the result I wanted.
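To make "manual chunking" concrete, here is a minimal sketch of the idea: split one locale's URL list into groups of at most 50,000 and name each group the way the desired output above does (en-US, en-US-2, ...). The names `chunk` and `chunkName` are hypothetical helpers for illustration, not part of the sitemap module, and this only shows the splitting step, not how to wire the result into the module.

```typescript
// Hypothetical helpers for illustration; not part of @nuxtjs/sitemap.
const MAX_URLS_PER_SITEMAP = 50000;

// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) {
    throw new RangeError("size must be positive");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Name the first chunk after the locale, later ones with a numeric suffix:
// index 0 -> "en-US", index 1 -> "en-US-2", index 2 -> "en-US-3".
function chunkName(locale: string, index: number): string {
  return index === 0 ? locale : `${locale}-${index + 1}`;
}

// Example: 120,000 placeholder URLs for one locale end up in 3 sitemaps.
const exampleUrls = Array.from({ length: 120000 }, (_, i) => `/games/${i}`);
const exampleChunks = chunk(exampleUrls, MAX_URLS_PER_SITEMAP);
const exampleNames = exampleChunks.map((_, i) => chunkName("en-US", i));
console.log(exampleNames); // [ 'en-US', 'en-US-2', 'en-US-3' ]
```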

ℹ️ Additional context

My backend is written in FastAPI (Python), and I can of course change its code if needed.

rayblair06 commented 1 month ago

Unfortunately, I don't think chunking is currently supported when combined with custom urls or sources.

https://github.com/nuxt-modules/sitemap/issues/265

chichi13 commented 1 month ago

Okay, that's what I thought.

I ended up doing it like this:

  sitemap: {
    cacheMaxAgeSeconds: 3600,
    gzip: true,
    exclude: ["/admin/**", "/auth/**"],
    autoLastmod: true,
    sitemaps: {
      events: {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/events`,
        ],
      },
      streamers: {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/streamers`,
        ],
      },
      games: {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/games`,
        ],
      },
      "games-2": {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/games?page=2`,
        ],
      },
      "games-3": {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/games?page=3`,
        ],
      },
      "games-4": {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/games?page=4`,
        ],
      },
      "games-5": {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/games?page=5`,
        ],
      },
      "events-groups": {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/events/groups`,
        ],
      },
      pages: {
        includeAppSources: true,
      },
    },
  },

I'm not a big fan of this because it's not dynamic. So if someone has a dynamic solution, I'll take it :D
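One way to cut the repetition, sketched under assumptions: build the paged entries with a small helper instead of writing each one by hand. `pagedSitemaps` is a hypothetical function, not part of the sitemap module, and the page count still has to be known when the config is evaluated, so this is only less verbose, not truly dynamic. It follows the same naming as above (games, games-2, ...) and the same ?page=N query parameter.

```typescript
// Hypothetical helper for illustration; not part of @nuxtjs/sitemap.
type SitemapEntry = { sources: string[] };

// Build one named sitemap entry per backend page.
// Page 1 keeps the bare name and URL; later pages get "-N" and "?page=N".
function pagedSitemaps(
  baseUrl: string,
  name: string,
  path: string,
  pageCount: number
): Record<string, SitemapEntry> {
  const entries: Record<string, SitemapEntry> = {};
  for (let page = 1; page <= pageCount; page++) {
    const key = page === 1 ? name : `${name}-${page}`;
    const query = page === 1 ? "" : `?page=${page}`;
    entries[key] = { sources: [`${baseUrl}${path}${query}`] };
  }
  return entries;
}

// Assumption: the number of pages (5 here) is known at config time.
const apiBaseUrl = process.env.API_BASE_URL || "http://localhost:8000";

const sitemapsConfig = {
  events: { sources: [`${apiBaseUrl}/api/v1/sitemap/events`] },
  streamers: { sources: [`${apiBaseUrl}/api/v1/sitemap/streamers`] },
  ...pagedSitemaps(apiBaseUrl, "games", "/api/v1/sitemap/games", 5),
  "events-groups": { sources: [`${apiBaseUrl}/api/v1/sitemap/events/groups`] },
  pages: { includeAppSources: true },
};

console.log(Object.keys(sitemapsConfig));
```

The resulting object would be passed as the sitemaps option in place of the hand-written one above.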

harlan-zw commented 1 month ago

Glad you could find a workaround. You will need to wait for official support of https://github.com/nuxt-modules/sitemap/issues/265.

Will track in that issue.