Shopify / hydrogen-v1

React-based framework for building dynamic, Shopify-powered custom storefronts.
https://shopify.github.io/hydrogen-v1/
MIT License
3.75k stars 327 forks source link

[BUG] Sitemap doesn't work in Oxygen #2167

Open beppek opened 2 years ago

beppek commented 2 years ago

Describe the bug Sitemaps work fine when viewed locally in dev and preview mode. However, once deployed to Oxygen it returns a 404. We noticed this on our Oxygen deployments (private) and thought we should compare with other sites we know to be on Hydrogen+Oxygen.

https://skims.com/sitemap.xml <- 404 https://shopify.supply/sitemap.xml <- 404 https://denimtears.com/sitemap.xml <- All URLs point to the checkout subdomain i.e. https://denim-tears.myshopify.com (I assume they're using the sitemap auto generated by Shopify which means the tip to use that one in the docs is no good)

To Reproduce

  1. Deploy to Oxygen
  2. Visit yourdeployment.com/sitemap.xml

Expected behaviour Sitemaps should be generated

Additional context Add any other context about the problem here. eg.

frandiox commented 2 years ago

Hi! Thanks for reporting this. It seems the sitemap works in our demo-store so it might not be related to Oxygen but something in old versions of the demo-store. I've checked shopify.supply and the issue was indeed related to using an old version of sitemap.xml.server.ts, which is probably what's happening in other projects too.

I assume you are testing with the latest version in your project but we just discovered a bug in that file that shows up in certain conditions. Can you try this fix and see if it works for you? It's weird that it works for you in development, though... maybe it's an issue different from shopify.supply and the others 🤔

beppek commented 2 years ago

Hi @frandiox thanks, but it didn't work. We're still getting a 404. Interestingly the server logs say it's a 200 and no other information in the logs.

beppek commented 2 years ago

We haven't made any other changes to the sitemap.

frandiox commented 2 years ago

@beppek It turns out Oxygen does not serve /sitemap.xml and /robots.txt for public preview deployments to avoid indexing issues.

It should work for the production deployment (when using a custom domain) and for private deployments as well. Can you check that? We should definitely document this.

beppek commented 2 years ago

Ok, I see. Our production deployment isn't using a custom domain yet and we had to take it public so you could investigate our slow response times. Taking it private the sitemap doesn't return a 404 anymore, instead it crashes with the following error: error on line 1392 at column 34: xmlParseEntityRef: no name, which is strange because it doesn't crash when running local preview. Same code, same data.

No further info in the server logs.

blittle commented 2 years ago

@beppek Does it render at all when running locally in dev mode? What about running a deploy preview locally?

beppek commented 2 years ago

@blittle yes, it works fine in both dev and preview mode locally.

beppek commented 2 years ago

Ah no, sorry, I had env variables for a different store running. It works in preview mode locally, but not in dev mode locally.

mhuretski commented 1 year ago

error on line 1392 at column 34: xmlParseEntityRef: no name I had the same error, the solution was to replace & with &amp; while sitemap was generated

function escapeXml(unsafe?: string) {
    return unsafe?.replace(/[<>&'"]/g, function (c) {
        switch (c) {
            case '<':
                return '&lt;'
            case '>':
                return '&gt;'
            case '&':
                return '&amp;'
            case "'":
                return '&apos;'
            case '"':
                return '&quot;'
            default:
                return ''
        }
    })
}
function renderUrlTag({
    url,
    lastMod,
    changeFreq,
    image,
}: {
    url: string
    lastMod?: string
    changeFreq?: string
    image?: {
        url: string
        title?: string
        caption?: string
    }
}) {
    return `
    <url>
      <loc>${escapeXml(url)}</loc>
      <lastmod>${escapeXml(lastMod)}</lastmod>
      <changefreq>${escapeXml(changeFreq)}</changefreq>
      ${
          image
              ? `
        <image:image>
          <image:loc>${escapeXml(image.url)}</image:loc>
          <image:title>${escapeXml(image.title) ?? ''}</image:title>
          <image:caption>${escapeXml(image.caption) ?? ''}</image:caption>
        </image:image>`
              : ''
      }

    </url>
  `
}
benjaminsehl commented 1 year ago

@macpham @timlombardo - we should reconsider blocking (without allowed override) robots/sitemap for preview deployments, or otherwise consider how we might make some of this stuff transparent.