withastro / starlight

🌟 Build beautiful, accessible, high-performance documentation websites with Astro
https://starlight.astro.build
MIT License

sitemap no handler #2061

Closed huifer closed 3 months ago

huifer commented 3 months ago

What version of starlight are you using?

^0.24.4

What version of astro are you using?

^4.10.2

What package manager are you using?

pnpm

What operating system are you using?

Mac

What browser are you using?

Chrome

Describe the Bug

https://zen-huifer.pages.dev/sitemap-index.xml

There is no sitemap-index.xml.

image

Link to Minimal Reproducible Example

https://github.com/huifer/zen-huifer/blob/cf/astro.config.mjs

huifer commented 3 months ago

image

delucis commented 3 months ago

Hi @huifer! I’m not sure I understand the issue. When loading https://zen-huifer.pages.dev/sitemap-index.xml, I see the sitemap index as expected.

huifer commented 3 months ago

May I ask whether this sitemap can be indexed by Google?

delucis commented 3 months ago

It should work, yes! For example, https://docs.astro.build/ is built with Starlight and we successfully submitted the sitemap to Google.

huifer commented 3 months ago

image @delucis

delucis commented 3 months ago

It doesn’t appear to be a Starlight issue as far as I can tell:

  1. The sitemap is generated correctly.
  2. The sitemap is available at https://zen-huifer.pages.dev/sitemap-index.xml as expected.
  3. Google reports that the index was processed successfully.

The error seems to be related to crawling your site. It might be that Cloudflare is blocking the Google bot crawling your site?

Given this is not a Starlight-specific problem, I will close this issue. Feel free to keep chatting here if it’s helpful, but it might also be good to look to see if there’s advice about indexing Cloudflare-hosted sites online.
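The first two checks in the list above can be reproduced with a small script. This is only a sketch: `extractSitemapLocs` is a hypothetical helper, and the sample XML is an assumed illustration of what a sitemap index typically looks like, not a copy of the file served by the site.

```typescript
// Hypothetical helper: collects every <loc> URL from a sitemap or sitemap index.
function extractSitemapLocs(xml: string): string[] {
  const locs: string[] = [];
  const pattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    locs.push(match[1]);
  }
  return locs;
}

// Assumed sample content, for illustration only.
const sampleIndex = `<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://zen-huifer.pages.dev/sitemap-0.xml</loc></sitemap>
</sitemapindex>`;

const childSitemaps = extractSitemapLocs(sampleIndex);
console.log(childSitemaps);
```

To check a live site, the body of a `fetch` response for the sitemap index URL could be passed to the same function. An empty result would suggest the index is missing or malformed.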

huifer commented 3 months ago

@delucis Can the problem be solved with a robots.txt in Starlight?

delucis commented 3 months ago

@huifer Starlight doesn’t add a robots.txt itself, but you can do this yourself easily:

  1. Create public/robots.txt inside your project

  2. Add the following content:

    User-agent: *
    Allow: /
    
    Sitemap: https://zen-huifer.pages.dev/sitemap-index.xml
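If you would rather not hard-code the sitemap URL, the same content could be generated from the site origin. A minimal sketch, assuming a hypothetical helper named `buildRobotsTxt` (it is not part of Starlight or Astro):

```typescript
// Hypothetical helper: builds the robots.txt content shown above for a given site origin.
function buildRobotsTxt(siteUrl: string): string {
  // Resolve the sitemap index path against the site origin.
  const sitemapUrl = new URL("sitemap-index.xml", siteUrl);
  return [
    "User-agent: *",
    "Allow: /",
    "",
    `Sitemap: ${sitemapUrl.href}`,
    "",
  ].join("\n");
}

const robotsTxt = buildRobotsTxt("https://zen-huifer.pages.dev");
console.log(robotsTxt);
```

The output could be written to public/robots.txt by a build script, or returned from a custom endpoint. For a single fixed domain, the static file is simpler.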