harlan-zw / nuxt-seo

The complete SEO solution for Nuxt.
https://nuxtseo.com

Enable crawl via robots but prevent site indexing via noindex #215

Open ralph-burstsms opened 2 months ago

ralph-burstsms commented 2 months ago

Details

Google's documentation says that for noindex to work properly, the page must not be blocked via robots.txt; if it is blocked, the crawler never sees the noindex directive. It looks like the robots plugin doesn't work this way.

See important note here: https://developers.google.com/search/docs/crawling-indexing/block-indexing

Currently, I'm only using:

  site: { indexable: process.env.NUXT_SITE_ENV === "production" }, // evaluates to false in this environment

But it renders both the robots.txt

# START nuxt-simple-robots (indexing disabled)
User-agent: *
Disallow: /

# END nuxt-simple-robots

and the noindex tags

<meta name="robots" content="noindex, nofollow">
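What I was expecting, based on the Google note, is roughly this combination instead (a sketch of the desired output, not something the module produces today): a robots.txt that leaves crawling open, paired with the noindex meta tag.

User-agent: *
Disallow:

<meta name="robots" content="noindex, nofollow">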

Am I doing this correctly, or did I miss a configuration?

harlan-zw commented 2 months ago

Hmm, it sounds like you're trying to fix a page that has already been indexed and get it removed from the index? I'd suggest using the removal tools for that.

I think changing the robots.txt output is a bit risky for sites that don't have this issue. I could add an escape-hatch config if it helps; otherwise, please share some ideas.
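Something like this, perhaps (a purely hypothetical sketch; the robots.disallowNonIndexable option below doesn't exist and is only meant to illustrate what an escape hatch could look like):

// nuxt.config.ts (hypothetical sketch, option name is illustrative only)
export default defineNuxtConfig({
  site: { indexable: false },
  robots: {
    // hypothetical option: keep robots.txt open (no "Disallow: /")
    // and rely on the noindex meta tag / X-Robots-Tag alone
    disallowNonIndexable: false
  }
})

That way a non-indexable site could stay crawlable while still sending noindex everywhere.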

Btw, you shouldn't need the nuxt.config site config you shared; that's the default behaviour.