Closed zlwaterfield closed 5 months ago
Why is this not addressed :/ I have tons of 404 urls because of it in my search console 😓
Google won't pick up the initialCanonicalUrl in the HTML response for SEO; that value is only for internal state. The canonical URL should be configured through the Metadata API via alternates.canonical (https://nextjs.org/docs/app/api-reference/functions/generate-metadata); then Google can pick it up properly.
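As a rough illustration of the alternates.canonical suggestion, here is a minimal sketch. The origin `https://example.com`, the base path `/path`, the helper name `buildCanonicalMetadata`, and the local `CanonicalMetadata` type are all hypothetical; in a real app the object would be typed with `Metadata` from `next` and exported as `metadata` (or returned from `generateMetadata`). The sketch spells out the base path explicitly, since this thread does not confirm whether Next.js adds it for you.

```typescript
// Local structural type so the sketch stands alone. In a Next.js app you
// would use `import type { Metadata } from 'next'` instead (assumption:
// only the alternates.canonical field is of interest here).
type CanonicalMetadata = {
  alternates: { canonical: string };
};

// Hypothetical values for illustration only.
const SITE_ORIGIN = 'https://example.com';
const BASE_PATH = '/path'; // assumed to mirror basePath in next.config.js

// Builds an absolute canonical URL that includes the base path.
function buildCanonicalMetadata(pathname: string): CanonicalMetadata {
  const normalized = pathname.startsWith('/') ? pathname : `/${pathname}`;
  const canonical = new URL(`${BASE_PATH}${normalized}`, SITE_ORIGIN).toString();
  return { alternates: { canonical } };
}

// In app/about/page.tsx this would be roughly:
//   export const metadata = buildCanonicalMetadata('/about');
const aboutMetadata = buildCanonicalMetadata('/about');
```

Using an absolute URL keeps the result independent of how relative canonicals are resolved.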
@huozhi can you stop closing issues? Everyone has the same issue: the crawler is picking up everything inside self.__next_f, and I have so many 404 URLs. For reference: #53274 #40143 #41433
@c0b41 If the assumption is that the Google crawler reads that content and parses it as the canonical URL, I'd expect a much wider impact. It could also be Search Console having issues with a specific app. There are only screenshots in #40143, with nothing available to investigate.
I would be happy to set up a Google Meet and show you in my own Search Console how thousands of URLs are considered 404 by Google because of initialCanonicalUrl. Moreover, in another project I had to go over the 40k static pages I have and add a script to modify this variable so that Google won't complain about it 🤷‍♂️
This shouldn't be closed, 404s and 308, Google is picking up initialCanonicalUrl
I wonder if it's related to this fix (#67135): when you have a static not-found page that is missing noindex, Google still indexes it when it should actually be ignored.
@huozhi To be honest, I don't think so. My site is not statically generated, and I see a noindex tag within the 404 pages. My site's version does not include the #67135 fix yet. I think it's much simpler than that: Google simply inspects the content of the page (just like a simple view source), recognizes variables that match the pattern of links (e.g. values containing slashes), and treats those as "links" coming from the page. That's my theory. I think so because whenever I have a page in a folder like [...paths], the paths variable, which is also embedded in the inline content of the page, is also considered a link by Google. See #40143.
This closed issue has been automatically locked because it had no new activity for 2 weeks. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you.
Link to the code that reproduces this issue
https://github.com/zlwaterfield/initial-canonical-url-bug
To Reproduce
Look for self.__next_f.push(.... and, within it, initialCanonicalUrl; it will be missing the base path (path).
Also see https://github.com/vercel/next.js/issues/53274 for more information.
Current vs. Expected behavior
The initialCanonicalUrl should have the basePath included in it.
Verify canary release
Provide environment information
Which area(s) are affected? (Select all that apply)
App Router, Metadata (metadata, generateMetadata, next/head), Script optimization (next/script)
Additional context
This is causing issues with SEO because crawlers see the URL and think it's a valid URL. From what I gather, there is currently no way to properly set it. We are getting 404s from this in the Google Search Console.
My only idea to fix it right now is to rewrite the URL in our Cloudflare worker until a fix is shipped in Next.js
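A rough sketch of what that rewrite could look like as a pure string transform. The function name `prefixInitialCanonicalUrl` is hypothetical, and the regex assumes the value appears in the HTML as "initialCanonicalUrl":"/..." with either plain or backslash-escaped quotes; the real payload format is not confirmed here and may differ between Next.js versions.

```typescript
// Prefixes the initialCanonicalUrl value with basePath wherever it appears.
// Assumption: the key/value pair is serialized with plain quotes or with
// backslash-escaped quotes (as inside a self.__next_f.push string).
function prefixInitialCanonicalUrl(html: string, basePath: string): string {
  const pattern = /(\\?"initialCanonicalUrl\\?":\\?")(\/[^"\\]*)/g;
  return html.replace(pattern, (match: string, key: string, path: string) => {
    // Leave values alone if they already start with the base path.
    const alreadyPrefixed = path === basePath || path.startsWith(basePath + '/');
    return alreadyPrefixed ? match : key + basePath + path;
  });
}

// Example with escaped quotes, as they would appear inside a JS string literal.
const before = String.raw`self.__next_f.push([1,"{\"initialCanonicalUrl\":\"/about\"}"])`;
const after = prefixInitialCanonicalUrl(before, '/path');
```

In a Cloudflare Worker this would be applied to the text of HTML responses before returning them; it is a stopgap and would need to be checked against the actual page output.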