Open cameel opened 2 years ago
It works now. Maybe it's a lag in indexing?
I still see this happening. For example, searching for `solidity revert` I get this:
The upper result is from 0.8.13 while the lower one is from 0.8.15. I think we're seeing different results because Google still has all versions indexed, and depending on what you search for you can get hits both from versions that are disallowed in `robots.txt` and from ones that are allowed.
So the question is how to stop Google from returning results from blocked versions and return newer ones instead. I wonder if it's actually possible - we assumed that blocking versions in `robots.txt` would remove them from search completely, but maybe that's not how it works.
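A quick way to see the crawl-vs-index distinction is Python's stdlib `robots.txt` parser: a `Disallow` rule only answers "may this URL be fetched?", and says nothing about removing the URL from an index. The paths below are made up for illustration, not copied from the real file:

```python
from urllib import robotparser

# Hypothetical rules mirroring the situation described above:
# an old docs version is disallowed, the latest one is allowed.
rules = """
User-agent: *
Disallow: /en/v0.8.13/
Allow: /en/latest/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# The disallowed path may not be fetched, but nothing here tells a
# search engine to drop the URL itself from its index.
print(rp.can_fetch("*", "https://docs.soliditylang.org/en/v0.8.13/"))  # False
print(rp.can_fetch("*", "https://docs.soliditylang.org/en/latest/"))   # True
```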
`Disallow` will stop Google from crawling the contents of the page, not from indexing the URL itself. This is why the URLs for previous versions still appear but the meta description does not. Reference from Google.

To stop a URL from being indexed by Google, you must add `<meta name="robots" content="noindex">` to the relevant pages. Funnily enough, you will also have to remove the `Disallow` rule from the root robots.txt (i.e. this file) so that Google's crawler can actually read the tag and drop the page from the index. Reference from Google. I also noticed that there is a robots.txt in each directory; these will be ignored, as Google and other search-engine crawlers only read the one in the root directory.
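As a concrete sketch, the tag in question would need to land in the `<head>` of each old-version page; the surrounding markup here is just illustrative, not taken from the docs templates:

```html
<head>
  <!-- Tells compliant crawlers to drop this page from their index.
       Only works if robots.txt does NOT block crawling of this page,
       since the crawler must fetch the page to see the tag. -->
  <meta name="robots" content="noindex">
</head>
```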
Keep in mind that once the `noindex` tag is added, the page will not be removed from search results until it is crawled again. To expedite that, you can request indexing via Google's URL Inspection tool. URL Inspection tool.
Although the pages will be removed once Google's crawlers reach them again, that does not necessarily mean the remaining pages will end up in their desired rankings, so keep that in mind for SEO.
Hope that helps!
Thanks! That explains a lot.
Not sure if we'll be able to keep those old versions out of Google Search then, since it might not be feasible to rebuild them - especially if it would require changing code in already tagged releases.
In any case, pinging @r0qs since this is one of the topics we'll want him to take over eventually.
@chriseth reports that search hits in our docs look like this in Google:
Here's what Google's help says about this: No page information in search results.
I'm pretty sure this has something to do with the `robots.txt` changes we did some time ago (#10898). The search result seems to be from `develop`, which our `robots.txt` blocks. We only allow `latest`, `v0.7.6` and the latest release. The question is - why is `develop` still getting indexed (and appears in results before those allowed versions) if we blocked it?
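For reference, a setup like the one described would look roughly like this (the exact paths are assumptions for illustration, not copied from our actual `robots.txt`):

```text
# Hypothetical robots.txt allowing only selected versions.
# Google resolves Allow/Disallow conflicts by longest matching rule,
# so the Allow lines override the broader Disallow.
User-agent: *
Allow: /en/latest/
Allow: /en/v0.7.6/
Disallow: /en/
```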