rook / rook.github.io

Old version documentation should not appear on search engines #144

Closed galexrt closed 1 year ago

galexrt commented 1 year ago

E.g., old 1.8 release docs are still appearing in search engine results when searching for anything Rook related.

galexrt commented 1 year ago

One idea is to generate a robots.txt during the release docs build, then somehow symlink that latest-release robots.txt to the root of the website so that crawlers pick it up.

I'm not sure whether the symlinking part works, but we can test the robots.txt "theory" by manually adding a robots.txt now. @travisn, any objection to adding a robots.txt directly to the gh-pages branch to see whether it would even help with the issue?
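
To make the robots.txt idea concrete, here is a minimal sketch of what a release docs build step might generate. It is not the actual Rook build script: the `build_robots_txt` helper, the `/docs/rook/<version>/` path layout, and `v1.10` as the latest release are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(all_versions, latest, docs_root="/docs/rook"):
    """Return robots.txt text that disallows every docs version except `latest`.

    `docs_root` is an assumed path layout, not confirmed for rook.io.
    """
    lines = ["User-agent: *"]
    for version in all_versions:
        if version != latest:
            # Trailing slash so a rule for "v1.1" would not also match "v1.10".
            lines.append(f"Disallow: {docs_root}/{version}/")
    return "\n".join(lines) + "\n"


# Hypothetical version list; only 1.8 and 1.9 are named in this issue.
robots = build_robots_txt(["v1.8", "v1.9", "v1.10"], latest="v1.10")

# Sanity-check the generated rules with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots.splitlines())
old_allowed = parser.can_fetch("*", "https://rook.io/docs/rook/v1.8/index.html")
new_allowed = parser.can_fetch("*", "https://rook.io/docs/rook/v1.10/index.html")
```

The file only takes effect when served from the site root. A `Disallow` rule blocks crawling, so URLs that are already indexed may still show up in results for a while.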

travisn commented 1 year ago

Sounds great to add the robots.txt directly to the branch and test this out for now.

jbw976 commented 1 year ago

We dealt with this in Crossplane as well. Google was very slow and unreliable at updating its index, even when older docs were explicitly deleted from the site entirely.

See some discussion here: https://github.com/crossplane/docs/issues/107#issuecomment-990338800

Now, as part of our release process, we use Google Search Console to request a removal for the docs version that is now stale (unsupported) according to our release policies.

For example, on https://search.google.com/search-console/removals we add a removal with "Starts with" set to https://crossplane.io/docs/v1.7.
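
As a rough sketch of the bookkeeping behind that release step, the snippet below computes which "Starts with" prefixes to enter in Search Console. The `stale_prefixes` helper, the version list, and the "newest N releases are supported" rule are all hypothetical stand-ins for the real release policy. The removal itself is still submitted by hand in the Search Console UI.

```python
def stale_prefixes(versions, supported, base="https://crossplane.io/docs"):
    """Return URL prefixes for docs versions outside the newest `supported` releases.

    The "newest N are supported" rule is an assumption, not the actual policy.
    """
    def version_key(version):
        # "v1.10" -> (1, 10), so v1.10 sorts after v1.9.
        return tuple(int(part) for part in version.lstrip("v").split("."))

    ordered = sorted(versions, key=version_key)
    stale = ordered[:-supported] if supported > 0 else ordered
    return [f"{base}/{version}" for version in stale]


# Hypothetical version list for illustration.
prefixes = stale_prefixes(["v1.7", "v1.8", "v1.9", "v1.10"], supported=3)
```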

galexrt commented 1 year ago

@jbw976 @travisn I don't have access to the Google Search Console for rook.io. Does either of you have access to it?

travisn commented 1 year ago

OK, I found access to the Google Search Console for rook.io and submitted a request to remove the 1.8 and 1.9 docs from the index. The request is still in progress, so let's check back soon to see whether search is finding the newer docs.

travisn commented 1 year ago

@rathpc Google search is also finding your test publish site for the rook docs. How about taking it down? For example, one of the top test search results was here: https://rathpc.github.io/rook.github.io/docs/rook/v1.4/k8s-pre-reqs.html

rathpc commented 1 year ago

@travisn removed 👍

galexrt commented 1 year ago

@travisn how about closing this issue for now as we have a workaround?