Per Google's documentation (link), robots.txt is insufficient to exclude a page from being indexed:
> A page that's disallowed in robots.txt can still be indexed if linked to from other sites.
>
> While Google won't crawl or index the content blocked by a robots.txt file, we might still find and index a disallowed URL if it is linked from other places on the web. As a result, the URL address and, potentially, other publicly available information such as anchor text in links to the page can still appear in Google search results. To properly prevent your URL from appearing in Google search results, password-protect the files on your server, use the noindex meta tag or response header, or remove the page entirely.
Related to https://github.com/OpenLiberty/blogs/issues/2269
Support a new custom front matter attribute that will add

`<meta name="robots" content="noindex">`

to the page's HTML. The solution is based on https://developers.google.com/search/docs/advanced/crawling/block-indexing
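As a rough sketch of the templating logic, the site generator could check a boolean front matter attribute and emit the meta tag only when it is set. The attribute name `seo-noindex` below is hypothetical, not the final name:

```python
# Sketch only: "seo-noindex" is a hypothetical front matter attribute name.
def robots_meta(front_matter):
    """Return the noindex meta tag if the page opts out of indexing."""
    if front_matter.get("seo-noindex") is True:
        return '<meta name="robots" content="noindex">'
    return ""  # no tag: page remains indexable

# Page that opts out of indexing gets the tag; others are unchanged.
print(robots_meta({"title": "Internal draft", "seo-noindex": True}))
print(robots_meta({"title": "Public post"}))
```

Pages without the attribute (or with it set to anything but `true`) would render exactly as they do today, so the change is backward compatible.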