Open danfox01 opened 6 years ago
@danfox01 I did some research and uncovered two possible strategies:
1. As recently as 2017, Google CSE had a documented option that dealt with this issue:
If your pages have regions containing boilerplate content that's not relevant to the main content of the page, you can identify it using the nocontent class attribute. When Google Custom Search sees this tag, we'll ignore any keywords it contains and won't take them into account when calculating ranking for your Custom Search engine. (We'll still follow and crawl any links contained in the text marked nocontent.) To use the nocontent class attribute, include the boilerplate content in a tag (for example, span or div) like this:
<div class="nocontent"> <!-- The area to exclude --> </div>
To activate this option, we can try adding this to our cse.xml: enable_nocontent_tag="true". However, because this option is no longer documented in CSE support, it may no longer work.
2. I came across advice suggesting that CSE returns fewer, more accurate results if we put quotes around individual query terms. This technique seems to work for "meeting room": compare a BCPL site search for "meeting room" with one for "meeting" "room". It doesn't seem to work for "digital movies", though. If additional testing shows that quoting individual query terms generally improves the accuracy of search results, we could consider post-processing search queries to add quotes around all terms in BCPL site searches.
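For reference, the query post-processing in option 2 could look roughly like the sketch below. This is only an illustration: the function name quote_terms is hypothetical, it assumes terms are separated by whitespace, and it drops any quotes the user typed so that phrases are split into individually quoted terms. Where this would hook into the BCPL search form is not covered here.

```python
def quote_terms(query: str) -> str:
    """Wrap each whitespace-separated term of a search query in double quotes.

    Hypothetical helper for illustration; not part of any CSE API.
    """
    # Replace quotes the user typed with spaces so a quoted phrase
    # is split into separate terms and nothing ends up double-quoted.
    terms = query.replace('"', " ").split()
    # Re-join the terms, each wrapped in its own pair of quotes.
    return " ".join(f'"{term}"' for term in terms)


if __name__ == "__main__":
    print(quote_terms("meeting room"))
```

Whether we want to override quotes the user typed deliberately (phrase searches) is a design choice we would need to settle during testing.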
As reported by @jdomasky:
I have looked into this a bit and haven't found a legitimate way to prevent Google from indexing the nav. It's not a huge problem because Google still prioritizes the more relevant results over the ones with just a nav hit, but if there's anything that can be done, I'd like to know.