mjpost closed this 2 years ago
The Anthology version is not in main Google either for me: https://www.google.com/search?q=BERT+Has+Uncommon+Sense%3A+Similarity+Ranking+for+Word+Sense+BERTology
It is correctly being shown in our sitemap, though:
$ curl -s https://aclanthology.org/sitemap_1.xml.gz | gunzip - | grep "2021.blackboxnlp-1.43"
<loc>https://aclanthology.org/2021.blackboxnlp-1.43/</loc>
I can only conclude that Google simply hasn't re-indexed us since this was added.
The Google Search Console lists it as “submitted, not in sitemap”, with a last sitemap crawl date of November 11. That’s for the sitemap index. I added all the individual sitemaps to see if that helps. I think it reread the index but not the individual pieces (before #1658, there were only 3 pieces; now there are 5).
I think we need an SEO volunteer…
Well, it's in the very first piece, and it also seems to be in https://aclanthology.org/sitemap.xml, so I don't know what would be going on here.
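For anyone who wants to repeat the check across all pieces rather than just `sitemap_1.xml.gz`, here is a rough Python sketch. It assumes `https://aclanthology.org/sitemap.xml` is a standard sitemap index whose `<loc>` entries point at the (possibly gzipped) pieces; the function names are made up for illustration and are not part of the Anthology codebase.

```python
import gzip
import urllib.request
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(xml_bytes):
    """Return the text of every <loc> element in a sitemap or sitemap index."""
    if xml_bytes[:2] == b"\x1f\x8b":  # gzip magic number
        xml_bytes = gzip.decompress(xml_bytes)
    root = ET.fromstring(xml_bytes)
    return [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]


def find_in_pieces(pieces, needle):
    """Given {piece name: raw bytes}, return the names of pieces listing `needle`."""
    return [
        name
        for name, data in pieces.items()
        if any(needle in loc for loc in extract_locs(data))
    ]


def find_in_index(index_url, needle):
    """Fetch the index and every piece it lists (network access required)."""
    def fetch(url):
        with urllib.request.urlopen(url) as resp:
            return resp.read()

    pieces = {url: fetch(url) for url in extract_locs(fetch(index_url))}
    return find_in_pieces(pieces, needle)
```

Calling `find_in_index("https://aclanthology.org/sitemap.xml", "2021.blackboxnlp-1.43")` should then list whichever pieces contain the paper, if the assumption about the index layout holds.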
But yeah, that's just one of several reasons I'd like to get rid of GCSE for our internal search engine ASAP. Being indexed correctly and promptly in Google proper will still remain important, of course ...
Note that it is in Google (add site:aclanthology.org to the query), just not really being surfaced.
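To compare the plain and site-restricted searches side by side, something like the following could build both URLs. This is only a sketch: `google_query_url` is a hypothetical helper, and it assumes the usual `https://www.google.com/search?q=...` URL form seen in the link earlier in this thread.

```python
from urllib.parse import urlencode


def google_query_url(title, site=None):
    """Build a Google search URL, optionally restricted with the site: operator."""
    query = f"{title} site:{site}" if site else title
    return "https://www.google.com/search?" + urlencode({"q": query})


title = "BERT Has Uncommon Sense: Similarity Ranking for Word Sense BERTology"
plain_url = google_query_url(title)
restricted_url = google_query_url(title, site="aclanthology.org")
```

Opening `plain_url` and `restricted_url` in a browser makes it easy to see whether the page is indexed at all or just ranked too low to show up.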
But I agree with the goal.
As reported by @annargrs, the Google Custom Search Engine box on the Anthology site does not turn up papers from EMNLP 2021. For example, searching for "BERT Has Uncommon Sense: Similarity Ranking for Word Sense BERTology" turns up no results, though the paper exists.