OpenRefine / openrefine.org

Source website for openrefine.org
https://openrefine.org

Add sitemap so search engine results can showcase community and extensions, etc. just a bit better #276

Open thadguidry opened 9 months ago

thadguidry commented 9 months ago

When folks search for "openrefine", many of the search engine results show only the following 4 areas:

  • Download: Download OpenRefine. OpenRefine is free software ...
  • User Manual: This manual is designed to comprehensively walk through ...
  • Running OpenRefine: With openrefine.exe. You can run OpenRefine by double-clicking ...
  • Installing OpenRefine: Install or upgrade OpenRefine · The quick version: Install ...

Ideally we'd have a sitemap, so that we can also easily showcase pages that are (in my mind) much more important: /community (with the Discourse forum), or maybe only the forum, as well as /whats_new and /extensions. Maybe other things from the footer or top nav, too?

Here's the plugin: https://docusaurus.io/docs/api/plugins/@docusaurus/plugin-sitemap
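
For reference, a minimal sketch of how the sitemap options could look in a Docusaurus config. It assumes the site uses `@docusaurus/preset-classic`, which bundles the sitemap plugin (in which case a `sitemap.xml` may already be generated in production builds). The `title`, the `/tags/**` ignore pattern and the option values are illustrative, not taken from the openrefine.org repository.

```typescript
// Hypothetical excerpt of a docusaurus.config.ts; only the sitemap-related part is shown.
// Option names follow the @docusaurus/plugin-sitemap docs linked above; values are examples.
const sitemapOptions = {
  changefreq: "weekly",         // written to <changefreq> for every URL
  priority: 0.5,                // written to <priority> for every URL
  ignorePatterns: ["/tags/**"], // assumed pattern: routes to leave out of the sitemap
  filename: "sitemap.xml",      // output path, relative to the site's baseUrl
};

const config = {
  title: "OpenRefine",          // assumed site title
  url: "https://openrefine.org",
  baseUrl: "/",
  presets: [
    // The classic preset forwards the `sitemap` key to the sitemap plugin.
    ["@docusaurus/preset-classic", { sitemap: sitemapOptions }],
  ],
};

// In a real docusaurus.config.ts, `config` would be the file's default export.
console.log(JSON.stringify(config.presets, null, 2));
```

Pages such as /community, /whats_new and /extensions would be listed automatically as long as they are regular routes of the site and not matched by `ignorePatterns`.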

tfmorris commented 9 months ago

Are you seeing pages which aren't being indexed? Sitemaps help with discoverability, but don't influence prioritization of results. Search engine priorities are driven by searchers, not publishers (and advertisers, of course, but that's a whole 'nother kettle of fish).

thadguidry commented 9 months ago

@tfmorris That is false information as of about circa ~ 2009, which you probably are not aware of, but that's fine. Here's some more information:

Include the URLs in your sitemap that you want to see in Google's search results. Google generally shows the canonical URLs in its search results, which you can influence with sitemaps. If you have different URLs for mobile and desktop versions of a page, we recommend pointing to only one version in a sitemap. However, if you want to point to both URLs, annotate your URLs to indicate the desktop and mobile versions.

It is true, however, that in XML sitemaps, for instance, Google and others ignore certain things:

  • Google ignores <priority> and <changefreq> values.
  • Google uses the <lastmod> value if it's consistently and verifiably (for example by comparing to the last modification of the page) accurate.
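
To illustrate the two points above, here is a rough sketch of a sitemap that carries only `<loc>` and an optional `<lastmod>`, leaving out `<priority>` and `<changefreq>`. The `SitemapEntry` type and the `escapeXml` and `buildSitemap` helpers are hypothetical names made up for this example, and the `lastmod` date is a placeholder; the paths are the ones mentioned earlier in this issue.

```typescript
// One sitemap entry: the page URL and, optionally, its last modification date.
interface SitemapEntry {
  loc: string;
  lastmod?: string; // W3C date format, e.g. "2024-01-15" (placeholder value below)
}

// Escape the five characters that XML treats specially.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap document that contains only <loc> and, when given, <lastmod>.
function buildSitemap(entries: SitemapEntry[]): string {
  const urls = entries.map((entry) => {
    const lastmod = entry.lastmod
      ? `\n    <lastmod>${escapeXml(entry.lastmod)}</lastmod>`
      : "";
    return `  <url>\n    <loc>${escapeXml(entry.loc)}</loc>${lastmod}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}

const xml = buildSitemap([
  { loc: "https://openrefine.org/community" },
  { loc: "https://openrefine.org/whats_new" },
  { loc: "https://openrefine.org/extensions", lastmod: "2024-01-15" }, // placeholder date
]);
console.log(xml);
```

Per the quoted guidance, a `<lastmod>` value is only worth emitting if it reflects the page's real modification date.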

thadguidry commented 9 months ago

I think I see what is going on. Here's what I'm seeing:

Not seeing Forum (instead I see Community being ranked higher), but I think that's perhaps because of its subdomain? Still, I would appreciate it if, when someone typed "openrefine help", the Forum ranked higher across the many search engines (Google, Bing, Yandex, DuckDuckGo, etc.), but I'm just not seeing that. Perhaps we need to improve the metadata on the Forum?

Not seeing Extensions unless I explicitly ask for them with "openrefine extensions". Overall, though, I think the improvement we'd likely want for users would be directly on our Download page: a section that says Extensions, with links to the Extensions page. That seems like a better way to advertise that there are extensions for OpenRefine, because I doubt a new user would think of typing "openrefine extensions"; they are much more likely to spot that we have extensions while on the Download page, as long as Extensions are prominently displayed there.

tfmorris commented 9 months ago

"@tfmorris That is false information as of about circa ~ 2009, which you probably are not aware of, but that's fine."

Srsly?