sneumann opened 4 years ago
Since this has been there, unchanged, since March 15, 2010 without comment, maybe the most expeditious solution is to simply remove it?
I suppose something is supposed to replace the placeholder with content. Yes, it would be awesome if it contained a list of all vignette (HTML) webpages and/or all packages. Indeed, that sitemap.xml can then be used by ELIXIR services to pick up content, e.g. ELIXIR TeSS, but also BioSchemas (cc @AlasdairGray).
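A sitemap along those lines would be straightforward to generate. The following is a minimal sketch (not part of the site's build, and the example page URLs are made up for illustration) that builds a sitemap.xml from a list of page URLs following the sitemap protocol:

```python
# Minimal sketch: build a sitemap.xml string from a list of page URLs,
# using only the Python standard library. The URL list here is a
# hypothetical placeholder; in practice it would come from the site's
# vignette and package indexes.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode", xml_declaration=True)

# Hypothetical example pages for illustration only.
pages = [
    "https://bioconductor.org/packages/release/bioc/html/limma.html",
    "https://bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.html",
]
print(build_sitemap(pages))
```

The real work would of course be enumerating the vignette and package pages, not emitting the XML.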
The site is more than the repository of packages, so sitemap.xml doesn't sound appropriate for this purpose.
For what it's worth, package metadata is already available in machine-readable format as https://bioconductor.org/packages/3.12/bioc/VIEWS, and presumably would also be on individual pages if https://github.com/Bioconductor/bioconductor.org/pull/25 were completed. I can't see the need for a third source of this information.
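For anyone wanting to consume that file: the VIEWS file appears to use DCF (Debian Control File) syntax, i.e. blank-line-separated records of `Field: value` lines with whitespace-indented continuation lines. A minimal parser sketch (the sample record below is made up, in the same shape as a VIEWS entry):

```python
# Hedged sketch of a DCF parser for files like the Bioconductor VIEWS
# index: records separated by blank lines, "Field: value" per line,
# continuation lines starting with whitespace.
def parse_dcf(text):
    """Parse DCF text into a list of {field: value} dicts, one per record."""
    records = []
    for block in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in block.splitlines():
            if line[:1].isspace() and key is not None:
                # Continuation line: append to the previous field's value.
                fields[key] += " " + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        records.append(fields)
    return records

# Tiny made-up sample in the shape of VIEWS records.
sample = """Package: limma
Version: 3.46.0
Title: Linear Models for Microarray Data

Package: edgeR
Version: 3.32.0
"""
print(parse_dcf(sample)[0]["Package"])
```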
The sitemap.xml is not critical, I agree. (Any sitemap.xml has redundant information.)
It is a way of search engine optimisation. OTOH, all content on BioC can be considered well-linked: we don't have dynamically generated content, and no dark corners of non-linked stuff we'd want to be found. In that case, removal of a broken sitemap.* is not a loss.
https://support.google.com/webmasters/answer/156184?hl=en&topic=8476&ctx=topic has more information on when a sitemap is needed and when it is not.
Yours, Steffen
While a sitemap is not necessarily essential for the likes of Google, who have "unlimited" resources to follow links and hopefully traverse a whole site, it is more difficult for others to do the same. For example, we have started scraping Bioschemas content but do not have the resources to do a full web crawl for it, so we are reliant on sitemaps.
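To make the scraper's side concrete: a consumer like this typically just reads the `<loc>` entries out of the sitemap and visits those pages, rather than crawling. A minimal sketch (the sitemap document here is a made-up example, parsed from a string so the snippet needs no network access):

```python
# Sketch of the consumer side: extract page URLs from a sitemap.xml
# instead of crawling the whole site. The sitemap below is a made-up
# example in the standard sitemap namespace.
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text):
    """Return all <loc> URLs found in sitemap XML."""
    root = ET.fromstring(xml_text)
    return [loc.text for loc in root.iter(SITEMAP_NS + "loc")]

sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://bioconductor.org/help/</loc></url>
  <url><loc>https://bioconductor.org/packages/release/BiocViews.html</loc></url>
</urlset>"""
print(sitemap_urls(sitemap))
```

A real scraper would fetch the sitemap over HTTP first, but the parsing step is the same.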
https://www.bioconductor.org/sitemap.xml gives the placeholder content from https://github.com/Bioconductor/bioconductor.org/blob/master/content/sitemap.xml. There was a suggestion in a discussion with @egonw to add a sitemap.xml summarising site content for crawlers, including Google et al. and TeSS. Yours, Steffen