Open jqnatividad opened 6 years ago
Sitemaps have many uses outside of Google Dataset Search. This should probably belong in core rather than ckanext-dcat, with an interface for extensions to:
If we have CKAN create a dataset for the sitemap and upload the sitemap index files via the normal resource mechanism, then you would get S3/Azure/etc. storage for "free" when using ckanext-cloudstorage, along with activity tracking and everything else.
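As a rough illustration of what the generated files could look like before they are uploaded as resources, here is a minimal sketch using only the Python standard library. It follows the sitemaps.org protocol (the `urlset` and `sitemapindex` elements and the 50,000-URL-per-file limit). The function names, file names, and example URLs are hypothetical, and nothing here is existing CKAN or ckanext-dcat code; the upload step itself is left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # per-file limit defined by the sitemaps.org protocol


def build_urlset(urls):
    """Return sitemap XML (bytes) for a list of (loc, lastmod) pairs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in urls:
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:  # lastmod is optional in the protocol
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_sitemap_files(urls, base_url, chunk_size=MAX_URLS_PER_FILE):
    """Split urls into sitemap files plus one index.

    Returns a dict mapping (hypothetical) file names to XML bytes.
    base_url is where the files are assumed to be served from.
    """
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n, start in enumerate(range(0, len(urls), chunk_size), start=1):
        name = f"sitemap-{n}.xml"
        files[name] = build_urlset(urls[start:start + chunk_size])
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/{name}"
    files["sitemap-index.xml"] = ET.tostring(
        index, encoding="utf-8", xml_declaration=True
    )
    return files
```

Each entry in the returned dict could then be attached to the sitemap dataset as an ordinary resource, so whatever storage backend the site uses would apply to it.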
See previous work by data.gov https://github.com/GSA/data.gov/issues/769, https://github.com/GSA/data.gov/issues/798
It might still be useful to implement a provenance-extended sitemap version in ckanext-dcat, for populating properties like sameAs, isBasedOn, etc. in the sitemap file so Google can use this info to identify the canonical dataset in its graph.
https://developers.google.com/search/docs/data-types/dataset#source-provenance
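For reference, sameAs and isBasedOn are schema.org properties that can be set on a Dataset. A minimal sketch of how such provenance markup might be assembled, assuming it is emitted as JSON-LD, is shown below. The helper name and example URLs are hypothetical, and how this markup would be tied to the sitemap file is not specified in this thread.

```python
import json


def dataset_jsonld(name, description, url, same_as=None, is_based_on=None):
    """Build schema.org Dataset markup as a JSON-LD string.

    same_as: URL of the canonical copy of this dataset, if it is a replica.
    is_based_on: URL of a source dataset this one was derived from.
    (Hypothetical helper for illustration, not part of ckanext-dcat.)
    """
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    if same_as:
        doc["sameAs"] = same_as
    if is_based_on:
        doc["isBasedOn"] = is_based_on
    return json.dumps(doc, indent=2)


if __name__ == "__main__":
    print(dataset_jsonld(
        name="Example harvested dataset",
        description="A copy of a dataset harvested from another catalog.",
        url="https://catalog.example.org/dataset/example",
        same_as="https://source.example.org/dataset/example",
    ))
```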
To help Google Dataset Search index the catalog
https://developers.google.com/search/docs/data-types/dataset#sitemap
See https://github.com/ckan/ideas-and-roadmap/issues/220