ioos / ckanext-ioos-theme

IOOS Catalog as a CKAN extension
GNU Affero General Public License v3.0

Re-enable sitemap.xml update process from CKAN database #241

Closed (mwengren closed this 7 months ago)

mwengren commented 1 year ago

A standard sitemap.xml (https://data.ioos.us/sitemap.xml) is needed in order for Google Dataset Search to index the IOOS Catalog.

The ckanext-sitemap extension that did this previously has not been updated for CKAN 2.9.

Ben developed a workaround that queries the database directly to generate the sitemap instead of using the plugin.
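For illustration only, here is a minimal sketch of what generating the sitemap from database rows could look like. It is not Ben's actual workaround. The function name `build_sitemap`, the `(name, metadata_modified)` row shape, and the `/dataset/<name>` URL pattern are assumptions, and the database query itself is left out.

```python
from datetime import datetime
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(datasets, base_url="https://data.ioos.us"):
    """Return sitemap XML (as bytes) for an iterable of dataset rows.

    `datasets` is assumed to yield (name, metadata_modified) tuples,
    e.g. rows selected from the catalog database for active, public
    datasets. This helper is hypothetical, not part of CKAN.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for name, modified in datasets:
        url = ET.SubElement(urlset, "url")
        # Assumed dataset page URL pattern: <base_url>/dataset/<name>
        ET.SubElement(url, "loc").text = f"{base_url}/dataset/{name}"
        # lastmod as a W3C date (YYYY-MM-DD)
        ET.SubElement(url, "lastmod").text = modified.date().isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    rows = [("example-dataset", datetime(2023, 1, 2, 3, 4, 5))]
    print(build_sitemap(rows).decode("utf-8"))
```

Taking the rows as an argument keeps the XML generation independent of how the database is queried.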

Implementation may depend on first resolving the GliderDAC/data.ioos.us DNS mirroring and/or Glider DAC FTP host issues. The Glider DAC community is still using data.ioos.us to FTP files in some cases.

mwengren commented 1 year ago

This is still affected by the Glider DAC DNS problem. The options are to either enable a proxy or wait for the domains to be separated.

mwengren commented 1 year ago

@benjwadams Has the Glider DAC domain (gliders.ioos.us) been isolated from the Catalog domain (data.ioos.us) in DNS yet?

We agreed with @kerfoot originally to make this change on 6/19. We need to follow up about making the DNS change if it hasn't happened already. Once that's completed, then we can restore the sitemap.xml update process, hopefully.

mwengren commented 10 months ago

The https://data.ioos.us/sitemap.xml file is now restored. It currently updates every 6 hours from the datasets in the database.

mwengren commented 8 months ago

@benjwadams I noticed this morning the sitemap.xml disappeared again. Can you look into it?

https://data.ioos.us/sitemap.xml

mwengren commented 8 months ago

Let's ensure that if the update job fails, the existing sitemap.xml file is retained on the server, so that a 404 isn't returned as it is currently.
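One way to get that behavior, sketched under assumptions rather than taken from the actual update job, is to write the new sitemap to a temporary file in the same directory and swap it into place only after generation succeeds. `publish_sitemap` and the `generate` callable are hypothetical names.

```python
import os
import tempfile


def publish_sitemap(generate, target_path):
    """Replace target_path with generate()'s bytes only on success.

    `generate` is a hypothetical zero-argument callable returning the
    sitemap as bytes. If it raises, the file already at target_path is
    left untouched and the temporary file is removed.
    """
    directory = os.path.dirname(os.path.abspath(target_path))
    # The temp file must be on the same filesystem for an atomic rename.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=".sitemap-", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(generate())
            tmp.flush()
            os.fsync(tmp.fileno())
        # mkstemp creates files as 0600; make the result world-readable
        # so a web server can serve it (assumed requirement).
        os.chmod(tmp_path, 0o644)
        # os.replace swaps the file in atomically on POSIX systems.
        os.replace(tmp_path, target_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this pattern, a failed run leaves the previous sitemap.xml in place, so clients keep receiving the last good copy instead of a 404.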

mwengren commented 7 months ago

sitemap.xml appears to be available and updating again. Downstream catalogs such as ODIS and Google should be able to harvest the IOOS Catalog again.

cc @benjwadams @MathewBiddle