NCIOCPL / cgov-digital-platform

The Cancer.gov Digital Communications Platform
GNU General Public License v2.0

There are non-viewable pages appearing in the sitemap (403/404 status) #3018

Open bryanpizzillo opened 3 years ago

bryanpizzillo commented 3 years ago

Issue description

A number of pages with a /media/<id> URL and a 404 status are appearing in the sitemap. These are either media items that should not have a URL on the front-end (e.g. cgov_image and cgov_contextual_image), or items that are not published, which is odd. Either way, there should be no 404s in the Drupal-generated sitemap.xml.

ESTIMATE TBD

Steps to reproduce the issue

  1. Go to https://www.cancer.gov/sitemaps/pageinstructions.xml
  2. See the bad URLs (the /media/<id> entries that return a 404)
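
The manual check above could be scripted. The following is a minimal sketch, not part of the platform's codebase: it assumes the sitemap follows the standard sitemaps.org schema, and the names `extract_locs`, `is_media_url`, `http_status`, `find_non_viewable`, and the `/media/<numeric id>` pattern are hypothetical helpers introduced here for illustration.

```python
import re
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumption: the bad entries look like /media/<numeric id>, as described in the issue.
MEDIA_PATH = re.compile(r"/media/\d+/?$")


def extract_locs(xml_text):
    """Return every <loc> URL found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


def is_media_url(url):
    """Return True if the URL ends in a /media/<id> path."""
    return MEDIA_PATH.search(url) is not None


def http_status(url, timeout=10):
    """Return the HTTP status code of a HEAD request to url."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is what we want to record.
        return err.code


def find_non_viewable(urls, status_of=http_status):
    """Return (url, status) pairs for URLs that answer 403 or 404."""
    bad = []
    for url in urls:
        status = status_of(url)
        if status in (403, 404):
            bad.append((url, status))
    return bad
```

To run it against the live site, one would download the sitemap from step 1 (for example with `urllib.request.urlopen(...).read()`), pass the result to `extract_locs`, and hand the URLs to `find_non_viewable`. The `status_of` parameter is injectable so the filtering logic can be exercised without network access.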

What's the expected result?

What's the actual result?

andyvanavery31 commented 6 months ago

Determine whether a patch exists for the sitemap. The sitemap is generated by the simple site map generator, so any patch would need to stop it from listing unpublished content.