department-of-veterans-affairs / va.gov-cms

Editor-centered management for Veteran-centered content.
https://prod.cms.va.gov
GNU General Public License v2.0

Discovery: Audit sitemap.xml for all CMS-driven pages (anecdote: Resources & Support articles missing) #10784

Closed wesrowe closed 2 years ago

wesrowe commented 2 years ago

Description

It was observed around 9/20/22 that some CMS-driven pages are missing from sitemap.xml. The initial anecdotes were about R&S pages, which were not performing as expected in site search. But the same root cause could affect other types of pages.

We need to figure out how sitemap.xml is being generated from the CMS and why some pages are missing.

(NOTE: there is a competing SiteMap.xml that may be problematic, but the Platform Crew is tasked with looking into that.)

Slack threads pertaining to the two competing sitemap.xml files:

Acceptance Criteria

CMS Team

Please check the team(s) that will do this work.

wesrowe commented 2 years ago

@EWashb @swirtSJW - Dave asked me to create this issue and pass it to you. R&S is our product but the issue is very CMS.

swirtSJW commented 2 years ago

Do we have actual examples of pages missing from the sitemap? That one thread took a weird turn where the page of concern turned out to be in the sitemap.

ndouglas commented 2 years ago

I was curious about this, so I took a peek.

I started off by running the following query to get the (node) pages with paths starting with /resources/:

SELECT * FROM path_alias WHERE alias LIKE '/resources/%' AND path LIKE '/node/%';

That returns 316 rows, so we'd seem at first glance to have 316 nodes that should appear in the sitemap under /resources.

I found 84 in the actual sitemap (excluding tag pages and a couple other seemingly irrelevant pages), which sounds a bit alarming.
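For anyone repeating this step, here is a minimal Python sketch of one way to pull the /resources/ paths out of a sitemap. It is not the method used above; the function name, the trailing-slash handling, and the assumption that the file follows the standard sitemaps.org urlset schema are all mine. It does not filter out tag pages, so that part would still be manual.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace from the sitemaps.org protocol; assumes a standard <urlset> file.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def resource_paths_from_sitemap(xml_text, prefix="/resources/"):
    """Return the set of URL paths in the sitemap that start with `prefix`.

    Hypothetical helper: trailing slashes are stripped so the paths can be
    compared against aliases stored without one.
    """
    root = ET.fromstring(xml_text)
    paths = set()
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        path = urlparse((loc.text or "").strip()).path.rstrip("/")
        # The bare landing page ("/resources") no longer matches the prefix
        # after stripping, so only pages beneath it are kept.
        if path.startswith(prefix):
            paths.add(path)
    return paths
```

Called with the text of a downloaded sitemap.xml, this returns paths such as /resources/helpful-va-phone-numbers, ready to paste into a NOT IN list or compare in code.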

So I checked how many we should actually have, by querying for published nodes that have a corresponding path alias:

SELECT * FROM node_field_data INNER JOIN path_alias ON path_alias.path=CONCAT('/node/', node_field_data.nid) WHERE node_field_data.status=1 AND path_alias.alias LIKE '/resources/%';

That returns 230 rows, but I see that many or most of them are q_a type, which IIRC don't have their own page.

So filtering those out:

SELECT * FROM node_field_data INNER JOIN path_alias ON path_alias.path=CONCAT('/node/', node_field_data.nid) WHERE node_field_data.status=1 AND path_alias.alias LIKE '/resources/%' AND node_field_data.type != 'q_a';

We get back 64 rows.

If I check those against the actual URLs I found in the sitemap:

SELECT * FROM node_field_data INNER JOIN path_alias ON path_alias.path=CONCAT("/node/", node_field_data.nid) WHERE node_field_data.status=1 AND path_alias.alias LIKE '/resources/%' AND type !="q_a" AND path_alias.alias NOT IN ("/resources/how-to-get-a-premium-ds-logon-account-online", "/resources/how-to-download-and-open-a-vagov-pdf-form", "/resources/how-to-file-a-va-travel-reimbursement-claim-online", "/resources/how-to-set-up-direct-deposit-for-va-travel-pay-reimbursement", "/resources/how-to-change-your-address-in-your-vagov-profile", "/resources/how-to-change-direct-deposit-information-for-va-benefits", "/resources/how-to-check-your-va-claim-appeal-or-decision-review-status-online", "/resources/how-to-check-in-with-your-smartphone-for-some-va-appointments", "/resources/how-are-pension-benefits-and-disability-compensation-different", "/resources/does-va-cover-nursing-home-assisted-living-or-other-long-term-care", "/resources/can-i-get-free-health-care-and-prescriptions-as-a-veteran", "/resources/whats-a-veteran-health-id-card-vhic-and-how-do-i-get-one", "/resources/will-i-have-to-pay-back-the-gi-bill-benefits-i-used-if-i-fail-a-class", "/resources/should-i-create-a-logingov-or-idme-account-to-sign-in-to-vagov", "/resources/are-service-dogs-allowed-in-va-facilities", "/resources/what-if-i-dont-want-a-fiduciary-anymore", "/resources/what-if-my-school-closes-temporarily-because-of-a-natural-disaster", "/resources/can-i-get-a-replacement-gi-bill-benefit-certificate-of-eligibility", "/resources/how-do-i-get-college-credits-for-my-military-service", "/resources/my-healthevet-faqs", "/resources/veteran-identification-card-vic-faqs", "/resources/commissary-and-exchange-privileges-for-veterans", "/resources/claim-status-tool-faqs", "/resources/gi-bill-wave-faqs", "/resources/government-headstones-and-markers-faqs", "/resources/connected-apps-faqs", "/resources/ds-logon-faqs", "/resources/direct-deposit-for-your-va-benefit-payments", 
"/resources/managing-your-vagov-profile", "/resources/how-va-education-benefit-payments-affect-your-taxes", "/resources/waivers-for-va-benefit-debt", "/resources/submitting-a-financial-status-report-va-form-5655", "/resources/va-debt-management", "/resources/ask-va-replaces-iris-and-the-gi-bill-help-portal", "/resources/gi-bill-and-other-va-education-benefit-payments-faqs", "/resources/privacy-and-security-on-vagov", "/resources/verifying-your-identity-on-vagov", "/resources/signing-in-to-vagov", "/resources/your-intent-to-file-a-va-claim", "/resources/non-compensable-disability", "/resources/montgomery-gi-bill-refunds", "/resources/how-your-reason-for-withdrawing-from-a-class-affects-your-va-debt", "/resources/the-pact-act-and-your-va-benefits", "/resources/how-do-i-change-my-name-in-my-deers-record", "/resources/can-i-be-buried-in-arlington-national-cemetery", "/resources/what-does-burial-in-a-va-national-cemetery-include", "/resources/can-i-plan-ahead-for-my-burial-in-a-va-national-cemetery", "/resources/can-i-get-a-loan-through-my-va-life-insurance-policy", "/resources/how-can-i-find-a-va-facility", "/resources/how-can-i-stay-informed-about-covid-19-vaccines-at-va", "/resources/what-if-i-cant-sign-in-to-vagov-because-my-password-doesnt-work", "/resources/what-if-i-dont-have-a-bank-account-but-i-want-to-use-direct-deposit", "/resources/request-a-discharge-upgrade-or-correction", "/resources/find-apps-you-can-use", "/resources/requesting-a-replacement-government-headstone-or-marker", "/resources/how-to-get-free-language-assistance-from-va", "/resources/life-insurance-if-you-have-preexisting-conditions", "/resources/how-to-find-out-if-you-should-get-a-higher-tsgli-payment", "/resources/what-your-claim-status-means", "/resources/deciding-how-much-life-insurance-to-get", "/resources/how-to-get-help-with-concerns-at-a-va-health-facility", "/resources/reimbursed-va-travel-expenses-and-mileage-rate", 
"/resources/your-civil-rights-and-how-to-file-a-discrimination-complaint", "/resources/how-to-change-your-legal-name-on-file-with-va", "/resources/va-covid-19-debt-relief-options-for-veterans-and-dependents", "/resources/what-your-decision-review-or-appeal-status-means", "/resources/get-a-premium-my-healthevet-account", "/resources/evidence-to-support-va-pension-dic-or-accrued-benefits-claims", "/resources/in-state-tuition-rates-under-the-veterans-choice-act", "/resources/how-we-determine-your-post-911-gi-bill-coverage", "/resources/how-we-determine-your-percentage-of-post-911-gi-bill-benefits", "/resources/getting-a-gi-bill-extension", "/resources/choosing-between-urgent-and-emergency-care", "/resources/helpful-va-phone-numbers", "/resources/change-your-address-on-file-with-va", "/resources/combat-related-special-compensation-crsc", "/resources/the-pact-act-and-your-va-benefits-tag", "/resources/the-veterans-health-information-exchange-vhie", "/resources/how-to-get-your-medical-records-from-your-va-health-facility", "/resources/life-insurance-dividend-payment-options", "/resources/the-pact-act-and-your-va-benefits-esp", "/resources/monkeypox-information-for-veterans", "/resources/choosing-a-decision-review-option", "/resources/covid-19-testing-at-va");

We get an empty set.
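The same check can be done outside the database as a set difference. A rough sketch, assuming you have the published aliases and the sitemap paths as two lists of strings (the function names are hypothetical, and stripping trailing slashes is my assumption about how the two sources might differ):

```python
def normalize(path):
    """Strip any trailing slash so '/resources/a/' and '/resources/a' match."""
    return path.rstrip("/")


def missing_from_sitemap(cms_aliases, sitemap_paths):
    """Return the CMS aliases that have no matching path in the sitemap."""
    in_sitemap = {normalize(p) for p in sitemap_paths}
    return sorted(a for a in cms_aliases if normalize(a) not in in_sitemap)
```

An empty list here corresponds to the empty set returned by the NOT IN query: every published alias has a matching sitemap entry.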

So I'm not able to confirm that we have resource and support articles (or any related type, minus Q&A which again, IIRC, aren't supposed to be there) in the CMS that are missing in the sitemap. There totally might be a flaw in my approach, though, as I'm a well-known foole and I haven't yet tapped any caffeine sources.

wesrowe commented 2 years ago

@swirtSJW - no, Dave didn't provide specific examples. He may have been under the mistaken impression that the PDF download article was missing.

EWashb commented 2 years ago

@wesrowe our team didn't find anything missing from the sitemap. I spoke with Dave and he would like me to pass this back to you. The issue they are encountering is not caused by the sitemap; it may instead have something to do with the accuracy of search within R&S.

wesrowe commented 2 years ago

Thanks for the quick investigation, @ndouglas. I'm closing this ticket and will write a new one specifically for fixing R&S search results.

wesrowe commented 2 years ago

amending that – the next issues are already written: #10755