department-of-veterans-affairs / va.gov-cms

Editor-centered management for Veteran-centered content.
https://prod.cms.va.gov
GNU General Public License v2.0

[Zero Silent Failures SPIKE] VA.gov home page "GovDelivery" #19242

Closed FranECross closed 1 month ago

FranECross commented 1 month ago

Description

The following feature needs to be evaluated to determine whether it meets the standards for 'zero silent failures': VA.gov Homepage > email to GovDelivery, which is a user-facing transaction that is submitted to the back-end system. If evaluating the checklist reveals any gaps, such as missing monitoring, we will file tickets to update the implementation.

OCTODE guidance states:

Problem Statement:

Artifacts

User story

AS A I WANT SO THAT

Engineering notes / background

If you need to set up monitoring in Datadog:

Set up monitoring in Datadog

Follow this guidance on endpoint monitoring to get going. Then follow the guidance on monitoring performance to get up to speed with Datadog.

Examples

Additional examples

Analytics considerations

Quality / testing notes

Acceptance criteria

Checklist

Start

Monitoring

⚠️ Failure to have endpoint monitoring in place is a blocking QA standard at Staging review as of 9/10/24. If you answered no to any of the questions above, you will be blocked from shipping at the Staging review touchpoint in Collab Cycle.

Reporting errors

Documentation

User experience

Learn how to create a user data flow diagram

File silent errors issues in GitHub

We don't have any silent errors!

Great! Please let us know that you went through the checklist above as a team and did not find any silent failures in our Slack channel: #zero-silent-failures. You don't have to hang out in there once you have notified us. Just pop in, tell us who you are (which team and in which portfolio) and that no failures were found. Thanks!

randimays commented 1 month ago

@FranECross I've updated the checklist in the ticket description with some information regarding this feature and my recommendations on where we should go from here.

There is one item in the ticket description that I wasn't sure what to do with:

Has the owner of the system of record receiving the user's data indicated in writing that their system notifies or resolves 100% of fatal errors once in their custody?

I am pretty sure the answer to this is "No," but I'm not sure how to verify. We have plenty of other products across the org that use GovDelivery in more complex ways, but I couldn't find any documentation anywhere that outlines GovDelivery error handling for our specific use case.

Otherwise, I've gathered all the info below for easier readability and clarity on probable next steps.

Email signup on the homepage

Form setup

The code for the email signup form at the bottom of the home page lives in content-build, in email-update-signup.drupal.liquid, as a raw HTML form. The <form> element uses an HTTP POST request to submit several values to GovDelivery at this URL:

https://public.govdelivery.com/accounts/USVACHOOSE/subscribers/qualify

The data we send via this POST request is:

Screenshots

Screenshot 2024-09-23 at 3 58 27 PM

Error handling

The email address form field does not validate user input and can be submitted even if the field is blank. In that case, the user is redirected to the GovDelivery URL with an error at the top of that page that says “Can’t save subscriber because of the following 2 errors.” The user can then enter their email address and proceed with signup successfully.

Once the GovDelivery page is reached, there is some validation on the email that is passed through, but we don't have any handling on our side to prevent the submission.

Screenshot 2024-09-23 at 3 58 27 PM

If the utf8 and category_id values are missing from the POST request but the email address is filled out, the form still successfully redirects to the GovDelivery URL as though those values are present.

In short, it's hard for the client side (front end) to create a bad request to send to GovDelivery without malicious interference or some strange network / routing issue.
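The missing client-side check described in this section could be sketched roughly as follows. This is only an illustration: `validateSignupEmail`, its error messages, and the loose pattern are hypothetical, not code from content-build or vets-website, and the pattern is not GovDelivery's own validation logic.

```typescript
// Hypothetical pattern: accepts anything shaped like something@something.tld.
// It is intentionally loose and is an assumption, not GovDelivery's rules.
const LOOSE_EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Hypothetical helper: returns an error message to show the user,
// or null when the value looks safe to submit to GovDelivery.
function validateSignupEmail(rawValue: string): string | null {
  const value = rawValue.trim();
  if (value === "") {
    // Covers the blank-field case that currently reaches GovDelivery.
    return "Enter an email address";
  }
  if (!LOOSE_EMAIL_PATTERN.test(value)) {
    return "Enter a valid email address";
  }
  return null;
}
```

A check like this would stop the blank submission before the redirect, leaving GovDelivery's own validation as a second line of defense rather than the only one.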

What happens if the GovDelivery URL gets changed?

If an engineer inadvertently changes the GovDelivery URL the form is pointing to, or if the GovDelivery page itself moves or is otherwise taken down, the user will still be taken to that broken page when the form is submitted from VA.gov.

Screenshot 2024-09-23 at 3 58 27 PM
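One way to guard against an inadvertent edit on our side is a small check that a unit test or build step could run against the form's action attribute. The sketch below is an assumption about how that might look; `isExpectedFormAction` is a hypothetical name, and only the URL itself comes from this ticket.

```typescript
// URL taken from the form described in this ticket.
const EXPECTED_FORM_ACTION =
  "https://public.govdelivery.com/accounts/USVACHOOSE/subscribers/qualify";

// Hypothetical check: true only when the form's action matches the
// expected URL, ignoring surrounding whitespace and one trailing slash.
function isExpectedFormAction(action: string): boolean {
  const normalized = action.trim().replace(/\/$/, "");
  return normalized === EXPECTED_FORM_ACTION;
}
```

This only catches changes made in our own code. It cannot detect the GovDelivery page moving or going down, which still needs external monitoring.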

Scenarios I tested in production

All of these assume the GovDelivery page / services are working as expected on their own.

The only scenarios that can actually be affected by the user are scenarios 1 and 4. The other two scenarios would result from a data transmission error via the POST request or some other kind of client-side error.

1. Email address, utf8 and category_id values are all sent

2. Only email address is sent; no utf8 or category_id

3. No email address is sent (field is blank), no utf8 or category_id

4. No email address is sent (field is blank), but utf8 or category_id are included
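For test coverage, the four scenarios above can be expressed as a small classifier. This is a sketch under assumptions: the `SignupPayload` shape and the `classifyScenario` name are hypothetical, and a combination the ticket does not list (an email address with only one of the two other values) is returned as null rather than guessed at.

```typescript
// Field names follow the ticket; the object shape is an assumption.
interface SignupPayload {
  email?: string;
  utf8?: string;
  category_id?: string;
}

type Scenario = 1 | 2 | 3 | 4;

// Maps a payload to the numbered scenario it corresponds to,
// or null when the combination is not one of the four listed.
function classifyScenario(payload: SignupPayload): Scenario | null {
  const hasEmail = (payload.email || "").trim() !== "";
  const hasUtf8 = payload.utf8 !== undefined;
  const hasCategory = payload.category_id !== undefined;

  if (hasEmail && hasUtf8 && hasCategory) return 1;
  if (hasEmail && !hasUtf8 && !hasCategory) return 2;
  if (!hasEmail && !hasUtf8 && !hasCategory) return 3;
  if (!hasEmail && (hasUtf8 || hasCategory)) return 4;
  return null; // email plus exactly one other value: not covered above
}
```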

Next steps

I created two tickets to start with:

  1. Create monitoring and add a synthetic in Datadog for this flow: https://github.com/department-of-veterans-affairs/va.gov-cms/issues/19312 (front end work)
  2. Create an error state UX for this flow: https://github.com/department-of-veterans-affairs/va.gov-cms/issues/19314 (UX work)

The error handling implementation will depend on how we want to present the errors visually. We should definitely move the email signup form out of content-build and into vets-website as a React widget. This will give us more flexibility to use field validation and error handling, and will let us dynamically add or remove elements in error states.

We will need a ticket for any code changes to this form once we agree as a team how to handle the errors both in concept and UX.

@jilladams @dsasser @chriskim2311: feel free to weigh in here about thoughts re: next steps.

@FranECross I'll assign this ticket to you as I think I'm finished with it for the purposes of the audit; please let me know if I missed anything.

FranECross commented 1 month ago

@randimays Thanks so much! I'll see if I can track down who might be able to provide either documentation or direction regarding GovDelivery error handling for our specific use case. Thanks for all your hard work on this. I'll review and take next steps.

jilladams commented 1 month ago

@randimays could you move your Mural flow into the Homepage room here: https://app.mural.co/t/departmentofveteransaffairs9999/r/1568899091707?folderUuid=e23f6282-0491-4c5e-89e2-cb756dfa5f27

randimays commented 1 month ago

@jilladams done. Thanks!

randimays commented 1 month ago

@FranECross Just doing an end-of-sprint check: is there any action I need to take on this ticket?

FranECross commented 1 month ago

@randimays No other actions for you on this ticket. Okay to close.

randimays commented 1 month ago

Closing per Fran's comment above.