SamStuckey commented 1 year ago

Product Outline

High Level User Story

As a developer, I have insight into our application's discrete submission actions.

This work is the first of two steps to provide general clarity into our 526 form flow. This step will generate information. In the next step (out of scope for this epic), we will better organize that information into actionable data via Sentry dashboards and DataDog(?).

Current:

We have identified a list of discrete submission actions inside our 526 form flow that represent potential "black holes" for lost information. This problem was first identified with regard to overall 526 form submissions being "lost". A "paper" submission failover solution has been added to reduce our number of failed submissions, but we still have many other discrete, unreported or under-reported possible points of failure in the 526 app flow.

Future:

We have clear action logging around discrete submission actions. The output of this work will enable us to create more meaningful dashboards.

In scope

Wrap KPIs in logging and send the information to our log files (Rails.logger).
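
A minimal sketch of what wrapping a single action in logging could look like. The `with_action_logging` helper, the `elapsed_ms` helper, and all key names are hypothetical and not taken from the codebase. The example uses Ruby's standard-library `Logger` so that it runs on its own; in the application the logger would presumably be `Rails.logger`.

```ruby
require 'logger'
require 'json'
require 'stringio'

# Hypothetical helper: milliseconds elapsed since a monotonic start time.
def elapsed_ms(started)
  ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round(1)
end

# Hypothetical helper: logs the start, outcome, and duration of one action.
# The block's return value is passed through; exceptions are logged and re-raised.
def with_action_logging(logger, action:, user_uuid:)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  logger.info({ action: action, user_uuid: user_uuid, status: 'started' }.to_json)
  result = yield
  logger.info({ action: action, user_uuid: user_uuid, status: 'success',
                duration_ms: elapsed_ms(started) }.to_json)
  result
rescue StandardError => e
  logger.error({ action: action, user_uuid: user_uuid, status: 'failure',
                 error_class: e.class.name, error_message: e.message,
                 duration_ms: elapsed_ms(started) }.to_json)
  raise
end

# Usage with an in-memory logger (stand-in for Rails.logger).
demo_io = StringIO.new
demo_logger = Logger.new(demo_io)
with_action_logging(demo_logger, action: 'flashes', user_uuid: 'abc-123') { :ok }
puts demo_io.string
```

Re-raising after logging keeps existing error handling and retry behavior unchanged, so the wrapper only adds visibility.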

Out of scope

Organizing this information into actionable data in Sentry / DataDog dashboards.

Hypothesis

If we make these changes, the enhanced visibility into the "under the hood" actions of our 526 app will create new insights and actionable data, which in turn will allow us to iterate on opportunities including bounce rate investigations, general debugging, application health monitoring, and improved failure notifications.

Definition of done

Each of the KPIs (Key Points of Interest) listed below has been investigated and (if required) reinforced with the appropriate action logging and error handling. Changes and findings are documented.

Documentation

Use the following document to track research into each KPI / ticket.

We want the following Metrics/Data for each KPI action:

User_uuid
Action being performed (Form ID or action description)
Upstream or downstream system involved
If the action is retry-able:
- Attempt counter
  - Whether it will be retried or is the final attempt
Success or failure status
HTTP status from the upstream/downstream service
HTTP response body if NOT a 200 (successful) response
Any ID that is the result of a creation, update, or delete (IDs returned from third-party services, internal VA.gov DB record IDs)
Exception message/stack trace, if the failure occurs outside of (before or after) the external call
Duration
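
The list above could translate into one structured log entry per action, sketched below. The `build_kpi_log_entry` method, its parameters, and every key name are hypothetical, and the sample values are illustrative only. The response body is kept only for non-200 responses, following the wording of the list literally.

```ruby
require 'json'

# Hypothetical builder for one KPI log entry; key names are illustrative only.
def build_kpi_log_entry(user_uuid:, action:, system:, success:, duration_ms:,
                        attempt: nil, max_attempts: nil, http_status: nil,
                        response_body: nil, record_ids: {}, exception: nil)
  entry = {
    user_uuid: user_uuid,
    action: action,
    system: system,
    status: success ? 'success' : 'failure',
    duration_ms: duration_ms
  }

  # Retry details apply only to retry-able actions.
  if attempt && max_attempts
    final = attempt >= max_attempts
    entry[:attempt] = attempt
    entry[:final_attempt] = final
    entry[:will_retry] = !success && !final
  end

  entry[:http_status] = http_status if http_status
  # Keep the response body only when the response was not a 200.
  if response_body && http_status && http_status != 200
    entry[:http_response_body] = response_body
  end
  entry[:record_ids] = record_ids unless record_ids.empty?

  # Exception details cover failures outside the external call.
  if exception
    entry[:exception_message] = exception.message
    entry[:exception_backtrace] = Array(exception.backtrace).first(10)
  end

  entry
end

# Illustrative values only: a failed second attempt out of three.
entry = build_kpi_log_entry(
  user_uuid: 'abc-123', action: 'Form 21-0781 submission', system: 'EVSS',
  success: false, duration_ms: 412.7, attempt: 2, max_attempts: 3,
  http_status: 503, response_body: 'Service Unavailable'
)
puts entry.to_json
```

Emitting the same keys for every KPI is what would later make the entries easy to group and filter in a dashboard.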

Tips for workflow

The assigned dev should ask themselves:

"What are all the possible things my thing can do?"
"What do I need to know about those things?"

The desired result of this information, once it is organized in a dashboard, is to be able to look at it and see at a glance, "is my 'thing' healthy?" If the answer is NO, we want every possible bit of data that could help investigate.

Historical Context

This epic spawned this work, which resulted in these two tickets for hand-off.

It was determined that ticket #2 should (or could) be done in an iteration after we have a clear target for what we want our Sentry dashboards to look like.
Ticket 1 lists several versions of that same dashboard work relative to Sentry and DataDog. That work was determined to be out of scope, as documented here. Research related to the hand-off of the error logging work is documented here, which in turn led to the creation of this document for planning the ongoing work.

TL;DR

All of the historical work, ticketing, and research above is either encapsulated by the action items outlined in this epic or slated for a later iteration, e.g. dashboard refinement.

KPIs

Each of these discrete actions is represented by a ticket in this epic. Each ticket will require at least investigation and documentation, as well as possibly (probably) code enhancements to add appropriate logging.

[x] Overall 526 Submission
[x] Evidence Upload Retrieval
[x] Evidence Uploads
[x] Form 4142(a)
[x] Form 21-0781(a)
[x] Form 21-8940
[x] BDD Instructions
[x] Flashes
[x] Backup Submission (back-up path)
[x] Evidence Upload Retrieval (back-up path)
[x] Complete 526 PDF from EVSS (back-up path)
[x] Intent to file at the beginning of the form
[x] PDF generation (not sending) of files locally on the filesystem
[x] Pre-fill of user information
[x] Evidence UPLOAD to S3

SamStuckey commented 1 year ago

~~TODO - add individual tickets with context for each KPI~~ (DONE)

SamStuckey commented 1 year ago

~~TODO - clean up individual ticket context.~~ (DONE)

department-of-veterans-affairs / va.gov-team

526 Logging and Error Reporting #60952
