department-of-veterans-affairs / va.gov-team

Public resources for building on and in support of VA.gov. Visit complete Knowledge Hub:
https://depo-platform-documentation.scrollhelp.site/index.html

[Feature] Discover 4142 silent failures (since last remediation) #94374

Closed lisacapaccioli closed 1 week ago

lisacapaccioli commented 1 week ago

Historical silent failures of form 4142, up to a specific date, were included in the Code Yellow 3 (CY3) remediation. I'm not sure of the date, but it's findable in Slack/tickets. We then deployed the failure email. There is a gap between the day we pulled the CY3 remediation files and the day we deployed the email, so we likely need to investigate whether every failure in that window was resubmitted or whether any are still silent. That's a discrete task.

The outcome of this phase should be an artifact that shows when logs are available, what information we have, and volume counts.

The discovery work should solve for these:

Notes

See the CY3 knowability table and log timeline for an example of the type of information we'll need. The artifact can be a doc, table, diagram, or a combination.

This will help VBA make decisions on what to do with the data we have. We likely won't be resubmitting anything on our own, directly into the efolders. Why? Because without an EP, no one will know to look at and re-review it. This process will mirror other code yellows in that we'll be creating spreadsheets, delivering documents, and working with VBA to take action.

lisacapaccioli commented 1 week ago

Scott 10/9/24: I believe the historic 4142 failures were recorded for the time period 5/15/19 to 1/23/24.

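freeheeling commented 1 week ago

# Rails console: 4142 job status records updated on or after 2024-01-22, excluding the 'success' and 'try' statuses
job4142 = 'SubmitForm4142Job'
start_date = Date.new(2024, 1, 22)
js = Form526JobStatus.where(job_class: job4142).where.not(status: %w[success try]).where('updated_at >= ?', start_date)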
js.size
=> 1
js[0].status
=> 'exhausted'
js[0].updated_at
=> 'Mon, 22 Jan 2024 17:32:23'

A Datadog search for "Submit Form 4142 Retries exhausted" over the past year yields 5 results. All are accounted for in the previous report of historic 4142 failures, and the submission ID of the most recent one matches the single 'exhausted' Form526JobStatus record above.

Both results support the conclusion that there have been no silent failures of form 4142 since the initial discovery work [1][2], which analyzed all log records up to January 23, 2024. Nor have any failure emails been dispatched since September 10, 2024, when the corresponding mailer notification was enabled.
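
For anyone repeating this check outside the Rails console, the gap-window filter and the volume counts requested in the issue could be sketched in plain Ruby roughly as follows. This is only an illustration under assumptions: `JobStatus` is a hypothetical stand-in for `Form526JobStatus`, `gap_failures` is a hypothetical helper, the window is assumed to be strictly between the two dates quoted in this thread, and the sample rows are invented to exercise the filter. They do not reflect the actual findings, which were zero failures in the gap.

```ruby
require 'date'

# Hypothetical stand-in for a Form526JobStatus row (the real model is an
# ActiveRecord class and is not loaded here).
JobStatus = Struct.new(:job_class, :status, :updated_at, keyword_init: true)

# Dates quoted in this thread. Treating the window as strictly between them
# is an assumption of this sketch.
REMEDIATION_CUTOFF = Date.new(2024, 1, 23)  # logs analyzed up to this date
EMAIL_ENABLED_ON   = Date.new(2024, 9, 10)  # failure email enabled
EXCLUDED_STATUSES  = %w[success try].freeze # same statuses excluded in the console query

# Returns 4142 job records with a non-excluded status that fall inside the gap.
def gap_failures(records)
  records.select do |r|
    r.job_class == 'SubmitForm4142Job' &&
      !EXCLUDED_STATUSES.include?(r.status) &&
      r.updated_at > REMEDIATION_CUTOFF &&
      r.updated_at < EMAIL_ENABLED_ON
  end
end

# Invented sample rows, for illustration only.
sample = [
  JobStatus.new(job_class: 'SubmitForm4142Job', status: 'exhausted', updated_at: Date.new(2024, 1, 22)),
  JobStatus.new(job_class: 'SubmitForm4142Job', status: 'success',   updated_at: Date.new(2024, 3, 1)),
  JobStatus.new(job_class: 'SubmitForm4142Job', status: 'exhausted', updated_at: Date.new(2024, 5, 2)),
  JobStatus.new(job_class: 'SomeOtherJob',      status: 'exhausted', updated_at: Date.new(2024, 5, 2))
]

gap = gap_failures(sample)

# Volume counts per month, the kind of figure the discovery artifact asks for.
counts_by_month = gap.group_by { |r| r.updated_at.strftime('%Y-%m') }
                     .transform_values(&:size)

puts "Failures in gap window: #{gap.size}"
counts_by_month.each { |month, count| puts "#{month}: #{count}" }
```

Against production data the same filter would be expressed as an ActiveRecord query like the one in the console session above, with an upper bound on `updated_at` added.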