CDCgov / prime-reportstream

ReportStream is a public intermediary tool for delivery of data between different parts of the healthcare ecosystem.
https://reportstream.cdc.gov
Creative Commons Zero v1.0 Universal

Resend/Reprocess Functionality #4580

Open TomNUSDS opened 2 years ago

TomNUSDS commented 2 years ago

Feature initially for Admins only, eventually allow Orgs to do this action as well.

See user story details :lock: from @MauriceReeves-usds

Sub-Task:

Simplified overview of Report Stream

Basic Resend (Reforwarding)

Solves Issue: A Receiver was unable to get the data. Maybe an authentication issue or service down.

Solution: No data needs to be changed. Just an event triggered to retry a send to a specific receiver (or set of receivers).

Note: This function exists today in the command line tool/API but does not seem to be working.

Reprocess Resend approach 1 (as of today)

Solves Issue: A Sender's data needs minor fixing before it can be processed.

  1. FIND data that needs fixing.
  2. Download file(s) to local machine
  3. Extract and fix test results as individual lines (csv)
  4. Upload back into the bucket, similar to how a sender would. This triggers the rest of the process to run as usual.
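
Step 3 above (extract and fix individual CSV lines) could be scripted rather than done by hand. The following is only a minimal sketch, not an existing ReportStream tool: the function name `fix_rows` and the `message_id` and `patient_state` columns in the usage note are hypothetical, and a real sender file will have its own columns.

```python
import csv
import io
from typing import Callable, Dict

Row = Dict[str, str]


def fix_rows(csv_text: str,
             needs_fix: Callable[[Row], bool],
             fix: Callable[[Row], Row]) -> str:
    """Return CSV text holding only the rows that needed fixing, with the fix applied.

    Hypothetical helper: column names and fix rules are supplied by the caller.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None:
        raise ValueError("input has no header row")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Rows that are already fine are left out, so only corrected lines are re-uploaded.
        if needs_fix(row):
            writer.writerow(fix(row))
    return out.getvalue()
```

For example, `fix_rows(text, lambda r: r["patient_state"] == "", lambda r: {**r, "patient_state": "NM"})` would keep only the rows with a blank state and fill it in. The resulting text is what would be uploaded in step 4.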

Reprocess Resend approach 2 (as of today)

Solves Issue: A Sender's data needs major fixing before it can be processed. Filters need to be modified to re-run.

Solution:

  1. FIND data that needs fixing.
  2. Download data file(s) to local machine
  3. Modify data files (sometimes just some lines).
  4. Sync filters on localhost with production so they are consistent.
  5. Run filters locally to make sure output is correct.
  6. Upload output
  7. Trigger forward.

Reprocess Resend approach 3 FULLY MANUAL (as of today)

anshulkumar-usds commented 2 years ago

Duplicate issue: https://app.zenhub.com/workspaces/onboarding--operations-6166edbd409257001e09f1b2/issues/cdcgov/prime-reportstream/4306

TomNUSDS commented 2 years ago

Update: @clediggins-usds is helping investigate this feature. Resubmission needs some features (like a unique id to tie to a record) and some deep thought (like how to link a resubmission back to the original for audit purposes).

jimduff-usds commented 2 years ago

A minor FYI: The Basic Resend says "may result in multiple reports being generated to difference receivers".

That is not the case. The api/requeue/send capability only ever sends one file at a time and it never generates data - it merely grabs an existing report (that must be the output of the Batch step) and attempts to re-send it.

Current syntax for calling it as of this writing is:

curl -X POST -H "content-length:0" -H "x-functions-key:<SECRET>" "https://prime.cdc.gov/api/requeue/send?reportId=861f31ec-276c-489c-a733-b55f701bcb4a&receiver=nm-doh.elr"

The secret is for the requeue Azure function; NOT for any of the regular submission endpoints.

Because it operates within the existing code, any data re-sent gets correctly linked in the auditing tables.
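
For anyone scripting this, the same call can be built in Python. This is a sketch under stated assumptions, not an official client: the URL, the `reportId` and `receiver` parameters, and the `x-functions-key` header come from the curl example in this comment, while the function names and the way the key is passed in are hypothetical.

```python
import urllib.parse
import urllib.request

# Endpoint taken from the curl example; everything else here is illustrative.
REQUEUE_URL = "https://prime.cdc.gov/api/requeue/send"


def build_requeue_request(report_id: str, receiver: str,
                          functions_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST that asks for one report to be re-sent."""
    query = urllib.parse.urlencode({"reportId": report_id, "receiver": receiver})
    return urllib.request.Request(
        f"{REQUEUE_URL}?{query}",
        data=b"",  # empty body, matching content-length:0 in the curl call
        method="POST",
        headers={"Content-Length": "0", "x-functions-key": functions_key},
    )


def resend(report_id: str, receiver: str, functions_key: str) -> int:
    """Send the request and return the HTTP status code."""
    request = build_requeue_request(report_id, receiver, functions_key)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping request construction separate from sending makes it possible to log or review the exact URL before anything is re-sent.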

jimduff-usds commented 1 year ago

A summary of our current working proposal for REPROCESS:

Simplifying slightly, there were two main tools required:

  1. A tool to look at the data, without having to download it as a file.
  2. Then, some sort of ability to pick and choose “rows” in the data to re-send, then click “go” to resend them.

An important requirement is for resends to happen within the framework of the normal lineage tracking, so that we have a record that a particular item was in fact sent, albeit not on the first try.

The above process meets use cases where the problem is ReportStream’s fault (e.g., the ‘process’ step fails, or we had an erroneous filter): we fix the problem, then do the resend. The above process presumes that if the incoming data itself must be corrected, that’s our customer’s job to fix, not ours.

An MVP version of the above steps would be to only show database data, and not actual file data. Then (1) and (2) could be accomplished by only looking at the rows in the covid_result_metadata or other metadata tables (that is, by only looking at the non-PII) - then this work could be greatly simplified.
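
The "pick and choose rows" part of the MVP could look roughly like the sketch below. It assumes the metadata rows have already been read into memory; the `MetadataRow` fields and the function name are hypothetical and are not the actual columns of covid_result_metadata.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class MetadataRow:
    # Hypothetical, non-PII fields; the real metadata tables may differ.
    report_id: str
    message_id: str
    sender_id: str


def select_for_resend(rows: Iterable[MetadataRow],
                      should_resend: Callable[[MetadataRow], bool]) -> Dict[str, List[str]]:
    """Group the message ids chosen for resend by the report they came from."""
    selected: Dict[str, List[str]] = defaultdict(list)
    for row in rows:
        if should_resend(row):
            selected[row.report_id].append(row.message_id)
    return dict(selected)
```

Grouping by report id keeps each selection tied to its original report, which is the kind of link the lineage tracking requirement above asks for.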

jimduff-usds commented 1 year ago

A further possible simplification: The receive step takes a routeTo=org.receiver URL parameter, which basically only allows the data in that submission to go to one receiver.

Instead of us creating a complicated Chooser UI, the reprocess feature could simply allow O&O to pick the one receiver they want to "fix", and use the routeTo feature to ensure the data goes only to that one receiver.
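
A reprocess tool could attach the parameter when it resubmits. The sketch below only shows how a routeTo=org.receiver query parameter might be added to a submission URL; the helper name is hypothetical, the URL in the usage note is a placeholder rather than the real endpoint, and the only detail taken from this thread is the parameter name and its org.receiver form.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_route_to(submit_url: str, receiver: str) -> str:
    """Return submit_url with routeTo set to a single receiver in org.receiver form."""
    org, dot, name = receiver.partition(".")
    if not (org and dot and name):
        raise ValueError(f"expected org.receiver, got {receiver!r}")

    parts = urlsplit(submit_url)
    # Drop any routeTo already present so exactly one receiver is targeted.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "routeTo"]
    query.append(("routeTo", receiver))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For instance, `with_route_to("https://example.invalid/api/reports", "nm-doh.elr")` returns the same URL with `?routeTo=nm-doh.elr` appended.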

The routeTo works the way you'd expect it to: Example:

brick-green commented 1 year ago

Oh good call out on the routeTo option. Being able to select the receiver would need to be part of the MVP.

brandonnava commented 1 year ago

@jimduff-usds move to platform board?

TomNUSDS commented 1 year ago

Related: #6506