VRO 1.0 Database design

yoomlam commented 2 years ago

Issue to discuss DB entities for VRO 1.0.

There should be no PII stored in the DB. The ATO must be updated and re-approved for VRO to store PII.
All records must have a unique uuid primary key that is auto-populated when a new record is created.
All records have an auto-populated created_at field value.
Consider including an updated_at field to reflect when the record was last changed.
Determine the availability of AWS RDS resources in the Lighthouse DI and how to go about requesting them.

Below are DB tables (representing relevant entities) and their DB columns.

Claim

id: claim identifier used by client
id_type: domain of the id, e.g. "va.gov-Form526Submission" (Form526Submission table in va.gov's DB)
- for future versions, expect "VBMS" (a.k.a. "Benefit Claim ID")
incoming_status: only expect a "submission" value for VRO 1.0
- for future versions, expect established (for claim establishment notifications) and contention_mod (for contention update notifications)
updated_at: e.g., to capture when incoming_status changed
veteran_uuid: foreign key to Veteran table

Veteran

Expecting only 1 veteran per claim

icn (unique): Internal Control Number; needed for queries to Lighthouse Health API
participant_id: common identifier used by BGS
- BGS uses the participant_id also for non-veteran claimants (e.g., spouse or child) but VRO only needs the veteran's identifier
(maybe) edipi: identifier used by DoD
any other common identifiers for a veteran

Contention

A claim can have multiple contentions

claim_uuid: foreign key to Claim table
diagnostic_code: whole number in the thousands; https://www.hillandponton.com/read-va-diagnostic-code/

AssessmentResult

A new record is created each time the assess_claim endpoint is called and a non-error response returned.

claim_uuid: foreign key to Claim table
contention_uuid: foreign key to Contention table
evidence_count: number of evidence data points found to support fast tracking the claim

EvidenceSummaryDocument

A new record is created each time the generate_summary_doc endpoint is called and a non-error response returned.

claim_uuid: foreign key to Claim table
contention_uuid: foreign key to Contention table
evidence_count: number of evidence data points found to support fast tracking the claim
document_name: filename of document
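
To make the entity list above easier to discuss, here is a minimal sketch of it as SQL DDL, run against an in-memory SQLite database. The column names come from the list; the SQL types, NOT NULL/UNIQUE constraints, snake_case table names, and the helper names `create_db` and `new_uuid` are assumptions for illustration only. SQLite has no native UUID type, so this sketch generates UUIDs in application code, while `created_at` is auto-populated by a column default.

```python
import sqlite3
import uuid

# Column names follow the entity list above. Types, constraints, and
# table names are assumptions of this sketch, not a decided design.
SCHEMA = """
CREATE TABLE veteran (
    uuid           TEXT PRIMARY KEY,
    icn            TEXT NOT NULL UNIQUE,
    participant_id TEXT,
    created_at     TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE claim (
    uuid            TEXT PRIMARY KEY,
    id              TEXT NOT NULL,
    id_type         TEXT NOT NULL,
    incoming_status TEXT NOT NULL DEFAULT 'submission',
    veteran_uuid    TEXT NOT NULL REFERENCES veteran(uuid),
    created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (id, id_type)
);
CREATE TABLE contention (
    uuid            TEXT PRIMARY KEY,
    claim_uuid      TEXT NOT NULL REFERENCES claim(uuid),
    diagnostic_code INTEGER NOT NULL,
    created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE assessment_result (
    uuid            TEXT PRIMARY KEY,
    claim_uuid      TEXT NOT NULL REFERENCES claim(uuid),
    contention_uuid TEXT NOT NULL REFERENCES contention(uuid),
    evidence_count  INTEGER NOT NULL,
    created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE evidence_summary_document (
    uuid            TEXT PRIMARY KEY,
    claim_uuid      TEXT NOT NULL REFERENCES claim(uuid),
    contention_uuid TEXT NOT NULL REFERENCES contention(uuid),
    evidence_count  INTEGER NOT NULL,
    document_name   TEXT NOT NULL,
    created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def new_uuid():
    """Generate a primary key value in application code."""
    return str(uuid.uuid4())


def create_db():
    """Create an in-memory database with the sketched tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn
```

None of the columns hold names, addresses, or other free-text personal data, in line with the no-PII constraint; whether the identifiers themselves are acceptable under the ATO is a question this sketch does not answer.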

dimitri-amida commented 2 years ago

Requirements questions:

Is claim_uuid redundant in "AssessmentResult" and "EvidenceSummaryDocument"? Each already has a contention_uuid, and the Contention is linked to the claim.
Does the claim, when it reaches the service, already contain assessments and evidence, or are these added later on?
If a claim arrives for a veteran that's already in the DB, I assume we attach the claim to the existing record

Technical questions:

What schema do we use? Same as the rest of the app, or a schema specific to this module?
How is this module called from services?
What is the format of the input to this module?
Does this module need to provide an API for accessing the data? If so, is it REST?

yoomlam commented 2 years ago

Requirements questions:

Is claim_uuid redundant in "AssessmentResult" and "EvidenceSummaryDocument"? Each already has a contention_uuid, and the Contention is linked to the claim.

It is. Here's my reasoning, but feel free to remove claim_uuid: since a Claim is a primary entity being passed around at the VA, many things are associated with a Claim rather than a Contention. I included claim_uuid for faster query performance, e.g., if VRO ever wants to query all AssessmentResult records associated with a Claim. This may be premature optimization, so you can remove it and we can always add it later if needed.
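
To illustrate the trade-off, here is a small sketch of the two ways to fetch all AssessmentResult records for a Claim: joining through Contention, or filtering on the redundant claim_uuid column. The table layout, sample rows, and function names are assumptions for illustration; the sketch shows only that both queries return the same rows and makes no claim about actual performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contention (
    uuid       TEXT PRIMARY KEY,
    claim_uuid TEXT NOT NULL
);
CREATE TABLE assessment_result (
    uuid            TEXT PRIMARY KEY,
    contention_uuid TEXT NOT NULL REFERENCES contention(uuid),
    claim_uuid      TEXT NOT NULL,  -- the redundant column under discussion
    evidence_count  INTEGER NOT NULL
);
INSERT INTO contention VALUES ('ct-1', 'cl-1'), ('ct-2', 'cl-1'), ('ct-3', 'cl-2');
INSERT INTO assessment_result VALUES
    ('ar-1', 'ct-1', 'cl-1', 3),
    ('ar-2', 'ct-2', 'cl-1', 0),
    ('ar-3', 'ct-3', 'cl-2', 5);
""")


def results_via_join(conn, claim_uuid):
    """Without the redundant column: reach the claim through Contention."""
    rows = conn.execute(
        """SELECT ar.uuid
           FROM assessment_result ar
           JOIN contention c ON c.uuid = ar.contention_uuid
           WHERE c.claim_uuid = ?
           ORDER BY ar.uuid""",
        (claim_uuid,),
    )
    return [r[0] for r in rows]


def results_via_column(conn, claim_uuid):
    """With the redundant column: a single-table filter, no join."""
    rows = conn.execute(
        "SELECT uuid FROM assessment_result WHERE claim_uuid = ? ORDER BY uuid",
        (claim_uuid,),
    )
    return [r[0] for r in rows]
```

The cost of the shortcut is that claim_uuid on AssessmentResult must always agree with the claim_uuid of its Contention, which the writer has to guarantee.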

Does the claim, when it reaches the service, already contain assessments and evidence, or are these added later on?

Great question! Somewhat complex answer:

For VRO version 1.0, when RRD (from va.gov) submits a claim to VRO, the claim typically will not have previous assessments or evidence. However, if RRD resubmits a claim to VRO for re-processing (due to some RRD failure, the RRD retry mechanism, or simply wanting an updated assessment), then there may be previous assessments or evidence.
For later versions of VRO, VRO will be notified about changes to Contentions, in which case there is a possibility that a Claim may have previous assessments or evidence due to a different previous Contention on the same Claim.

If a claim arrives for a veteran that's already in the DB, I assume we attach the claim to the existing record

Yes! Veterans can, and often do, submit multiple claims.

Technical questions:

What schema do we use? Same as the rest of the app, or a schema specific to this module?

Not sure I understand where you're coming from. I was expecting a single schema be used for all of VRO. I'm not familiar with "a schema specific to this module" and what that implies.

How is this module called from services?

What is the format of the input to this module?

Does this module need to provide an API for accessing the data? If so, is it REST?

To be discussed and decided by the microservices developers, who will be reading and writing to the DB.

yoomlam commented 2 years ago

I'd like to present another option for the team's consideration. Instead of a REST interface to the DB (may be more overhead than needed since only VRO will be accessing the DB) or a language-specific DB driver (we'd have to maintain a DB driver for each language that a microservice is written in), we can leverage Apache Camel (such as Camel's JDBC component) if certain constraints are satisfied, for example if microservices don't need to read/write to the DB except at the beginning (input) or end (output) of their processing. An advantage of this option is that microservices don't need to directly access the DB, and hence less coupling, which is good when/if we change DB implementation.

If desired, we can discuss more about this option at our meeting with microservices developers.

dimitri-amida commented 2 years ago

Here are some additional thoughts

I don't think that a bare call to a DB driver, such as an INSERT, is going to meet the requirements. There is some business logic involved in this process, albeit minimal: if a veteran exists, the claim must be attached to them; if a claim exists, it must be updated with additional contentions, etc.
Using Camel allows some flexibility in how this module can be invoked, so we can support several ways. If it is called from a Java service, it can be a direct (same JVM) async call. If it is called by a Python service, it can be via a queue.
If we want to maintain a history of updates to every claim, then the current database design is insufficient, since it only shows the latest version. The "last updated date" may tell us that there have been updates, but we may not be able to tell what they were.
Should we save some identifier of the service calling save-to-db in order to be able to tell where it came from?
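
The "minimal business logic" from the first point can be sketched as a get-or-create routine. This is only an illustration under assumptions: the function name `save_claim`, the reduced table layout, the choice to look up veterans by icn and claims by (id, id_type), and the rule that a claim holds at most one contention per diagnostic code are all hypothetical, and the diagnostic codes in the usage note are just sample values.

```python
import sqlite3
import uuid


def open_db():
    """In-memory database with only the columns this sketch needs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE veteran (
        uuid TEXT PRIMARY KEY,
        icn  TEXT NOT NULL UNIQUE
    );
    CREATE TABLE claim (
        uuid         TEXT PRIMARY KEY,
        id           TEXT NOT NULL,
        id_type      TEXT NOT NULL,
        veteran_uuid TEXT NOT NULL REFERENCES veteran(uuid),
        UNIQUE (id, id_type)
    );
    CREATE TABLE contention (
        uuid            TEXT PRIMARY KEY,
        claim_uuid      TEXT NOT NULL REFERENCES claim(uuid),
        diagnostic_code INTEGER NOT NULL,
        UNIQUE (claim_uuid, diagnostic_code)  -- assumption of this sketch
    );
    """)
    return conn


def save_claim(conn, icn, claim_id, id_type, diagnostic_codes):
    """Attach the claim to an existing veteran, or create both as needed."""
    with conn:  # one transaction: commit on success, roll back on error
        row = conn.execute(
            "SELECT uuid FROM veteran WHERE icn = ?", (icn,)
        ).fetchone()
        if row:
            veteran_uuid = row[0]
        else:
            veteran_uuid = str(uuid.uuid4())
            conn.execute(
                "INSERT INTO veteran (uuid, icn) VALUES (?, ?)",
                (veteran_uuid, icn),
            )

        row = conn.execute(
            "SELECT uuid FROM claim WHERE id = ? AND id_type = ?",
            (claim_id, id_type),
        ).fetchone()
        if row:
            claim_uuid = row[0]
        else:
            claim_uuid = str(uuid.uuid4())
            conn.execute(
                "INSERT INTO claim (uuid, id, id_type, veteran_uuid) "
                "VALUES (?, ?, ?, ?)",
                (claim_uuid, claim_id, id_type, veteran_uuid),
            )

        # Add only contentions the claim does not already have.
        for code in diagnostic_codes:
            conn.execute(
                "INSERT OR IGNORE INTO contention "
                "(uuid, claim_uuid, diagnostic_code) VALUES (?, ?, ?)",
                (str(uuid.uuid4()), claim_uuid, code),
            )
    return claim_uuid
```

Calling `save_claim(conn, "icn-1", "100", "va.gov-Form526Submission", [7101])` and later calling it again for the same claim with `[7101, 6602]` would leave one veteran, one claim, and two contentions.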

yoomlam commented 2 years ago

If we want to maintain a history of updates to every claim, then the current database design is insufficient, since it only shows the latest version. The "last updated date" may tell us that there have been updates, but we may not be able to tell what they were.

Agree! Auditing is not required for VRO 1.0, but it will be useful for reporting as described in the Roadmap wiki page:

claim_stats_Endpoint (low priority): Add claim_stats endpoint to API for reporting and monitoring

Maybe have a separate table called something like ClaimEvents with an event_type field (with possible values "new_claim", "new_contention", "contention_updated", etc.). A new record is created when the assess_claim endpoint is called, so I expect this table will become large, which is okay since it will only be for diagnostics and reporting.
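
As a rough sketch of what such a table could look like: an append-only table where each change adds a row and nothing is updated in place. The table name `claim_event`, its columns beyond event_type, and the helper names `record_event` and `event_counts` are assumptions for illustration.

```python
import sqlite3
import uuid


def open_db():
    """In-memory database holding only the hypothetical event table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE claim_event (
            uuid       TEXT PRIMARY KEY,
            claim_uuid TEXT NOT NULL,
            event_type TEXT NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )""")
    return conn


def record_event(conn, claim_uuid, event_type):
    """Append one event row; existing rows are never modified."""
    with conn:
        conn.execute(
            "INSERT INTO claim_event (uuid, claim_uuid, event_type) "
            "VALUES (?, ?, ?)",
            (str(uuid.uuid4()), claim_uuid, event_type),
        )


def event_counts(conn, claim_uuid):
    """Per-type event counts for one claim, e.g. for reporting."""
    rows = conn.execute(
        "SELECT event_type, COUNT(*) FROM claim_event "
        "WHERE claim_uuid = ? GROUP BY event_type",
        (claim_uuid,),
    )
    return dict(rows.fetchall())
```

A reporting endpoint such as claim_stats could then aggregate over this table without touching the Claim or Contention tables.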

Should we save some identifier of the service calling save-to-db in order to be able to tell where it came from?

If I'm interpreting this properly, I'm not sure we need that level of granularity stored in the DB. Couldn't we get this from regular logs, i.e. the save-to-db logs would occur immediately after the service logs?

yoomlam commented 2 years ago

We've decided to move forward with using Camel. A new workflow step will read from the DB (and possibly write to it, if the record doesn't already exist) and inject the data into the input of a contention processor (e.g., the hypertension processor). This decouples the processor microservice from having to interact with the DB. The output of the microservice will include some JSON fields that result in a record (e.g., AssessmentResult) being written to the DB, so there will be another new workflow step after each processor to write to the DB. I expect these new DB interaction steps to be added in the routes defined in ClaimProcessorRoute and PrimaryRoutes.
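
The shape of that workflow can be sketched in a few plain functions. This is not Camel code and not the actual route definitions; it only illustrates the ordering of the steps (DB read, processor, DB write) and that the processor never touches the DB. All function names, the dict-based stand-in for the DB, and the `bp_readings` field are hypothetical.

```python
def load_claim_step(db, message):
    """Pre-processor step: look up the claim record, creating it if it
    doesn't exist yet, and inject it into the processor's input."""
    key = (message["claim_id"], message["id_type"])
    record = db["claims"].setdefault(
        key, {"claim_id": message["claim_id"], "id_type": message["id_type"]}
    )
    return {**message, "claim_record": record}


def hypertension_processor(message):
    """Stand-in for a processor microservice: input in, JSON-like output
    out, with no DB access of its own."""
    readings = message.get("bp_readings", [])
    return {
        "claim_id": message["claim_id"],
        "assessment_result": {"evidence_count": len(readings)},
    }


def save_result_step(db, output):
    """Post-processor step: persist the fields the processor returned."""
    result = output.get("assessment_result")
    if result is not None:
        db["assessment_results"].append(
            {"claim_id": output["claim_id"], **result}
        )
    return output


def run_route(db, message, processor):
    """Chain the three steps in the order described above."""
    return save_result_step(db, processor(load_claim_step(db, message)))
```

Because the processor only sees the message it is given, swapping the DB implementation would change the two step functions but not the processor.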

department-of-veterans-affairs / abd-vro