department-of-veterans-affairs / caseflow

Caseflow is a web application that enables the tracking and processing of appealed claims at the Board of Veterans' Appeals.
Other
53 stars 18 forks source link

How should we handle unique identifiers for AMA and legacy appeals? #5295

Closed amprokop closed 6 years ago

amprokop commented 6 years ago

Context

We'll soon be creating the AMA data model, and we'll need a way to distinguish between AMA and legacy appeals. Currently, we use the vacols_id in lots of routes (e.g. Reader users navigate to a claims folder using /reader/appeal/:vacols_id). Before we introduce a new Appeal table, we'll want to decide on an approach to handle appeal lookups across tables and when using parameters in our HTTP routes.

Option 1: Join table for both AMA and legacy appeals with a unique ID

We'd create an AppealIds table that is essentially a join table to give us a unique id across brieff.bfkey and the AMA appeals table. We'd resolve to only use this join table to look up appeals.

Pro

We'd have unique IDs managed by Caseflow in our DB for both legacy and AMA appeals.

Con

We'd have to refactor all of our existing routes to use that id. We'd have to be disciplined and enforce only doing appeal lookups using our join table, and reads/writes would be slower since the join table would have to be maintained, and every appeal lookup would necessitate a join.

Option 2: Set an offset between the current brieff.bfkey and the first AMA appeal id to avoid collisions

There is a limited timeframe during which legacy appeals can be created and thus brieff.bfkey values assigned. We would pick a number large enough that it would never be assigned as a brieff.bfkey, and count up from there for AMA appeals. Confirmed this approach with Jed: (https://github.com/department-of-veterans-affairs/dsva-vacols/issues/1)

The SQL statement would look something like: ALTER SEQUENCE appeal_id RESTART WITH 7000000;

Pro

It would preserve backwards compatibility of all of our routes, and is potentially the least complex/most forgiving option. For instance, our LegacyAppeal class could recognize if you're querying for an id that is higher than the highest legacy appeal id, and throw an error/return an AMA appeal instead.

Also, we leave open the opportunity to teach VA employees how to distinguish between types of appeals by ID.

Con

Perhaps not the most elegant approach. If we were to choose the wrong AMA starting id and legacy ids/ama ids collided, it would be extremely difficult to unwind.

Option 3: Add "A" or "L" to the end of ids e.g. 343242Lor 34224A

This is essentially the approach VACOLS uses to distinguish between VBMS ids that are SSNs and C numbers.

Pro

Type of appeal is transparent from every URL.

Con

We'd have to refactor all existing routes. This creates a new source of bugs i.e. VBMS/VACOLS veteran id translation — we'll have to remember to strip the extra character/append it every time we're running queries or displaying ids. In the VBMS id context, this has caused a fair amount of confusion/bugs.

Option 4: Don't have unique ids, instead in all routes require an extra query param e.g. ?appeal_type=AMA

Pro

Type of appeal is transparent from every URL.

Con

Creates a new source of bugs. Makes every route with an appeal id less readable/more complex. We'll have to add query param logic everywhere. Related bugs could be really bad, as we'd be operating on the wrong appeal.

cmgiven commented 6 years ago

I think I lean toward option 1. I don't think 2 is necessarily bad, although I can imagine disaster scenarios that could result in much higher numbers being necessary than we would naively expect (CAVC throwing out RAMP, e.g.).

Mostly chiming in on option 3, though, to note that BFKEYs can already sometimes be appended with the following characters: B, D, L, N, P, R, W, X, Z. Suggest using only characters for this disambiguation that are not presently used as disposition codes (look at VACOLS::Case.DISPOSITIONS).

evankroske commented 6 years ago

Option 3.1: Add "AMA" to the end of AMA IDs

We would update our existing routes to distinguish between legacy and AMA IDs and query the corresponding DB.

Pro

Con

anyakhvost commented 6 years ago

I am in preference of option 4. To start with we can only require to pass an appeal_type if is an AMA appeal. If the appeal_type is missing, we will default to the Legacy appeal, this way we don't have to re-factor existing routes.

I think we need to distinguish between FE URLs (whatever a user is seeing) and BE routes.

BE

I only see vacols_id in certifications and queue routes when I run:

bundle exec rake routes | grep vacols_id

Can we re-factor those to use appeal_id?

FE

Not super familiar with FE, but can we re-factor those to use appeal_id instead of vacols_id?

mdbenjam commented 6 years ago

I like @evankroske's approach. It seems the easiest as there is no refactoring required. It also is very clear in the URL what is going on.

As for the other options: 1) A join table adds a decent amount of complexity. We'll need to do some refactoring to make it all work. And, who wants more joins??? 2) I'm okay with this, but feels magic number-y to me. 4) I think a query parameter is super inelegant. All the information is no longer included in the appeal id. It's in two separate places: appeal id and query param. All of our controllers that deal with appeals will now need extra code to look for the query param.

We could also create a BaseAppeal model that Legacy and AMA appeals could inherit from. This base model could have a create_appeal method that determines which of the two to initialize based on the presence of 'AMA' in the appeal_id. They both can implement the set of methods they need for existing applications like Reader and Queue to work, but it should smooth the transition and make this change transparent to applications.

@aroltsch reader definitely uses vacols_id, it calls it appeal_id 🤷‍♂️ . Not sure if there are other places as well, but grep-ing for vacols_id doesn't guarantee finding them all.

amprokop commented 6 years ago

Leaving a few considerations to try to frame this conversation:

Minimize likelihood that an action is taken on the wrong appeal (super important)

Nightmare scenario: Let's say a user has to close an appeals. Due to a bug in our code, or a confusing ID scheme, they close the wrong one. The Veteran never gets notified of this, but waits for years before contacting VA. This is a really bad outcome. We should minimize the chance that this ever happens.

Prioritize a future when legacy appeals gracefully cease to exist

In just a few years, there will be many more AMA appeals than legacy appeals. It'd be preferable to ensure that AMA naming / lookups are no less simple and clear than legacy appeals.

Avoid breaking existing links/routes

Two reasons, neither critical— it will be substantial to refactor all our routes, but I don't have a strong sense that users rely on bookmarks a whole lot. The one exception is the Interactive Decision Template, which is now in use and links straight to Reader.

shanear commented 6 years ago

So options 2 and 3.1 all involve leaving legacy appeals as is, and creating a new set of identifiers for AMA Appeals that are:

Any of these options would satisfy these three concerns:

Minimize likelihood that an action is taken on the wrong appeal (super important) Prioritize a future when legacy appeals gracefully cease to exist Avoid breaking existing links/routes

So in light of this, the debate between these options is between the format of the AMA Appeal identifier:

2) A magic starting number that's larger than the count of legacy appeals we expect to have (7 million?) This one is problematic for a couple reasons: 1) I don't love that we can't easily distiguish AMA ids from legacy ids, 2) While unlikely, if the legacy appeal count were to somehow eclipse our chosen number, it would be a catastrophic Y2K style problem for us.

3.1) Starting with 1 and appending 'AMA' to the end of ids I don't see any significant downsides to this option. I think it suits our needs and is relatively simple.

2.1) (new!) UUIDs UUIDs are a widely supported concept and industry standard for identifiers. It allows for the creation of appeals in a distributed way (should we want to do that).

So like 3.1 it satisfies all our needs, while leaves us with more options with regard to smoother distributed creation of appeals. The fact that the UUID is not easily written on a sticky note might also be a feature not a bug in the sense that it would incentivize users to share appeals via links instead of with sticky notes, which is error prone no matter what kind of ID you use.

anyakhvost commented 6 years ago

@amprokop, @mdbenjam and I thought of a new approach.

Database changes

  1. Appeals table will be used for both AMA and Legacy appeals, and will keep its current name.
  2. We will move the special issues columns (into its own separate table special_issues
  3. We will introduce a new column to the appeals table: type (Legacy, Ama)
  4. Column vacols_id will stay in the appeals table and will only be used by the Legacy appeals to query VACOLS::Case table. We will remove the non-null constraint.
  5. We will add additional Ama columns to the appeals table.

Model changes:

  1. Appeal.rb will be the new AMA appeal model
  2. LegacyAppeal.rb will be the old appeal model

Notes

mdbenjam commented 6 years ago

Alex and I were just discussing the type field. We think this should just be docket_type with options like (Legacy | Hearing Docket | Evidence Docket | No Evidence Docket).

Another thing: https://www.postgresql.org/docs/9.0/static/indexes-unique.html. We can leave the uniqueness constraint, but remove the not-null constraint.

How to migrate Reader URLs: 1) Create a new URL: reader/appeal_id/:appeal_id which uses the appeal_id instead of the vacols_id. 2) Redirect the current route: reader/appeal/:vacols_id to the new route reader/appeal_id/:appeal_id

amprokop commented 6 years ago

@shanear @oscarAtNava @evankroske @sharonwarner @sunil-sadasivan , when convenient would you mind taking a look at the approach Anya detailed?

In our discussion, I/we overlooked the fact that the majority of the complexity/legacy logic is exclusively in the model layer. The appeals db table wouldn't be complex at all, if we move the long list of special issues columns:

"rice_compliance"
"private_attorney_or_agent"
"waiver_of_overpayment"
"pension_united_states"
"vamc"
"incarcerated_veterans"
"dic_death_or_accrued_benefits_united_states"
"vocational_rehab"
"foreign_claim_compensation_claims_dual_claims_appeals"
"manlincon_compliance"
"hearing_including_travel_board_video_conference"
"home_loan_guaranty"
"insurance"
"national_cemetery_administration"
"spina_bifida"
"radiation"
"nonrating_issue"
"us_territory_claim_philippines"
"contaminated_water_at_camp_lejeune"
"mustard_gas"
"education_gi_bill_dependents_educational_assistance_scholars"
"foreign_pension_dic_all_other_foreign_countries"
"foreign_pension_dic_mexico_central_and_south_america_caribb"
"us_territory_claim_american_samoa_guam_northern_mariana_isla"
"us_territory_claim_puerto_rico_and_virgin_islands"

After moving those, there are only five fields left, vacols_id, vbms_id, dispatched_to_station, appeal_series_id, and issues_pulled.

More on the other columns:

It seems doable to fork the model layer using Rails polymorphism (shout out to @aroltsch, who mentioned this in the meeting itself but who I/possibly others didn't fully understand)— but maintain only one appeals table. I think this would be a lot simpler than any approach mentioned before it in the long run, and it would accomplish all three of our goals of ensuring actions aren't taken on the wrong appeal, not breaking existing routes, and prioritizing a future when legacy appeals cease to exist.

shanear commented 6 years ago

A huge benefit to using the same appeals table is to preserve current Caseflow Database relations. For example, there are a lot of tables that reference appeal_id.

I suppose this is a small benefit in so far as it reduces association complexity, but we've accounted for this in our migration plan. We can use a polymorphic association which is widely used within caseflow already and abstracts away the complexity.

There are benefits to using polymorphic associations for records that can belong to a legacy appeal or ama appeal. You don't have to load up the appeal to know whether the record is dealing with a legacy appeal or ama appeal. You also can tell out right whether a record only deals with ama appeals, or can deal with either.

Alex and I were just discussing the type field.

I think we'd probably want to use the type field, since it plays nicely with Rails Single Table Inheritance, which we'd be using for this sort of approach (we use this for Intake too).

We don't anticipate adding too many columns directly to the Appeals table

Very much disagree here. It's my understanding that for AMA Appeals, Caseflow will be storing a lot of the data that VACOLS used to store. There will be a lot of AMA related columns added to this table, none of which will be used by legacy appeals.

In addition to that, there will likely be more objects (hearings, issues, appellant, etc) where the AMA record will be in Caseflow/BGS and the legacy record will be in VACOLS.

We will move special issues columns into its own separate table special_issues

This adds a lot of difficulty to the migration, because were creating an entirely new class and association, instead of just renaming a class. It also forces us rewrite significant parts of the dispatch back end. I don't think anyone wants that.

I think I'm still leaning towards the plan for them to be two separate tables. The Sierra team is willing to take on the work to do this.

evankroske commented 6 years ago

@amprokop @mdbenjam @aroltsch, I don't understand the benefit of the new proposal. I don't think that using a single table for both types of appeals would make life easier for the Office of Performance Analysis and Integrity or for the Caseflow team. I think that keeping the tables separate would keep our code simple and make analysis easy for PA&I. Could you help me understand the benefits of the new proposal?

oscarAtNava commented 6 years ago

It is not a good practice to have a table with 50% of the columns null based on what type of info is inserted. I think Shane's polymorphic association suggestion is spot-on. I think what will help come to an optimal DB design is to exercise all known query/update paths against each proposed design. I have not seen a single picture yet, and it is hard to just talk about models without some examples. My two cents...

shanear commented 6 years ago

Additionally, there would be many of the associated tables that may not even need a polymorphic association, because they are only relevant for legacy appeals. I looked through all of the tables that currently have an association to appeal.

Reader (claims_folder_searches, appeal_views) - really easy to convert Appeals Status API (appeal_series) - not relevant for AMA Dispatch (tasks) - most likely not relevant for AMA Hearings (hearings, hearing_appeal_stream_snapshots, worksheet_issues) - Hopelessly intertwined with VACOLS. This will be hard to port to AMA either way.

sansb commented 6 years ago

Tried to summarize the things I'm seeing here:

Problem statement

We are trying to decide how to model legacy and “new” appeals. Caseflow needs to both integrate with legacy appeals management systems (VACOLS) as well as provide a self-sufficient appeals management system to fulfill the requirements of the AMA.

Current state

We currently only model legacy appeals. We do this in storing a vacols_id that points to the appeal in VACOLS, a vbms_id that points to the appeal in VBMS, and a set of boolean values called “special_issues”. We use a concern called AssociatedVacolsModel in order to expose all of the VACOLS fields we need as attributes on our Appeal. In general, Caseflow sits at a cross-roads of “legacy” and “new” objects. Caseflow models legacy hearings, appeals, etc, and will ultimately have to model its own “new” versions of those things as those legacy systems are subsumed. Open question: How are other objects with legacy/new divide being modeled currently?

Cashflow-Intake is currently blocked on implementing intakes for Board appeals by this decision because those intakes will generate “new” appeals records. There is scheduled to be a pilot for creating those appeals with Caseflow in July, so there is some time sensitivity around this decision.

This proposal is not about

  1. Unique identifiers for use in routes: This will be a concern, but at this very moment we are not trying to figure out how to do this. Either of the proposed solutions present us with options for how to uniquely identify appeals across the legacy-new boundary for use in routes, and we should make another, separate decision for how to do that. It is reasonable to argue that one solution has preferable options.
  2. Clients (JSON representations, frontend modeling): Because both proposed solutions present options for polymorphic representation of appeals, and Rails provides means for intermingling polymorphic objects in JSON response, AND React-Redux only cares about JSON responses, we should leave discussion of how to organize our clients for later.

Constraints

  1. A good solution should minimize potential for development of “new” appeals to adversely affect any of our current functionality around legacy appeals.
  2. A good solution should maximize ease of eventually removing all traces of legacy appeals from Caseflow.

Proposed Solution 1 (Two tables with polymorphic association)

Appeals would be modeled as two separate tables: appeals and legacy_appeals. This means we would then have two ActiveRecord Models: Appeals and LegacyAppeals. Other ActiveRecord Models that required an association with either of these appeal models would use polymorphic association. Here’s an example of what that would look like:

class Appeal < ApplicationRecord
  has_many :appeal_views, as: :appeal, dependent: destroy
end

class LegacyAppeal < ApplicationRecord
  has_many :appeal_views, as: :appeal, dependent: destroy
end

class AppealView < ApplicationRecord
  belongs_to :appeal, polymorphic: true

  appeal_id = 123
  appeal_type = "Appeal" or "LegacyAppeal" 
end

 a = LegacyAppeal.find(203)
a.appeal_views.create!
# now there would exist an AppealView with appeal_id == 203 and appeal_type == LegacyAppeal

This could be done via a downtime migration or with no downtime via Shane’s plan.

Benefits of this solution:

Downsides of this solution:

Proposed Solution 2 (Single table inheritance)

Appeals would be modeled as a single table. Here’s an example of what this would look like:

# schema.rb
create_table "appeals" do |t|
  ...
  t.string "type"
  ...
end

# models/appeal_base.rb
class AppealBase < ApplicationRecord
  # Any shared methods
end

# models/appeal.rb
class Appeal < AppealBase
  # AMA Appeal-specific implementation
end

# models/legacy_appeal.rb
class LegacyAppeal < AppealBase
  # Legacy Appeal-specific implementation
end

“Type” would be either Legacy or Ama.

Benefits of this solution:

Downsides of this solution

Conclusion

This question reduces to a fairly well-trod one in Rails app design. Typically one should choose STI when two classes are essentially the same but with some variations at their leaves, and one should choose polymorphism when you have two classes that one expects to share some associations. So regardless of the various benefits/downsides described above, I believe choosing the best solution here requires us to think about the problem domain. How similar are Appeals and Legacy Appeals? To what extent do they share responsibilities and behaviors? My current best understanding is that Appeals and LegacyAppeals are different domains, and so I think that using polymorphism is the better solution.

amprokop commented 6 years ago

Closing — we discussed this extensively and merged a solution into master.