How should we handle unique identifiers for AMA and legacy appeals?

amprokop commented 6 years ago

Context

We'll soon be creating the AMA data model, and we'll need a way to distinguish between AMA and legacy appeals. Currently, we use the vacols_id in lots of routes (e.g. Reader users navigate to a claims folder using /reader/appeal/:vacols_id). Before we introduce a new Appeal table, we'll want to decide on an approach to handle appeal lookups across tables and when using parameters in our HTTP routes.

Option 1: Join table for both AMA and legacy appeals with a unique ID

We'd create an AppealIds table that is essentially a join table to give us a unique id across brieff.bfkey and the AMA appeals table. We'd resolve to only use this join table to look up appeals.

Pro

We'd have unique IDs managed by Caseflow in our DB for both legacy and AMA appeals.

Con

We'd have to refactor all of our existing routes to use that id. We'd have to be disciplined and enforce only doing appeal lookups using our join table, and reads/writes would be slower since the join table would have to be maintained, and every appeal lookup would necessitate a join.

Option 2: Set an offset between the current `brieff.bfkey` and the first AMA appeal id to avoid collisions

There is a limited timeframe during which legacy appeals can be created and thus brieff.bfkey values assigned. We would pick a number large enough that it would never be assigned as a brieff.bfkey, and count up from there for AMA appeals. Confirmed this approach with Jed: (https://github.com/department-of-veterans-affairs/dsva-vacols/issues/1)

The SQL statement would look something like: ALTER SEQUENCE appeal_id RESTART WITH 7000000;

Pro

It would preserve backwards compatibility of all of our routes, and is potentially the least complex/most forgiving option. For instance, our LegacyAppeal class could recognize if you're querying for an id that is higher than the highest legacy appeal id, and throw an error/return an AMA appeal instead.

Also, we leave open the opportunity to teach VA employees how to distinguish between types of appeals by ID.

Con

Perhaps not the most elegant approach. If we were to choose the wrong AMA starting id and legacy ids/ama ids collided, it would be extremely difficult to unwind.

Option 3: Add "A" or "L" to the end of ids e.g. `343242L`or `34224A`

This is essentially the approach VACOLS uses to distinguish between VBMS ids that are SSNs and C numbers.

Pro

Type of appeal is transparent from every URL.

Con

We'd have to refactor all existing routes. This creates a new source of bugs i.e. VBMS/VACOLS veteran id translation — we'll have to remember to strip the extra character/append it every time we're running queries or displaying ids. In the VBMS id context, this has caused a fair amount of confusion/bugs.

Option 4: Don't have unique ids, instead in all routes require an extra query param e.g. `?appeal_type=AMA`

Pro

Type of appeal is transparent from every URL.

Con

Creates a new source of bugs. Makes every route with an appeal id less readable/more complex. We'll have to add query param logic everywhere. Related bugs could be really bad, as we'd be operating on the wrong appeal.

cmgiven commented 6 years ago

I think I lean toward option 1. I don't think 2 is necessarily bad, although I can imagine disaster scenarios that could result in much higher numbers being necessary than we would naively expect (CAVC throwing out RAMP, e.g.).

Mostly chiming in on option 3, though, to note that BFKEYs can already sometimes be appended with the following characters: B, D, L, N, P, R, W, X, Z. Suggest using only characters for this disambiguation that are not presently used as disposition codes (look at VACOLS::Case.DISPOSITIONS).

evankroske commented 6 years ago

Option 3.1: Add "AMA" to the end of AMA IDs

We would update our existing routes to distinguish between legacy and AMA IDs and query the corresponding DB.

Pro

Simple.
Very easy to implement.
Could be adopted incrementally, route by route.
Preserves existing Caseflow URLs. If a user bookmarked /appeal/1234, that bookmark would still be valid without any extra work.
No additional DB queries.
Type of appeal is apparent from URL.

Con

Makes our IDs ugly?

anyakhvost commented 6 years ago

I am in preference of option 4. To start with we can only require to pass an appeal_type if is an AMA appeal. If the appeal_type is missing, we will default to the Legacy appeal, this way we don't have to re-factor existing routes.

I think we need to distinguish between FE URLs (whatever a user is seeing) and BE routes.

BE

I only see vacols_id in certifications and queue routes when I run:

bundle exec rake routes | grep vacols_id

Can we re-factor those to use appeal_id?

FE

Not super familiar with FE, but can we re-factor those to use appeal_id instead of vacols_id?

mdbenjam commented 6 years ago

I like @evankroske's approach. It seems the easiest as there is no refactoring required. It also is very clear in the URL what is going on.

As for the other options: 1) A join table adds a decent amount of complexity. We'll need to do some refactoring to make it all work. And, who wants more joins??? 2) I'm okay with this, but feels magic number-y to me. 4) I think a query parameter is super inelegant. All the information is no longer included in the appeal id. It's in two separate places: appeal id and query param. All of our controllers that deal with appeals will now need extra code to look for the query param.

We could also create a BaseAppeal model that Legacy and AMA appeals could inherit from. This base model could have a create_appeal method that determines which of the two to initialize based on the presence of 'AMA' in the appeal_id. They both can implement the set of methods they need for existing applications like Reader and Queue to work, but it should smooth the transition and make this change transparent to applications.

@aroltsch reader definitely uses vacols_id, it calls it appeal_id 🤷‍♂️ . Not sure if there are other places as well, but grep-ing for vacols_id doesn't guarantee finding them all.

amprokop commented 6 years ago

Leaving a few considerations to try to frame this conversation:

Minimize likelihood that an action is taken on the wrong appeal (super important)

Nightmare scenario: Let's say a user has to close an appeals. Due to a bug in our code, or a confusing ID scheme, they close the wrong one. The Veteran never gets notified of this, but waits for years before contacting VA. This is a really bad outcome. We should minimize the chance that this ever happens.

Prioritize a future when legacy appeals gracefully cease to exist

In just a few years, there will be many more AMA appeals than legacy appeals. It'd be preferable to ensure that AMA naming / lookups are no less simple and clear than legacy appeals.

Avoid breaking existing links/routes

Two reasons, neither critical— it will be substantial to refactor all our routes, but I don't have a strong sense that users rely on bookmarks a whole lot. The one exception is the Interactive Decision Template, which is now in use and links straight to Reader.

shanear commented 6 years ago

So options 2 and 3.1 all involve leaving legacy appeals as is, and creating a new set of identifiers for AMA Appeals that are:

ideally have a pattern that's distinguishable from VACOLS ids
guaranteed not to collide with legacy appeal VACOLS ids

Any of these options would satisfy these three concerns:

Minimize likelihood that an action is taken on the wrong appeal (super important) Prioritize a future when legacy appeals gracefully cease to exist Avoid breaking existing links/routes

So in light of this, the debate between these options is between the format of the AMA Appeal identifier:

2) A magic starting number that's larger than the count of legacy appeals we expect to have (7 million?) This one is problematic for a couple reasons: 1) I don't love that we can't easily distiguish AMA ids from legacy ids, 2) While unlikely, if the legacy appeal count were to somehow eclipse our chosen number, it would be a catastrophic Y2K style problem for us.

3.1) Starting with 1 and appending 'AMA' to the end of ids I don't see any significant downsides to this option. I think it suits our needs and is relatively simple.

2.1) (new!) UUIDs UUIDs are a widely supported concept and industry standard for identifiers. It allows for the creation of appeals in a distributed way (should we want to do that).

So like 3.1 it satisfies all our needs, while leaves us with more options with regard to smoother distributed creation of appeals. The fact that the UUID is not easily written on a sticky note might also be a feature not a bug in the sense that it would incentivize users to share appeals via links instead of with sticky notes, which is error prone no matter what kind of ID you use.

anyakhvost commented 6 years ago

@amprokop, @mdbenjam and I thought of a new approach.

Database changes

Appeals table will be used for both AMA and Legacy appeals, and will keep its current name.
We will move the special issues columns (into its own separate table special_issues
We will introduce a new column to the appeals table: type (Legacy, Ama)
Column vacols_id will stay in the appeals table and will only be used by the Legacy appeals to query VACOLS::Case table. We will remove the non-null constraint.
We will add additional Ama columns to the appeals table.

Model changes:

Appeal.rb will be the new AMA appeal model
LegacyAppeal.rb will be the old appeal model

Notes

A huge benefit to using the same appeals table is to preserve current Caseflow Database relations. For example, there are a lot of tables that reference appeal_id.
Both new Appeal.rb and old LegacyAppeal.rb will use the same table.
Reader: Mark has a solution on how to migrate Reader urls to using appeal_id instead of vacols_id
Certification: Alex thinks certification can continue using vacols_id because certification will not used by AMA
Queue: we will slowly migrate Queue routes to use appeal_id
We don't anticipate adding too many columns directly to the Appeals table

mdbenjam commented 6 years ago

Alex and I were just discussing the type field. We think this should just be docket_type with options like (Legacy | Hearing Docket | Evidence Docket | No Evidence Docket).

Another thing: https://www.postgresql.org/docs/9.0/static/indexes-unique.html. We can leave the uniqueness constraint, but remove the not-null constraint.

How to migrate Reader URLs: 1) Create a new URL: reader/appeal_id/:appeal_id which uses the appeal_id instead of the vacols_id. 2) Redirect the current route: reader/appeal/:vacols_id to the new route reader/appeal_id/:appeal_id

amprokop commented 6 years ago

@shanear @oscarAtNava @evankroske @sharonwarner @sunil-sadasivan , when convenient would you mind taking a look at the approach Anya detailed?

In our discussion, I/we overlooked the fact that the majority of the complexity/legacy logic is exclusively in the model layer. The appeals db table wouldn't be complex at all, if we move the long list of special issues columns:

"rice_compliance"
"private_attorney_or_agent"
"waiver_of_overpayment"
"pension_united_states"
"vamc"
"incarcerated_veterans"
"dic_death_or_accrued_benefits_united_states"
"vocational_rehab"
"foreign_claim_compensation_claims_dual_claims_appeals"
"manlincon_compliance"
"hearing_including_travel_board_video_conference"
"home_loan_guaranty"
"insurance"
"national_cemetery_administration"
"spina_bifida"
"radiation"
"nonrating_issue"
"us_territory_claim_philippines"
"contaminated_water_at_camp_lejeune"
"mustard_gas"
"education_gi_bill_dependents_educational_assistance_scholars"
"foreign_pension_dic_all_other_foreign_countries"
"foreign_pension_dic_mexico_central_and_south_america_caribb"
"us_territory_claim_american_samoa_guam_northern_mariana_isla"
"us_territory_claim_puerto_rico_and_virgin_islands"

After moving those, there are only five fields left, vacols_id, vbms_id, dispatched_to_station, appeal_series_id, and issues_pulled.

Problem statement

We are trying to decide how to model legacy and “new” appeals. Caseflow needs to both integrate with legacy appeals management systems (VACOLS) as well as provide a self-sufficient appeals management system to fulfill the requirements of the AMA.

Current state

We currently only model legacy appeals. We do this in storing a vacols_id that points to the appeal in VACOLS, a vbms_id that points to the appeal in VBMS, and a set of boolean values called “special_issues”. We use a concern called AssociatedVacolsModel in order to expose all of the VACOLS fields we need as attributes on our Appeal. In general, Caseflow sits at a cross-roads of “legacy” and “new” objects. Caseflow models legacy hearings, appeals, etc, and will ultimately have to model its own “new” versions of those things as those legacy systems are subsumed. Open question: How are other objects with legacy/new divide being modeled currently?

Cashflow-Intake is currently blocked on implementing intakes for Board appeals by this decision because those intakes will generate “new” appeals records. There is scheduled to be a pilot for creating those appeals with Caseflow in July, so there is some time sensitivity around this decision.

This proposal is not about

Unique identifiers for use in routes: This will be a concern, but at this very moment we are not trying to figure out how to do this. Either of the proposed solutions present us with options for how to uniquely identify appeals across the legacy-new boundary for use in routes, and we should make another, separate decision for how to do that. It is reasonable to argue that one solution has preferable options.
Clients (JSON representations, frontend modeling): Because both proposed solutions present options for polymorphic representation of appeals, and Rails provides means for intermingling polymorphic objects in JSON response, AND React-Redux only cares about JSON responses, we should leave discussion of how to organize our clients for later.

Constraints

A good solution should minimize potential for development of “new” appeals to adversely affect any of our current functionality around legacy appeals.
A good solution should maximize ease of eventually removing all traces of legacy appeals from Caseflow.

Proposed Solution 1 (Two tables with polymorphic association)

Appeals would be modeled as two separate tables: appeals and legacy_appeals. This means we would then have two ActiveRecord Models: Appeals and LegacyAppeals. Other ActiveRecord Models that required an association with either of these appeal models would use polymorphic association. Here’s an example of what that would look like:

class Appeal < ApplicationRecord
  has_many :appeal_views, as: :appeal, dependent: destroy
end

class LegacyAppeal < ApplicationRecord
  has_many :appeal_views, as: :appeal, dependent: destroy
end

class AppealView < ApplicationRecord
  belongs_to :appeal, polymorphic: true

  appeal_id = 123
  appeal_type = "Appeal" or "LegacyAppeal" 
end

 a = LegacyAppeal.find(203)
a.appeal_views.create!
# now there would exist an AppealView with appeal_id == 203 and appeal_type == LegacyAppeal

This could be done via a downtime migration or with no downtime via Shane’s plan.

Benefits of this solution:

After migration, new appeals are developed entirely indecently of legacy appeals. Less velocity drag from maintaining current feature set, less risk of introducing bugs into legacy appeals flows.
Fairly easy to remove legacy appeals when the time comes- LegacyAppeals.all.destroy, DROP TABLE legacy_appeals, rm app/models/legacy_appeals.rb

Downsides of this solution:

Requires a migration
Requires maintaining polymorphic interfaces where those are necessary (common implementation can be pushed to a base class, but tests should still cover both types)

Proposed Solution 2 (Single table inheritance)

Appeals would be modeled as a single table. Here’s an example of what this would look like:

# schema.rb
create_table "appeals" do |t|
  ...
  t.string "type"
  ...
end

# models/appeal_base.rb
class AppealBase < ApplicationRecord
  # Any shared methods
end

# models/appeal.rb
class Appeal < AppealBase
  # AMA Appeal-specific implementation
end

# models/legacy_appeal.rb
class LegacyAppeal < AppealBase
  # Legacy Appeal-specific implementation
end

“Type” would be either Legacy or Ama.

Benefits of this solution:

No migration necessary
Everything that works right now would continue to work as is
Unique identifiers for use in routes, etc are “free”

Downsides of this solution

There will be an unknown, possibly large number of null fields on each class.
- “Special issues” can be moved into a dependent object, but then there is a migration for this solution, which makes the migration downside of the other solution somewhat moot.

Conclusion

This question reduces to a fairly well-trod one in Rails app design. Typically one should choose STI when two classes are essentially the same but with some variations at their leaves, and one should choose polymorphism when you have two classes that one expects to share some associations. So regardless of the various benefits/downsides described above, I believe choosing the best solution here requires us to think about the problem domain. How similar are Appeals and Legacy Appeals? To what extent do they share responsibilities and behaviors? My current best understanding is that Appeals and LegacyAppeals are different domains, and so I think that using polymorphism is the better solution.

amprokop commented 6 years ago

Closing — we discussed this extensively and merged a solution into master.

department-of-veterans-affairs / caseflow