[Tech spec] Shared pagination for queue table views

lowellrex commented 5 years ago

Caseflow builds queue table views in a variety of different ways, but all of those queue table views share several core components.

Problem statement

Currently we define all of these components in the front-end code and request tasks from the back-end to fill the body of the table. However, some tables include thousands of tasks (VLJ support staff, some VSOs) and take several minutes to load. We believe that reducing the number of tasks that the back-end attempts to retrieve and return to the front-end will reduce those load times. This tech spec exists to explore ways to accomplish that goal.

How is pagination currently implemented?

Introduced in #9241, pagination in queue table views happens entirely on the front-end by accepting the entire set of tasks and limiting the set of displayed tasks to 15. This approach requires the front-end to have access to every single task in the entire set even though we only want to display a subset of those tasks (the single page).

Why is shared pagination hard?

Filtering and sorting (and changing state*).

If we wanted the 3rd page of tasks, a naive implementation of shared pagination might simply have the back-end return the 15 tasks starting at the 31st task and have the front-end display them as-is. However, if we wanted to apply a filter to show only tasks related to AMA appeals on the direct review docket we would want that filter to apply to all tasks and not just the 15 tasks on the current page. We would also want to paginate across that filtered set if it contained more than 15 tasks. Similar problems occur when we want to sort. The back-end needs to know that the front-end would like to filter (or sort) on a particular field and the front-end needs to know what set of tasks the back-end is returning.
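To make the filtering problem concrete, here is a minimal in-memory sketch. The `SketchTask` struct, the sample data, and both helper methods are hypothetical and are not Caseflow code. It compares filtering only the current page with filtering the whole set before paginating.

```ruby
# Hypothetical in-memory stand-in for a task record (not the Caseflow Task model).
SketchTask = Struct.new(:id, :docket_type)

PAGE_SIZE = 15

# 45 sample tasks; every third one is on the direct review docket.
ALL_TASKS = (1..45).map do |id|
  SketchTask.new(id, (id % 3).zero? ? "direct_review" : "legacy")
end

# Return the 1-indexed page of a collection (empty if the page does not exist).
def page_of(tasks, page, per_page = PAGE_SIZE)
  tasks.each_slice(per_page).to_a[page - 1] || []
end

# Naive: paginate first, then filter only the tasks that landed on that page.
def naive_filtered_page(tasks, page, docket_type)
  page_of(tasks, page).select { |t| t.docket_type == docket_type }
end

# Shared: filter the whole set, then paginate across the filtered set.
def shared_filtered_page(tasks, page, docket_type)
  page_of(tasks.select { |t| t.docket_type == docket_type }, page)
end

naive = naive_filtered_page(ALL_TASKS, 1, "direct_review")
shared = shared_filtered_page(ALL_TASKS, 1, "direct_review")
```

With this sample data the naive version shows only the 5 matching tasks that happened to be on page 1, while the shared version fills the page with all 15 matching tasks from across the set.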

*I think we should avoid the additional difficulty of tasks changing state (and therefore being returned by different filters or sorted in a different order) by simply replacing the front-end's state whenever we do anything that modifies tasks, which is probably anything other than pure navigation around the application.

So how can we do shared pagination?

The front-end and back-end can share information about applied filters, sorted columns, current page, etc. We can use the URL to convey the information the front-end requests (and to enable bookmarking and more precise navigation) and include that information in the payloads the back-end sends to the front-end. Using the image above as an example we can sketch out how this entire page load looks.

URL: The front-end will use the query string of the url to make requests to the back-end. For example, http://appeals.cf.ds.va.gov/queue?tab=on_hold&page=1&sort_by=case_details_link&order=desc&filter[]=col%3Ddocket_type%26val%3Dlegacy&filter[]=col%3Dtask_action%26val%3Dtranslation. We specify which tab we are requesting tasks for, which page, which column we are sorting by, which direction we are sorting, as well as the two columns we are filtering on. Any (or all) of these parameters can be omitted and the back-end would fall back to using defaults.
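As a rough illustration of how such a query string could be decoded, the sketch below uses only Ruby's standard `uri` library. The `parse_queue_params` helper and the values in `QUEUE_PARAM_DEFAULTS` are assumptions made for this example; the spec does not say what the real defaults are.

```ruby
require "uri"

# Hypothetical defaults used when a parameter is omitted from the URL.
QUEUE_PARAM_DEFAULTS = {
  "tab" => "assigned", "page" => "1", "sort_by" => nil, "order" => "asc"
}.freeze

def parse_queue_params(url)
  query = url.split("?", 2)[1].to_s
  pairs = URI.decode_www_form(query)

  # Each filter[] value is itself an encoded "col=...&val=..." string,
  # so it is decoded a second time into a small hash.
  filters = pairs.select { |key, _| key == "filter[]" }.map do |_, value|
    URI.decode_www_form(value).to_h
  end

  scalars = pairs.reject { |key, _| key == "filter[]" }.to_h
  params = QUEUE_PARAM_DEFAULTS.merge(scalars)
  params.merge("page" => params["page"].to_i, "filters" => filters)
end

url = "http://appeals.cf.ds.va.gov/queue?tab=on_hold&page=1&sort_by=case_details_link&order=desc" \
      "&filter[]=col%3Ddocket_type%26val%3Dlegacy&filter[]=col%3Dtask_action%26val%3Dtranslation"
params = parse_queue_params(url)
```

Here `params["filters"]` ends up as a list of `{"col" => ..., "val" => ...}` hashes, one per filtered column. A real controller would still need to validate these values against the columns the table actually supports.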

Back-end: The back-end will translate those parameters into a database request. I think it makes sense to start by doing this using ActiveRecord and only move it into a direct SQL query if the ActiveRecord request cannot be made efficient enough. We should take advantage of our queue classes to build those statements (and do validation, input sanitization, access control, etc.) but the ultimate ActiveRecord statement could end up looking something like:

tasks = Task.where(assigned_to: current_user)
  .where(status: "on_hold") # For the tab
  .where(appeal_type: "LegacyAppeal") # First filter
  .where(action: "translation") # Second filter
  .sort_by { |t| t.appeal.veteran_full_name } # Sorting (runs in Ruby, so this returns an Array)
  .reverse # For sort order
Kaminari.paginate_array(tasks).page(1) # Array pagination from kaminari (will_paginate has an equivalent)

The above sketch can certainly be optimized, but I think it is a decent first pass of what the back-end will have to do to implement pagination. Additionally, we will probably need to store the method chain before the filtering and before the sorting as their own variables so we can run .count() on those collections to get the count of all tasks that exist for the tab and for the filtered set (in case the filtered set spills over onto multiple pages).

From this ActiveRecord statement we can build the back-end's response to the front-end:

tab_tasks = Task.where(assigned_to: current_user).where(status: "on_hold")
filtered_tasks = tab_tasks.where(appeal_type: "LegacyAppeal").where(action: "translation")
sorted_tasks = filtered_tasks.sort_by { |t| t.appeal.veteran_full_name }.reverse # Sorted in Ruby, so this is an Array
page_of_tasks = Kaminari.paginate_array(sorted_tasks).page(1)

response = {
  table_title: "Your cases",
  active_tab: "on_hold",
  tabs: [
    {...},
    {
      # Specifications about the table displayed in this tab.
      name: "On hold",
      description: "Cases on hold (will return to...",
      columns: ["case_details_link", "task_action", "appeal_type", ...],

      # Information about the tasks requested.
      total_tasks_count: tab_tasks.count,
      filtered_tasks_count: filtered_tasks.count,
      tasks: page_of_tasks.serialized
    },
    {...}
  ]
}

Front-end: The front-end will receive the above payload, and dynamically draw the tabs and table based on the configuration specified by the payload, populating the table with tasks included in that same payload. The shape of the front-end state would look very similar to the payload except that tasks would be organized by page number:

state = {
  ...,
  tabs: [
    {...},
    {
      ...
      filtered_tasks_count: 2,
      pages: {
        3: [ ... ]
      },
      ...
    },
    {...}
  ]
};

We could store the information that lives in the URL in the front-end state as well if we want to, or we could just use the URL as the store of that information so we don't have to maintain two places for that data to live.
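One way to picture the "tasks organized by page number" shape is the sketch below. It is written in Ruby only to match the back-end examples (the real state would live in the front-end's JavaScript store), and `store_page`, `page_count`, and the sample payload are all hypothetical.

```ruby
PAGE_SIZE = 15

# Merge one tab's payload into the stored state for that tab,
# keeping previously loaded pages and adding the new one under its page number.
def store_page(tab_state, payload, page)
  pages = tab_state.fetch(:pages, {}).merge(page => payload[:tasks])
  tab_state.merge(
    total_tasks_count: payload[:total_tasks_count],
    filtered_tasks_count: payload[:filtered_tasks_count],
    pages: pages
  )
end

# Number of pages needed to show the filtered set.
def page_count(tab_state, per_page = PAGE_SIZE)
  (tab_state[:filtered_tasks_count].to_f / per_page).ceil
end

payload = { total_tasks_count: 120, filtered_tasks_count: 40, tasks: %w[task_31 task_32] }
state = store_page({ name: "On hold" }, payload, 3)
```

Because the stored pages are only valid for one combination of filters and sort order, a real implementation would presumably discard `pages` whenever either of those changes.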

How do we get there from here?

Since organization queues are the ones most negatively impacted by the lack of back-end pagination, we can focus our efforts there first before expanding this strategy to all queues:

[x] Pass table configurations to front-end from back-end (#10049)
[x] Use back-end configuration to generate components on the front-end (#11045)
- Continue drawing tasks from the global Redux store as we do now.
[x] Pass task-related information in back-end's response in addition to configuration information (#11049)
- Use tasks from this response instead of Redux store
- Don't implement back-end paging yet, continue to rely on front-end
[x] Allow back-end to accept sorting and filtering (#11052)
- [x] Enable back-end sorting of tasks by docket type and number (#11311)
- [x] Enable back-end sorting of tasks by Veteran last name (#11312)
- [x] Enable back-end sorting of tasks by regional office (#11313)
- [x] Enable back-end sorting of tasks by issue count (#11314)
- [x] Enable back-end sorting of tasks by appeal type (#11315)
- [x] Enable back-end sorting of tasks by assignee name (#11316)
- [x] Enable back-end sorting of tasks by number of days on hold (#11317)
- [ ] Enable back-end sorting of tasks by assigner (#11318)
[x] Allow back-end to accept filtering (#11307)
[x] Implement paging on back-end (#11053)
[x] Enable TaskTable component to use API requests to paginate and sort tasks within the table (#11054)
[x] Include filterable values in queue config response to front-end (#11509)
[x] Enable TaskTable component to use API requests to filter tasks within the table (#11309)
[x] Cache responses from queue pagination API (#11310)
[x] Allow direct navigation to filtered page of task table results using URL parameters (#11055)
[ ] Navigating to queue table view retains correct view (#9119)
[x] Consider returning a reduced set of columns for each appeal by using a different serializer (only those fields that will be displayed as columns in the queue table view) since we will re-fetch the appeal's tasks when we load the case details page anyway. (#11334)

Unsolved problems

How do we filter and sort the combination of legacy VACOLS tasks and new Caseflow tasks? We only run into this problem for attorney and judge queues but we may have to load the entire set of legacy tasks into memory before we do this. That might not be terrible since we expect judge and attorney queues to contain relatively few tasks (fewer than 100).
If we are on page 3 of the "on hold" tab and click into the "assigned" tab then back to the "on hold" tab should we still be on page 3? If so, we will have to store that information in the front-end state and update the URL from the state.
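For the first problem above, the in-memory fallback could look roughly like the sketch below. It assumes both kinds of task can be mapped onto a common shape; the `QueueTask` struct, the `combined_page` helper, and the sample records are hypothetical.

```ruby
# Hypothetical common shape for tasks from either source.
QueueTask = Struct.new(:source, :docket_type, :veteran_last_name)

# Combine both sets in memory, then filter, sort, and paginate the combined set.
def combined_page(caseflow_tasks, legacy_tasks, docket_type: nil, page: 1, per_page: 15)
  all_tasks = caseflow_tasks + legacy_tasks
  filtered = docket_type ? all_tasks.select { |t| t.docket_type == docket_type } : all_tasks
  sorted = filtered.sort_by(&:veteran_last_name)
  {
    total_tasks_count: all_tasks.size,
    filtered_tasks_count: filtered.size,
    tasks: sorted.each_slice(per_page).to_a[page - 1] || []
  }
end

caseflow_tasks = [
  QueueTask.new(:caseflow, "direct_review", "Young"),
  QueueTask.new(:caseflow, "hearing", "Adams")
]
legacy_tasks = [
  QueueTask.new(:vacols, "legacy", "Baker"),
  QueueTask.new(:vacols, "legacy", "Zhou")
]
result = combined_page(caseflow_tasks, legacy_tasks, per_page: 3)
```

This only stays cheap while the combined set is small, which matches the expectation that judge and attorney queues hold fewer than 100 tasks.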

Future improvements

Using this approach we can anticipate that the user will continue to click through pages and eager load the next page's tasks when we navigate to a new page (Reader has a similar behaviour).
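A minimal sketch of that idea follows, again in Ruby for consistency with the other examples. The `PageCache` class is hypothetical, and its prefetch is synchronous purely to keep the example short; a front-end would issue the second request in the background.

```ruby
# Caches pages by number and loads the following page whenever one is visited.
class PageCache
  attr_reader :fetched # Page numbers in the order they were requested from the fetcher.

  def initialize(page_count, &fetcher)
    @page_count = page_count
    @fetcher = fetcher
    @pages = {}
    @fetched = []
  end

  # Return the tasks for a page and make sure the next page is cached too.
  def visit(page)
    tasks = fetch_page(page)
    fetch_page(page + 1) if page < @page_count
    tasks
  end

  private

  # Call the fetcher only for pages that are not cached yet.
  def fetch_page(page)
    @pages.fetch(page) do
      @fetched << page
      @pages[page] = @fetcher.call(page)
    end
  end
end

cache = PageCache.new(3) { |page| ["task on page #{page}"] }
first_page = cache.visit(1)
```

After `cache.visit(1)` the fetcher has been called for pages 1 and 2, so a later `cache.visit(2)` returns immediately from the cache and only triggers a fetch for page 3.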

anyakhvost commented 5 years ago

Great tech spec! A few questions/comments:

Why do we allow two filters?
I think we should not implement pagination for queues that contain legacy tasks because we should be moving away from those.
Can we hide it behind a feature flag and only deploy to a few users to start with?

lowellrex commented 5 years ago

Why do we allow two filters?

I don't know. This tech spec seeks parity to avoid regressions so I didn't explore any alternatives to how we implement filters. I'm open to the idea though if that's a direction we want to go.

I think we should not implement pagination for queues that contain legacy tasks because we should be moving away from those.

Agreed!

Can we hide it behind a feature flag and only deploy to a few users to start with?

Absolutely. The queues that need back-end pagination most desperately are organization queues (VSOs and VLJ support staff specifically), so I think it makes sense to start there. Starting with organization queues has the added benefit that they do not contain legacy tasks so we can delay dealing with that problem until we implement this for judge and attorney queues (when hopefully we will have fully transitioned to Caseflow tasks).

kevmo commented 5 years ago

Good tech spec & follow-up questions. I agree with the granular introduction of the feature (feature flags & doing it for VLJ staff and VSOs first).

Re question 2 of Anya's -- do most queues contain legacy tasks (which would bar users from getting benefit of pagination if we won't implement for queues that contain them)?

lowellrex commented 5 years ago

Re question 2 of Anya's -- do most queues contain legacy tasks (which would bar users from getting benefit of pagination if we won't implement for queues that contain them)?

Attorneys and Judges are the only folks whose queues contain legacy tasks. We expect these queues to have very few cases so they should load relatively quickly and would not benefit from pagination much as a result. After we have deprecated DAS (and Caseflow tasks have replaced legacy VACOLS tasks) we will be able to give attorneys and judges pagination without having to ever worry about legacy tasks. I'm happy to change course as we get further into this effort (if we determine that attorneys and judges need pagination before DAS deprecation, for instance), but for now I think it makes sense to delay delivering pagination for attorneys and judges.

Using a single judge team as an example we can see that these attorneys have an average of roughly a dozen tasks and the judge has 50 or so cases.

rails c> example_judge_team = JudgeTeam.third
rails c> example_judge_team.attorneys.count
# 9
rails c> example_judge_team.attorneys.map { |atty| AttorneyQueue.new(user: atty).tasks.count + LegacyWorkQueue.tasks_for_user(atty).count }
# [8, 0, 2, 11, 16, 11, 21, 4, 4]

rails c> judge = example_judge_team.judge
rails c> GenericQueue.new(user: judge).tasks.count + LegacyWorkQueue.tasks_for_user(judge).count
# 58

Whereas the VLJ support staff and American Legion queues have thousands of cases.

rails c> Colocated.singleton.tasks.count
# 3696
rails c> Vso.find_by(name: "American Legion").tasks.count
# 1686

lpciferri commented 5 years ago

This is done! Closing.
