Closed amprokop closed 6 years ago
I think there are a lot of good things with this API design. One of the central problems I dealt with when designing the first one is a tension between REST and revealing implementation details. As a consumer of the API, I don't really want to have to think about starting remote jobs, or the order in which I call the API. Ideally, I want to be able to hit one endpoint for the list of documents and another to download the documents. Of course this has lead us to do things in a non-RESTful manner by modifying the DB on GET requests. Our justification of this approach was that since it's just caching files, it's not really THAT un-RESTful, the route is still idempotent.
With this new approach, which I think has a lot of merits, we put more burden on consumers of the API to understand how to call it. And calling the API becomes more verbose. Now that might be reasonable tradeoff, but I think we should evaluate it as such.
Some specific thoughts on the routes.
For GET /api/v2/documents_list (?force_sync=true|false) -H “File Number: XXXXXX”
do we really need the force_sync
parameter? I thought you wanted to get away from doing things synchronously. Also since this is a GET that launches jobs, it's not RESTful
For POST /api/v2/fetch_documents -H “File Number: XXXXXX”
I think there should be a flag for whether to start the zip file job. This is an expensive thing to run, and something we don't always need.
For GET /api/v2/fetch_documents_status/:id
could this be named GET /api/v2/fetch_documents/:id
? Do we need another path specifically for status? Not sure about the standards here on polling endpoints.
For PUT /api/v2/documents/:id -H “File Number: XXXXXX”
We don't need to pass the file number. The doc id is sufficient. We should also get the doc from VBMS even if it's not already cached in S3.
PUT /api/v2/documents/:id -H “File Number: XXXXXX”
is more of a long term goal. We'll probably need to migrate to v2 of the VBMS api before we can modify things in VBMS. But still will be useful then.
Thanks for spec-ing this out @amprokop! This feels like a solid refactor and proposal.
I'd be curious to learn more about the consumers of this API and their requirements.
So far we have today:
I'd recommend consulting with @Chingujo on future plans / iteration of the UI to understand how we see eFolder evolving and if that brings considerations to shape the direction of the v2 API.
Some specific thoughts more on the plan:
I'm always hesitant for refactors of existing and heavily relied upon services. I'd imagine a key challenge here might be to refactor models to accomplish a cleaner API feel and codebase while ensuring no regressions to v1. We can perhaps discuss the merits of a refactor within the existing codebase or taking key pieces of caseflow-efolder and starting a new API service to accomplish this.
This is probably obvious, but if not... Once we have an API design pattern that works well for us, I'd recommend breaking this down into a implementation plan with deployable and validating phases.
Thanks @sunil-sadasivan! and the API looks exciting in making eFolder express flexible and more integratabtle across Caseflow. Currently specing out a long term vision for eFolder express. In terms of shorter term UI changes, I don't see drastic changes unless in response to production bugs, simple UX improvements, and content changes (amid some open questions still left to be answered).
Planning on a sync with @mdbenjam on Monday to see how all this fits in to ex's future =D
@amprokop Thank you for writing this up. Some thoughts and questions on the endpoints:
GET /api/v2/documents_list (?force_sync=true|false) -H “File Number: XXXXXX”
force_sync
flag, however, we will be exposing implementation details to the api consumer. What can we do to avoid the flag? Maybe we can have two different endpoints?GET /api/v2/documents_list
to GET /api/v2/documents
POST /api/v2/fetch_documents -H “File Number: XXXXXX”
fetch
everywhere. I’d like to suggest to use simpler names. For example, POST /api/v2/documents
. GET /api/v2/fetch_documents_status/:id
GET /api/v2/documents/:id -H “File Number: XXXXXX”
I agree with Mark we can omit a file number here.
GET /api/v2/documents -H “File Number: XXXXXX”
It can have the same name as #1, it will just respond to a different format.
respond_to do |format|
format.zip do
…
end
end
Kudos @amprokop, this document looks great!
One question, could we have the entire API be asynchronous (i.e. each API request returns an ID and the client uses that ID to poll some other endpoint(s) to see if the request is ready to be returned)? I don't know how well this jibes with the RESTful way of doing things, but it seems like having the entire API be asynchronous so that we can avoid exposing implementation details to the client and to more gracefully account for some API calls taking longer sometimes (like when efolder isn't able to use its cache).
Context
V1 of the eFolder API was somewhat heroically developed by @aroltsch and @mdbenjam. We got it out the door, and it works for Reader. But eFolder was originally developed from front to back as a "Download All" button for VBMS. We've since grafted a few things onto it: the ability to download separate documents instead of a ZIP file, the ability to fetch documents from VVA as well as VBMS, and an API.
There are several areas of the API we could do better with:
Overview
We have been considering a refactor of eFolder Internals, documented here and here. We should couple those goals to development of v2 of the eFolder API, so we can make breaking changes without fear and roll out incrementally behind a simple feature flag.
Goals
eFolder v2 API routes
Returns a JSON list of documents associated with a file number.
More details
force_sync
parameter set to true:force_sync
parameter set to true:Use cases
force_sync
parameterStarts fetching and saving all the documents from our document sources. Returns an ID of the fetch attempt so the caller can check on its status later.
More details
Use cases
Returns the status of our attempt to fetch documents from document sources, save them to S3, and make them available via our API.
More details
Use cases
Returns a document file.
More details
Updates some parameters of a document.
More details
TBD—whatever parameters we would like to update. Parameters we can update in VBMS can be updated in VBMS. Annotations? Comments? Tags? Document type?
Use cases
Returns ZIP of all documents.
More details
Use cases
Associated Work
https://github.com/department-of-veterans-affairs/caseflow-efolder/issues/670
Open Questions