Implementation plan to support fhir bulk upload on fhir-web.

peterMuriuki commented 10 months ago

Refs:

Objective

Allow opensrp-2 web users to upload post-deployment data via the web interface. This functionality currently exists as a standalone Python script, the importer. This issue is an RFC on how the back end will be implemented; I have presented 3 options below for comment.

1. Add a restful api facade to the script

Add an API wrapper to the script. This means we can use the importer script either via the CLI or through a RESTful API service. We can bundle the service as a Docker image, which we can use as the base when building fhir-web's image.

The service will expose apis that allow web to:

Fetch sample templates
Upload populated csv templates.
Report on the exit status of the operation
Provide a summary of the exit status (e.g. errors or warnings, and a summary of the resources uploaded on success)

Caveat: The service should expect that all the populated CSV templates might be uploaded in a single request. Where such an upload takes time to complete, web cannot block users from navigating elsewhere, so we might need a way to check whether there are pending upload jobs and what their progress status is. We would show this information on web if the user came back to this view.

2. Call the importer script from express

Fhir-web has an express server that runs on the server side. We can expose an API that, when hit, spawns a child process that executes the importer script. The importer script can be bundled into fhir-web by way of a submodule.

The api would provide the following:

Return sample csv templates
Accept populated csv templates.
Spawn a child_process for each request, that executes the importer script
Parse the script's stdout to figure out:
- Exit status
- Summary of exit status

Parsing stdout might be challenging and very brittle, since it depends on the wording and format of the script's logs.
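A minimal sketch of the spawn-and-parse flow, written in Python for brevity (in fhir-web the express server would use Node's `child_process` instead). The stand-in script, the `run_importer` helper, and the convention that the script prints one JSON object as its last stdout line are all assumptions for illustration; the real importer's log format is not described in this issue.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for the importer script: it logs free-form text and
# then prints one JSON summary as its last stdout line (an assumed convention).
FAKE_IMPORTER = r'''
import json
print("processing users.csv")
print(json.dumps({"created": 20, "warnings": []}))
'''


def run_importer(argv):
    """Run a command as a child process and return (exit_code, summary)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    lines = [ln for ln in proc.stdout.splitlines() if ln.strip()]
    summary = None
    if lines:
        try:
            # Only the last non-empty line is treated as machine-readable.
            summary = json.loads(lines[-1])
        except json.JSONDecodeError:
            # Free-form logs only: nothing reliable to parse.
            summary = None
    return proc.returncode, summary


if __name__ == "__main__":
    code, summary = run_importer([sys.executable, "-c", FAKE_IMPORTER])
    print(code, summary)
```

The exit code comes from the process itself rather than from log text, so only the summary depends on the output format.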

3. Re-do the implementation in fhir-web.

Essentially rewrite the importer script's functionality at fhir-web's express server layer. This would provide an API service with requirements similar to those in option 1.

Of course, this would mean duplicated maintenance effort, which is not ideal, and it could incur more initial setup effort, since we would still need to add the RESTful functionality that option 1 also requires.

====

Across all the options, these are some of the challenges that I think need to be solved:

How our choice of API would represent:
- A running upload process
- A successfully exited upload process
- An errored-out upload process, or one with warnings
Dealing with intentional/accidental re-uploads.
Bundling/Packaging for deployment.
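One possible way to model the first two challenges, as a sketch only. The state names, the `classify` and `is_reupload` helpers, and the choice of a SHA-256 content hash are assumptions for illustration, not anything decided in this issue.

```python
import hashlib
from dataclasses import dataclass, field
from enum import Enum


class UploadState(str, Enum):
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    SUCCEEDED_WITH_WARNINGS = "succeeded_with_warnings"
    FAILED = "failed"


@dataclass
class UploadStatus:
    state: UploadState
    warnings: list = field(default_factory=list)
    errors: list = field(default_factory=list)


def classify(exit_code, warnings, errors):
    """Map a process to a state; an exit code of None means still running."""
    if exit_code is None:
        return UploadStatus(UploadState.RUNNING)
    if exit_code != 0 or errors:
        return UploadStatus(UploadState.FAILED, warnings, errors)
    if warnings:
        return UploadStatus(UploadState.SUCCEEDED_WITH_WARNINGS, warnings)
    return UploadStatus(UploadState.SUCCEEDED)


# Hashes of uploads seen so far (in memory only, hypothetical).
seen = set()


def is_reupload(csv_bytes):
    """Return True if byte-identical content was uploaded before."""
    digest = hashlib.sha256(csv_bytes).hexdigest()
    if digest in seen:
        return True
    seen.add(digest)
    return False
```

A content hash only catches byte-identical files. Whether a detected re-upload should be blocked, or just confirmed with the user, is still an open question.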

====

The first option seems to be the most sane. cc @dubdabasoduba @Wambere @pld @HenryRae @joyce-x-chen

Wambere commented 9 months ago

I agree that the first option makes most sense

pld commented 9 months ago

I think these all have pros and cons. My concern with the first option is that we now need to maintain a new service. Can you add diagrams explaining the draft system design of options (1) and (2) so we can compare them with that in mind?

peterMuriuki commented 9 months ago

Option 1: [diagram: option1]

Option 2: [diagram: option2]

@pld @Wambere Let me know if this addresses the areas of interest.

> my concern with the first option is that we now need to maintain a new service.

IMO, in option 1, maintenance of the importer service need only concern itself with its own implementation. In option 2, maintenance will involve work across 2 repositories.

peterMuriuki commented 9 months ago

I reviewed the workflows and updated the diagrams; see below. In option 1:

embedded means running the importer service as part of, or adjacent to, the express server (this would require additional API endpoints on express to proxy requests to the importer).
standalone is where we would deploy a new service for the importer API.

In option 2, I also noticed that authentication can be handled differently: with some changes to the script config, we can reuse fhir-web's user session to authenticate the importer script's requests. On the other hand, we'll have this work maintained across 2 repos, and parsing stdout might be a pain (though this can be managed by being strict about the script's stdout format).

Option 1: [diagrams: standalone, embedded]

Option 2: [diagram: option2]

pld commented 9 months ago

@dubdabasoduba what type of load, i.e. how many requests per second, do we expect on fhir-web for EUSM? I'm guessing not many.

@Wambere how long does the importer take to process a request?

@peterMuriuki another design question: in these diagrams it looks like it's running synchronously in the request/response loop. If the importer takes a couple of seconds, that seems like a problem, e.g. the request may time out.

dubdabasoduba commented 9 months ago

@pld I think OpenSRP web only receives a lot of traffic during project setup or when onboarding new users. This phase of EUSM will have fewer OpenSRP web interactions since we are doing away with missions.

Wambere commented 9 months ago

@pld haven't really tested it with large datasets, I can create an issue for that. But for the small ones that I've worked with it's pretty fast, dependent on how fast we get replies back from the api of course. Updates however would be a tad slower than initial creation because we have to check that they exist before posting, so 2 requests instead of one.

pld commented 9 months ago

@Wambere how many rows are the small datasets? How many seconds does it take to complete for both create and update?

peterMuriuki commented 9 months ago

> @peterMuriuki another design question: in these diagrams it looks like it's running synchronously in the request/response loop. If the importer takes a couple of seconds, that seems like a problem, e.g. the request may time out.

Yeah, that would be a concern as well. The implementation would have to include async job orchestration. We can do a lightweight in-memory implementation that does not require persistence. It would live only as long as the service is up and would not be recoverable after a restart.

Sort of like a HashMap of arbitrary job IDs to the jobs: {job-id: JobInstance}.
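A rough sketch of that in-memory map, in Python. The `JobRegistry` and `JobInstance` names, the status strings, and the use of one thread per job are assumptions for illustration; nothing here is persisted, so all jobs are lost when the process restarts, as described above.

```python
import threading
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class JobInstance:
    status: str = "pending"  # pending -> running -> done | failed
    result: Optional[object] = None
    error: Optional[str] = None


class JobRegistry:
    """In-memory {job_id: JobInstance}; contents are lost on restart."""

    def __init__(self):
        self._jobs: Dict[str, JobInstance] = {}
        self._threads: Dict[str, threading.Thread] = {}
        self._lock = threading.Lock()

    def submit(self, work: Callable[[], object]) -> str:
        """Start work in a background thread and return its job id."""
        job_id = uuid.uuid4().hex
        job = JobInstance()
        thread = threading.Thread(target=self._run, args=(job, work), daemon=True)
        with self._lock:
            self._jobs[job_id] = job
            self._threads[job_id] = thread
        thread.start()
        return job_id

    def _run(self, job: JobInstance, work: Callable[[], object]) -> None:
        job.status = "running"
        try:
            job.result = work()
            job.status = "done"
        except Exception as exc:
            job.error = str(exc)
            job.status = "failed"

    def get(self, job_id: str) -> Optional[JobInstance]:
        """Look up a job; returns None for unknown ids."""
        with self._lock:
            return self._jobs.get(job_id)

    def wait(self, job_id: str, timeout: Optional[float] = None) -> None:
        """Block until the job's thread finishes (mainly useful in tests)."""
        self._threads[job_id].join(timeout)
```

An upload endpoint would call `submit` and return the job id immediately; a status endpoint would call `get` with that id, which is what lets the web view show pending jobs when the user comes back.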

Wambere commented 9 months ago

@pld for 20 rows it's about 3 seconds for creation and 26 seconds for editing

The time increased when we removed the version column from the CSV, so we are basically fetching every single resource to get its version, which is then used in the update payload.
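A back-of-envelope calculation from the figures above (20 rows, about 3 seconds to create, about 26 seconds to update). The linear extrapolation to larger files is an assumption, not a measurement; large datasets have not been tested yet.

```python
ROWS = 20
CREATE_SECONDS = 3.0   # reported time to create 20 rows
UPDATE_SECONDS = 26.0  # reported time to update 20 rows


def per_row(total_seconds, rows=ROWS):
    """Average seconds spent per row."""
    return total_seconds / rows


def projected_seconds(total_seconds, target_rows, rows=ROWS):
    """Linear extrapolation to target_rows (an assumption, not measured)."""
    return per_row(total_seconds, rows) * target_rows


if __name__ == "__main__":
    print(per_row(CREATE_SECONDS), per_row(UPDATE_SECONDS))
    print(projected_seconds(UPDATE_SECONDS, 1000))
```

That works out to about 0.15 seconds per row for creation and 1.3 seconds per row for updates. If the scaling really is linear, a 1000-row update would take roughly 1300 seconds, well beyond a typical request timeout.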

pld commented 9 months ago

OK so definitely needs async

pld commented 9 months ago

Option 2 seems simpler to me, what am I missing on the downsides to this approach?

peterMuriuki commented 9 months ago

> Option 2 seems simpler to me, what am I missing on the downsides to this approach?

Yeah, I started reconsidering my initial recommendation as well. The challenge in option 2 just had to do with parsing the script logs to generate the output, but we can mitigate this by defining what the output is and how it is formatted, so that it can be reliably parsed on fhir-web.

We can do option 2 (starting with a PoC to see if there are any challenges that I might not have foreseen).

pld commented 9 months ago

On the output: changing that to something more structured, like JSON, would be reasonable. Can you list out some of the pros and cons for options 1 and 2, please?

On Feb 15, 2024, at 03:01, Peter Muriuki wrote:

> Option 2 seems simpler to me, what am I missing on the downsides to this approach?
>
> Yeah, I started reconsidering my initial recommendation as well, The challenge in option 2 just had to do with parsing the script logs to generate the output, but we can mitigate this by defining what output and how to format the output so that it can be reliably parsed on fhir-web. We can do option 2.

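A hedged sketch of what structured, JSON-based output from the importer could look like. The `emit_summary` helper and every field name in the payload are hypothetical; the actual schema would need to be agreed between the importer and fhir-web.

```python
import io
import json
import sys


def emit_summary(created, updated, warnings, errors, stream=sys.stdout):
    """Write one JSON object on a single line; return the exit code to use."""
    summary = {
        "status": "failed" if errors else "completed",
        "created": created,
        "updated": updated,
        "warnings": warnings,
        "errors": errors,
    }
    stream.write(json.dumps(summary) + "\n")
    return 1 if errors else 0


# Demo: capture the output in memory instead of writing to real stdout.
buffer = io.StringIO()
exit_code = emit_summary(created=20, updated=0, warnings=[], errors=[], stream=buffer)
parsed = json.loads(buffer.getvalue())
```

In the script itself this would end with something like `sys.exit(emit_summary(...))`, so the exit code and the JSON line always agree.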

peterMuriuki commented 9 months ago

@pld

Option 2

| PROS | CONS |
| --- | --- |
| Option to authenticate importer requests using the existing client authentication; this adds a security layer that protects access to the upload feature | Requires 2 runtimes in a single code base |
| Easier to align this feature with how it will be used by the web | Compatibility represents a bit of a challenge (https://github.com/onaio/fhir-tooling/issues/140) |
| Centrally distributed and deployed, i.e. from a single fhir-web image | |

Option 1

| PROS | CONS |
| --- | --- |
| Separation of concerns delineated along runtime environments | Distribution requires this to be an additional standalone service (an important question: would we need to set up a new importer service instance for each fhir-web client instance?) |
| | Requires more work to protect the API from abuse |

@Wambere @dubdabasoduba you can add others not captured here. As of now, I think we can go with option 2.

pld commented 9 months ago

Cool, thanks @peterMuriuki! The first con for option 1 outweighs the others in my opinion. I'd rather start this as a single service; then, if/when we need to scale it horizontally on its own, we split it out.
