peterMuriuki opened 10 months ago
I agree that the first option makes most sense
I think these all have pros and cons, my concern with the first option is that we now need to maintain a new service. Can you add diagrams explaining the draft system design of options (1) and (2) so we can compare them with that in mind?
Option 1
Option 2
@pld @Wambere Let me know if this addresses the areas of interest.
my concern with the first option is that we now need to maintain a new service.
IMO, in option 1 the importer service's maintenance need only concern itself with its own implementation. In option 2, maintenance will have to span work across 2 repositories.
Reviewed the workflows and updated the diagrams, see below. In option 1:
In option 2, I also noticed that authentication can be handled differently: with some changes to the script config, we can re-use fhir-web's user session to authenticate the importer script requests. On the other hand, we'll have this work maintained across 2 repos, and parsing stdout might be a pain (though this can be managed by being strict with the stdout format of the script).
Option 1
standalone vs. embedded deployment diagrams (images not reproduced here)
Option 2
@dubdabasoduba what type of load, i.e. how many requests per second, do we expect on the fhir-web for eusm? I'm guessing not many.
@Wambere how long does the importer take to process a request?
@peterMuriuki another design question: in these diagrams it looks like it's running synchronously in the request/response loop. If the importer takes a couple of seconds, that seems like a problem, e.g. the request may time out.
@pld I think OpenSRP web only receives a lot of traffic during project setup or when onboarding new users. This phase of EUSM will have fewer OpenSRP web interactions since we are doing away with missions.
@pld haven't really tested it with large datasets, I can create an issue for that. But for the small ones that I've worked with it's pretty fast, depending of course on how fast we get replies back from the api. Updates, however, would be a tad slower than initial creation because we have to check that they exist before posting, so 2 requests instead of one.
@Wambere how many rows are the small datasets? How many seconds does it take to complete for both create and update?
@peterMuriuki another design question: in these diagrams it looks like it's running synchronously in the request/response loop. If the importer takes a couple of seconds, that seems like a problem, e.g. the request may time out.
Yeah, that would be a concern as well. The implementation would have to include an async job orchestration aspect. We could do a lightweight, memory-based implementation that does not require persisting: it would be alive for as long as the service is up and have no recoverability.
Sort of like a HashMap of arbitrary job ids to the jobs: `{job-id: JobInstance}`.
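The HashMap-of-jobs idea above could look something like this; a minimal sketch, assuming nothing beyond Node's standard library (the `JobStatus` values and `JobStore` name are illustrative, not from either codebase):

```typescript
// In-memory job registry: a Map of generated job ids to job records.
// Lives only as long as the process; no persistence or recovery, as noted above.
import { randomUUID } from "crypto";

type JobStatus = "pending" | "running" | "done" | "failed";

interface JobInstance {
  id: string;
  status: JobStatus;
  createdAt: number;
  result?: string; // e.g. a summary of rows created/updated
}

class JobStore {
  private jobs = new Map<string, JobInstance>();

  // Register a new job and return it so the client can poll by id later.
  create(): JobInstance {
    const job: JobInstance = { id: randomUUID(), status: "pending", createdAt: Date.now() };
    this.jobs.set(job.id, job);
    return job;
  }

  update(id: string, status: JobStatus, result?: string): void {
    const job = this.jobs.get(id);
    if (!job) throw new Error(`unknown job ${id}`);
    job.status = status;
    if (result !== undefined) job.result = result;
  }

  get(id: string): JobInstance | undefined {
    return this.jobs.get(id);
  }
}
```

A request handler would call `create()`, kick off the import asynchronously, and return the id immediately; the web client then polls `get(id)` until the status is `done` or `failed`.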
@pld for 20 rows it's about 3 seconds for creation and 26 seconds for editing
The time increased when we removed the version column from the csv, so we are basically fetching every single resource to get its version to use in the update payload.
OK so definitely needs async
Option 2 seems simpler to me, what am I missing on the downsides to this approach?
Option 2 seems simpler to me, what am I missing on the downsides to this approach?
Yeah, I started reconsidering my initial recommendation as well. The challenge in option 2 was just parsing the script logs to generate the output, but we can mitigate this by defining what output to emit and how to format it so that it can be reliably parsed on fhir-web.
We can do option 2 (starting with a poc to see if there are any challenges that I might not have foreseen).
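One way to make the script output "reliably parsed on fhir-web" is a strict one-JSON-object-per-line contract. A hypothetical sketch of the parsing side (the `level`/`message`/`resource` field names are assumptions, not the importer's actual output):

```typescript
// Each importer stdout line is expected to be a single JSON object, e.g.
// {"level":"info","message":"created","resource":"Location/123"}
interface ImporterLogLine {
  level: "info" | "error";
  message: string;
  resource?: string;
}

// Parse the raw stdout buffer, skipping lines that do not conform
// instead of failing the whole job on one stray print statement.
function parseImporterOutput(stdout: string): ImporterLogLine[] {
  const lines: ImporterLogLine[] = [];
  for (const raw of stdout.split("\n")) {
    const trimmed = raw.trim();
    if (!trimmed) continue;
    try {
      const parsed = JSON.parse(trimmed);
      if (typeof parsed.level === "string" && typeof parsed.message === "string") {
        lines.push(parsed as ImporterLogLine);
      }
    } catch {
      // non-JSON line: ignore, or surface separately for debugging
    }
  }
  return lines;
}
```

Being lenient on non-conforming lines keeps the web side from breaking when the script adds incidental logging.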
On output, changing that to something more structured, like JSON, would be reasonable. Can you put out some of the pros and cons for options 1 and 2 please?
@pld Option 2

| PROS | CONS |
|---|---|
| Option to authenticate importer requests using existing client authentication; a bit of a security layer that protects access to the upload feature | Requires 2 runtimes in a single code base |
| Easier to align this feature with how it will be used by the web | Compatibility represents a bit of a challenge (https://github.com/onaio/fhir-tooling/issues/140) |
| Centrally distributed and deployed, i.e. from a single fhir-web image | |
Option 1

| PROS | CONS |
|---|---|
| Separation of concerns delineated along runtime environments | Distribution requires this to be an additional standalone service (an important question: would we need to set up a new importer service instance for each fhir-web client instance?) |
| | Requires more work to protect the api from abuse |
@Wambere @dubdabasoduba you can add others not captured here. As of now, I think we can go with option 2.
Cool, thanks @peterMuriuki! The first con for option 1 outweighs the others in my opinion. I'd rather start this as a single service, then if/when we need to scale it separately we split it out horizontally.
Refs:
Objective
Allow opensrp-2 web users to upload post-deployment data via the web interface. This currently exists as a standalone python script, the importer. This issue is an RFC on how the back-end will be implemented; I have presented 3 options below for comment.
1. Add a restful api facade to the script
Add an api wrapper to the script. This means we can use the importer script either via the CLI or through a restful api service. We can bundle the service as a docker image, which we can use as the base when building fhir-web's image.
The service will expose apis that allow web to:
Caveat: the service should expect that all the populated csv templates might be uploaded as a single request. Where such an upload takes time to complete, web cannot block users from navigating elsewhere, so we might need provisions for checking whether there are pending upload jobs and their progress status, information we would show on web if the user came back to this view.
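The upload-then-poll flow in the caveat above could look something like this on the wire; the endpoint paths and field names here are hypothetical, for illustration only:

```typescript
// Hypothetical response shapes for the facade's endpoints:
//   POST /$import        -> accepts the csv bundle, returns an UploadAccepted
//   GET  /$import/:jobId -> returns the job's current JobProgress
interface UploadAccepted {
  jobId: string;
}

interface JobProgress {
  jobId: string;
  status: "pending" | "running" | "done" | "failed";
  processedRows: number;
  totalRows: number;
}

// What the web view could render when the user navigates back:
// a percentage complete per pending job.
function percentComplete(p: JobProgress): number {
  if (p.totalRows === 0) return 0;
  return Math.round((p.processedRows / p.totalRows) * 100);
}
```

Returning a job id immediately and serving progress from a separate endpoint keeps the upload request itself fast, which matters given the synchronous-timeout concern raised later in this thread.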
2. Call the importer script from express
Fhir-web has an express server that runs on the server. We can expose an api that, when hit, will spawn a child process that executes the importer script. The importer script can be bundled into fhir-web by way of a submodule.
The api would provide the following:
stdout
To figure out: parsing stdout might be a challenge and very brittle, since it's dependent on the script's log language and format.
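The spawn-from-express idea can be sketched with Node's `child_process`. In this hedged example `node -e` stands in for the real `python importer.py` invocation; the actual command, flags, and paths are assumptions:

```typescript
import { spawn } from "child_process";

// Run a command and collect its stdout; resolves when the process exits.
function runScript(cmd: string, args: string[]): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(cmd, args);
    let out = "";
    child.stdout.on("data", (chunk: Buffer) => {
      out += chunk.toString();
    });
    child.on("error", reject);
    child.on("close", (code) => {
      if (code === 0) resolve(out);
      else reject(new Error(`script exited with code ${code}`));
    });
  });
}

// In the real endpoint this would be something like:
//   runScript("python", ["importer/main.py", "--csv", uploadPath])
// (that path and flag are hypothetical). Using node as a stand-in here:
runScript("node", ["-e", "console.log('done')"]).then((out) => console.log(out.trim()));
```

Note the exit-code check: it gives the endpoint a coarse success/failure signal even before any stdout parsing is in place.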
3. Re-do the implementation in fhir-web.
Essentially rewrite the importer script functionality at fhir-web's express server layer. This would provide an api service with similar requirements to that in option 1.
Of course, this would mean duplicated maintenance effort, which is not ideal, and could incur more initial effort, since we would still need to add the restful functionality that would also be required in option 1.
====
In all the options some of the challenges that I think need to be solved:
====
The first option seems to be the most sane. cc @dubdabasoduba @Wambere @pld @HenryRae @joyce-x-chen