tasks:refresh apdex: research and development

latin-panda commented 1 month ago

A refresher on a couple of things:

- This Apdex score tracks the time from ngOnInit until the refreshTasks promise has been finalized.
- The rules engine is the process that generates tasks.

From the test party:
- tasks:refresh (frustrated: 276, tolerated: 40, score: 0.09): about 96% of the time, it wasn't satisfactory.
- tasks:load (frustrated: 18, tolerated: 22, score: 0.77): about 32% of the time, it wasn't satisfactory.
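
For reference, Apdex scores like the ones above follow the standard formula: (satisfied + tolerated / 2) / total samples. Below is a minimal sketch of that calculation. `ApdexCounts`, `apdexScore`, and `unsatisfiedShare` are illustrative names, and the sample counts are made up, not taken from the test party.

```typescript
// Hypothetical shape for the three Apdex buckets.
interface ApdexCounts {
  satisfied: number;
  tolerated: number;
  frustrated: number;
}

// Standard Apdex: satisfied samples count fully, tolerated samples count
// half, and frustrated samples count as zero.
function apdexScore(counts: ApdexCounts): number {
  const total = counts.satisfied + counts.tolerated + counts.frustrated;
  if (total === 0) {
    return NaN; // no samples, so there is no meaningful score
  }
  return (counts.satisfied + counts.tolerated / 2) / total;
}

// Share of samples that were not satisfactory (tolerated + frustrated).
function unsatisfiedShare(counts: ApdexCounts): number {
  const total = counts.satisfied + counts.tolerated + counts.frustrated;
  return total === 0 ? NaN : (counts.tolerated + counts.frustrated) / total;
}

// Made-up sample: 60 satisfied, 30 tolerated, 10 frustrated.
const example: ApdexCounts = { satisfied: 60, tolerated: 30, frustrated: 10 };
console.log(apdexScore(example)); // (60 + 15) / 100 = 0.75
console.log(unsatisfiedShare(example)); // (30 + 10) / 100 = 0.4
```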

I checked the code that is in the scope of the Apdex metric (tasks:refresh):

The code in the rules engine: it's convoluted and hard to follow, and there are several calls to the database, but I didn't find anything that is safe to reduce or avoid.
The code outside the rules engine:
- Some of the tiny improvements were done as part of the refactor in the contact details.
- However, the task refresh process runs twice after submitting a task form. It reacts both when the contact becomes "dirty" and when a doc in the db changes (the doc is a task, a report, or a contact). There is a 1-second wait that cancels the process if another one is triggered within that second. But even on my fast machine, 1 second doesn't seem to be enough, and the process still runs twice. Old devices will take longer than that. We can try to find a way to run the process only once.
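
To illustrate the 1-second wait described above, here is a minimal debounce sketch. It is not the actual code: `debounce`, `Scheduler`, and `timeoutScheduler` are hypothetical names. Triggers that arrive within the delay collapse into one run, while triggers further apart than the delay each cause their own run, which matches the double run observed here.

```typescript
// Hypothetical helper types: a scheduler runs `fn` after `delayMs`
// and returns a function that cancels the pending call.
type Cancel = () => void;
type Scheduler = (fn: () => void, delayMs: number) => Cancel;

// Default scheduler backed by setTimeout.
const timeoutScheduler: Scheduler = (fn, delayMs) => {
  const handle = setTimeout(fn, delayMs);
  return () => clearTimeout(handle);
};

// Returns a trigger function. Each trigger cancels the pending run (if any)
// and schedules a new one, so triggers closer together than `delayMs`
// collapse into a single run of `action`.
function debounce(
  action: () => void,
  delayMs: number,
  schedule: Scheduler = timeoutScheduler
): () => void {
  let cancelPending: Cancel | undefined;
  return () => {
    if (cancelPending) {
      cancelPending();
    }
    cancelPending = schedule(() => {
      cancelPending = undefined;
      action();
    }, delayMs);
  };
}

// Usage: both triggers fire within 1 second, so the refresh runs once.
let refreshCount = 0;
const triggerRefresh = debounce(() => {
  refreshCount += 1;
}, 1000);
triggerRefresh(); // e.g. the contact became dirty
triggerRefresh(); // e.g. a doc changed in the db
```

The scheduler is injectable only so the behavior can be checked without waiting on real timers.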

latin-panda commented 1 month ago

Following up on last week's update ("We can try to find a way to run the process once somehow"): running the process once won't necessarily improve the Apdex score, but it'd still be nice to have fewer processes running for the page.

From what I've found so far, the process needs to run twice. After a task is submitted, it needs to evaluate dirty contacts and reports and create a doc of type task; then it runs again to retrieve the updated list of tasks, this time taking the new task doc (or the new state of the task) into account in its internal evaluations.

Tech talk:

- After submitting a task, the change service detects a doc of type `data_record` and updates the emissions (`this.rulesEngineCore.updateEmissionsFor(subjectIds)`).
- After that's done, it fetches tasks (`this.rulesEngineCore.fetchTasksFor`), which ends up creating a doc of type task.
- The change service detects that new doc of type task and fetches the tasks again (`this.rulesEngineCore.fetchTasksFor`).
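
The sequence above can be sketched as a small simulation. Everything here is a stand-in: `FakeRulesEngine` only records which calls happened and is not the real rules engine, and `handleChanges` is a hypothetical simplification of the change listener. Only the method names `updateEmissionsFor` and `fetchTasksFor` and the doc types come from the description above.

```typescript
// Doc types mentioned above; everything else in this sketch is hypothetical.
type DocType = 'data_record' | 'task';

interface ChangedDoc {
  id: string;
  type: DocType;
}

// Stand-in for the rules engine: it only logs calls and, on the first
// fetch after an emissions update, "creates" a task doc.
class FakeRulesEngine {
  readonly calls: string[] = [];
  private hasPendingEmissions = false;

  updateEmissionsFor(subjectIds: string[]): void {
    this.calls.push(`updateEmissionsFor(${subjectIds.join(',')})`);
    this.hasPendingEmissions = true;
  }

  // Returns the task docs that this fetch wrote to the db.
  fetchTasksFor(): ChangedDoc[] {
    this.calls.push('fetchTasksFor');
    if (!this.hasPendingEmissions) {
      return [];
    }
    this.hasPendingEmissions = false;
    return [{ id: 'task-1', type: 'task' }];
  }
}

// Stand-in for the change listener: processes changes until none are left.
function handleChanges(engine: FakeRulesEngine, first: ChangedDoc, subjectId: string): void {
  const queue: ChangedDoc[] = [first];
  while (queue.length > 0) {
    const change = queue.shift() as ChangedDoc;
    if (change.type === 'data_record') {
      engine.updateEmissionsFor([subjectId]);
    }
    // Any task doc written by the fetch shows up as another change.
    queue.push(...engine.fetchTasksFor());
  }
}

const engine = new FakeRulesEngine();
handleChanges(engine, { id: 'report-1', type: 'data_record' }, 'contact-1');
console.log(engine.calls);
// ['updateEmissionsFor(contact-1)', 'fetchTasksFor', 'fetchTasksFor']
```

In this simplified model the second `fetchTasksFor` is a direct consequence of the task doc written by the first one, which is why the refresh runs twice.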

Three main things happen in the `tasks:refresh` code:

- Check that the rules engine is enabled: the duration is fine, 3 ms at most.
- Fetch task docs for all contacts: this gets the items to display in the list and is what takes the most time to complete.
- Fetch report subjects: this gets the info to display the breadcrumbs of each task. The duration is decent: 7 ms on a high-end device and 32 ms on a low-end device.

The "fetch task docs for all contacts" step is a process in the rules engine. It takes most of the time by far, specifically in these sub-processes:

- `refreshAndSave`: ~41% of the total time to fetch tasks
- `taskDataFor`: ~32%
- `tasksByRelation`: ~16%

This was calculated by using `performance.now()` in each of those functions.
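
A minimal sketch of how such a per-function breakdown can be collected with `performance.now()`. `StepTimer` is a hypothetical helper, not code from the repository, and it is synchronous for brevity. For functions that return promises, an async variant would await `fn()` before taking the second timestamp.

```typescript
// Hypothetical helper: accumulates elapsed milliseconds per label.
class StepTimer {
  private readonly totals = new Map<string, number>();
  private readonly now: () => number;

  // The clock is injectable so the helper can be checked with a fake clock.
  constructor(now: () => number = () => performance.now()) {
    this.now = now;
  }

  // Runs `fn`, adds its duration to the running total for `label`,
  // and returns whatever `fn` returned.
  measure<T>(label: string, fn: () => T): T {
    const start = this.now();
    try {
      return fn();
    } finally {
      const elapsed = this.now() - start;
      this.totals.set(label, (this.totals.get(label) ?? 0) + elapsed);
    }
  }

  // Share of `label` relative to `totalMs`, as a percentage.
  percentOf(label: string, totalMs: number): number {
    if (totalMs === 0) {
      return 0;
    }
    return ((this.totals.get(label) ?? 0) / totalMs) * 100;
  }
}

// Usage with placeholder work (not the real rules engine functions).
const timer = new StepTimer();
const overallStart = performance.now();
timer.measure('refreshAndSave', () => {
  /* placeholder for the real work */
});
timer.measure('taskDataFor', () => {
  /* placeholder for the real work */
});
const overallMs = performance.now() - overallStart;
console.log(timer.percentOf('refreshAndSave', overallMs));
```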

latin-panda commented 1 month ago

Some more details

Angular DevTools

Angular DevTools doesn't highlight any concerns in the code we track with `tasks:refresh`. The rules engine runs outside Angular's zone, i.e., outside Angular's change detection scope. Running code outside Angular's zone is supposed to improve performance. We are good on this one.

Rules engine

Looking at `refreshAndSave`, `taskDataFor`, and `tasksByRelation`, they seem to be doing what's needed:

After the user logs in (tasks:load)

The rules engine starts working and saves the state of contacts (dirty or not) and targets.

Remember that after login the initial replication happens and, while it gets all the data from the server, it can take a while to warm up, depending on the device specs and internet speed. The rules engine state is saved in the device database; initially it has no contacts mapped.

![Screenshot 2024-05-23 at 5 32 10 PM](https://github.com/medic/care-teams/assets/66472237/4fac009e-c035-42f5-ba48-9ebeec28838f)

After 2 mins (high-end device, fewer than 52 docs to download), the rules engine fetches 38 contacts, gets reports by subject, and gets tasks by contact where the id starts with `requester`:

![Screenshot 2024-05-23 at 5 38 09 PM](https://github.com/medic/care-teams/assets/66472237/01ef6951-8882-4538-9a61-ea5035e75f79)

Once it has some data, it updates the rules engine state doc to include updates of the targets and then the contacts:

![Screenshot 2024-05-23 at 5 39 38 PM](https://github.com/medic/care-teams/assets/66472237/29c86580-f729-415e-a17c-682dc36c21f9)

Lastly, it fetches tasks again, this time to get the `owner` ones:

![Screenshot 2024-05-23 at 5 41 41 PM](https://github.com/medic/care-teams/assets/66472237/5aa8d8a7-0016-4e03-ab45-606594a73688)

When the user navigates to the Tasks list and submits a task form (tasks:refresh)

The tasks:refresh process starts. The rules engine state doc (`_local/rulesStateStore`) is updated 3 times. This is the sequence of database interactions done by the rules engine:

💿 After submitting a task, the change service detects a doc of type `data_record` and updates the emissions (`rulesEngineCore.updateEmissionsFor`). It fetches contacts by reference; in this scenario, the ones related to the dirty contact:

![Screenshot 2024-05-23 at 6 05 51 PM](https://github.com/medic/care-teams/assets/66472237/ee926221-2189-42db-8ae7-8f8f63c38c22)

First save: The emissions in the `targetState` of the rules engine state doc are updated. This doc was saved fast (10 ms).

Notice that the `order` property is different. [Screenshot 2024-05-23 at 6 13 21 PM]

💿 At this point, the emissions have finished updating. Now the rules engine fetches tasks by contact to get the `owner` ones:

![Screenshot 2024-05-23 at 6 39 11 PM](https://github.com/medic/care-teams/assets/66472237/cfbf0058-8e70-4f74-931a-71bb7a0aad1e)

💿 The task list gets notified that there is a dirty contact and wants to refresh the list. It uses the rules engine to fetch tasks for all contacts. There's a query for tasks by `requester-(dirty contact id)` and then one to get the reports for that dirty contact:

![Screenshot 2024-05-23 at 6 41 41 PM](https://github.com/medic/care-teams/assets/66472237/77a6c86d-8f2b-44cd-9406-eb613a264cc9) ![Screenshot 2024-05-23 at 6 42 38 PM](https://github.com/medic/care-teams/assets/66472237/8c92a70b-23bc-447e-8d57-a17dc650f2ea)
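
As a rough illustration of the lookup keys used in these queries, the sketch below builds one key per contact id. `taskKeysFor` is a hypothetical helper, and the `<relation>-<contact id>` format is an assumption based on the `requester-(dirty contact id)` description above (the `owner` keys are assumed to follow the same pattern). It is not copied from the rules engine code.

```typescript
// Relations mentioned in this thread.
type TaskRelation = 'owner' | 'requester';

// Builds one lookup key per contact id, e.g. "requester-contact-1".
// The key format is an assumption, see the note above.
function taskKeysFor(relation: TaskRelation, contactIds: string[]): string[] {
  return contactIds.map((id) => `${relation}-${id}`);
}

console.log(taskKeysFor('requester', ['contact-1'])); // ['requester-contact-1']
console.log(taskKeysFor('owner', ['contact-1', 'contact-2']));
// ['owner-contact-1', 'owner-contact-2']
```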

🆗 Second and third save: The rules engine marks the contact as dirty and also updates the emissions, which later causes the rules engine to recalculate tasks.

Left is the before. [Screenshot 2024-05-23 at 5 57 16 PM]

![Screenshot 2024-05-23 at 6 57 14 PM](https://github.com/medic/care-teams/assets/66472237/caeceafd-6c84-462f-ae26-82c7222886db)

💿 Finally, the task list detects a new task doc and refreshes the list.

Yes, it's the same request as the last one:

![Screenshot 2024-05-23 at 6 58 25 PM](https://github.com/medic/care-teams/assets/66472237/460f2bf9-f866-4b08-af94-0135dcdbda46)

latin-panda commented 1 month ago

@michaelkohn @Benmuiruri I added a more technical comment. I hope it helps; I tried to explain it, but it will make more sense after looking at the code for a bit.

In short, I haven't found anything significant to improve the rules engine's performance. The code is tangled, which makes it a bit tricky to change, and changing it can open the door to bugs in tasks and targets.

latin-panda commented 1 week ago

@michaelkohn What do you recommend?

- We could try to reimagine tasks somehow.
- We can wait for the insights after @ralfudx gets the baseline. Maybe it's not as bad as the initial numbers from the team test party.

michaelkohn commented 1 week ago

I wouldn't want to start any work without getting the baseline, but I'm also curious if any other devs have any ideas/things to explore?
