contact_detail apdex: Define benchmarking scenario

michaelkohn commented 7 months ago

Which config(s) will be used
Which device(s)
How many docs / data distribution for the user
Detailed clickpath for the scenario

latin-panda commented 7 months ago

Initial data analysis I made related to this work

Configs

We will run performance testing on the configs we are familiar with, covering 3 major regions:

Kenya (East Africa)
Nepal (Asia)
Togo (West Africa)

Devices

The phone used by the Kenya deployment:

Brand: Neon Ray Ultra by Safaricom
RAM: 2GB - 1.2GB in use all the time.
Processor: octa-core
Android: 13
Storage: 22GB - 8GB in use
Battery: 3750mAh
CHT-Android: 1.2

User type

offline CHW user

Data

For CHWs

1 CHW Area
100 households
6 members in each household
5 reports per member
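
To give a sense of scale, the distribution above can be multiplied out. This is only a rough sketch: the class and field names are made up for illustration, and it counts only the documents listed here (not user, settings, or task documents, which a real CHW user would also have).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChwDataset:
    """Hypothetical container for the per-CHW data distribution in this ticket."""
    areas: int = 1
    households_per_area: int = 100
    members_per_household: int = 6
    reports_per_member: int = 5

    @property
    def households(self) -> int:
        return self.areas * self.households_per_area

    @property
    def members(self) -> int:
        return self.households * self.members_per_household

    @property
    def reports(self) -> int:
        return self.members * self.reports_per_member

    @property
    def total_docs(self) -> int:
        # Only the contact and report docs named above are counted.
        return self.areas + self.households + self.members + self.reports


dataset = ChwDataset()
print(dataset.households, dataset.members, dataset.reports, dataset.total_docs)
```

With these numbers, one CHW ends up with 100 households, 600 members, and 3000 reports, or 3701 documents in total.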

Iterations

We will run the automated tests with at least 1 user and simulate 5 days, running the suite 10 times each day.

Clickpath

As an offline user, access the application and do the following navigation for each contact type:

CHW Area:
- Tap on the Contacts tab, and wait for the content to load
- Tap on the CHW Area that has at least 100 households, and wait for all content to load
Household:
- Tap on the Contacts tab
- Tap on a household with at least 6 members, and wait for all content to load
Patient:
- Tap on the Contacts tab
- Tap on a household with at least 6 members, and wait for all content to load
- Tap on a female patient with at least 5 reports and 3 tasks, and wait for all content to load

At the end of the tests, advance 1 day and sync the app to generate the telemetry records.
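
As a rough illustration of how the clickpaths and the iteration plan combine, the sketch below lists the steps per contact type and enumerates one entry per (day, run, contact type). The names `CLICKPATHS` and `build_schedule` are hypothetical and not part of the existing test suite. The count is of scenario executions, not telemetry records.

```python
from itertools import product

DAYS = 5           # simulated days, from the Iterations section
RUNS_PER_DAY = 10  # suite runs per simulated day

# Steps per contact type, copied from the clickpath description.
CLICKPATHS = {
    "CHW Area": [
        "Tap on the Contacts tab, and wait for the content to load",
        "Tap on the CHW Area that has at least 100 households, and wait for all content to load",
    ],
    "Household": [
        "Tap on the Contacts tab",
        "Tap on a household with at least 6 members, and wait for all content to load",
    ],
    "Patient": [
        "Tap on the Contacts tab",
        "Tap on a household with at least 6 members, and wait for all content to load",
        "Tap on a female patient with at least 5 reports and 3 tasks, and wait for all content to load",
    ],
}


def build_schedule(days: int = DAYS, runs_per_day: int = RUNS_PER_DAY):
    """Return one (day, run, contact_type) tuple per scenario execution."""
    return [
        (day, run, contact_type)
        for day, run, contact_type in product(
            range(1, days + 1), range(1, runs_per_day + 1), CLICKPATHS
        )
    ]


schedule = build_schedule()
print(len(schedule))  # scenario executions for a single user
```

For a single user this gives 5 x 10 x 3 = 150 scenario executions.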

latin-panda commented 7 months ago

@michaelkohn I still have to do item 4, the detailed clickpath for the scenario, but you can have a look and let me know your thoughts

The Nepal data you see there is what I could see from Raphael's test in our test instance for that config. I haven't properly analyzed that full data (the other ticket works) because it wasn't syncing to PostgreSQL, so I added the question marks.

michaelkohn commented 7 months ago

Thanks @latin-panda, great start. I don't know what this will all look like when we get to the point of publishing it, but this is really useful in starting to shake out all the details 🙏🏼

Config

It might be the case that the apdex score is already above .94 for some configs and not for others, that's totally fine. We don't need our baseline to represent every config scenario. Obviously it's most impactful if it helps more people, but it's also impactful if it just helps 1 big project.
When identifying the baseline config, I'd imagine we'll want to be able to have a snapshot of it saved with our tests so that it can be recreated even if we end up changing the config in the future
I don't think we need to document every nuance of the baseline config, but I do think it would be useful to briefly describe it in words to include some high level details like "contact summary has 5 fields on it, there are 2 condition cards", etc...

Devices

I'd imagine we'll want to provide the specs / versions of the testing phone

User type

During our call today, I mentioned that the "Supervisor" user should also be offline. I don't think the current test suite includes supervisors anyway and I'm OK with not including them for now... you mentioned it as a stretch goal which is fine, I'm also OK with removing it for now.

Data

We'll want to know how many reports there are per contact

Clickpath scenario

I imagine these will just be very simple like "Tap on a household from Contact Page List View" or "Tap on person from Contact Page Contact Detail View (Household)"
And like you have set out in the current performance duration section, I think we should evaluate each level separately. For example (using a typical CHW Area/Household/Person hierarchy), baseline for... CHW Area, Contact of CHW Area (this is probably just the user's contact/self), Household, Contact of the Household.

Data

I think we should be focusing on the apdex score for the given scenario (CHW Area, Contact of CHW Area, Household, Contact of Household). It's useful to see the counts because it gives a sense of scale to the tests but...
Apdex treats tolerable and frustrated differently in the calculation, so we need to differentiate tolerable from frustrated. For example... your apdex score will be different if you have 5 tolerable and 0 frustrated vs. 0 tolerable and 5 frustrated.
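
To make the difference concrete, here is a minimal sketch of the standard Apdex formula, (satisfied + tolerating / 2) / total, where "tolerating" is the Apdex term for what is called tolerable above. The function names are made up for illustration. The 5 satisfied samples in the example are an assumption added so the totals are comparable. `classify` uses the usual Apdex thresholds (satisfied up to T, tolerating up to 4T), with the target T left as a parameter because this ticket does not fix one.

```python
def apdex(satisfied: int, tolerating: int, frustrated: int) -> float:
    """Standard Apdex score: tolerating samples count half, frustrated count zero."""
    total = satisfied + tolerating + frustrated
    if total == 0:
        raise ValueError("Apdex is undefined without samples")
    return (satisfied + tolerating / 2) / total


def classify(duration_ms: float, target_ms: float) -> str:
    """Bucket one load time using the usual Apdex thresholds (T and 4T)."""
    if duration_ms <= target_ms:
        return "satisfied"
    if duration_ms <= 4 * target_ms:
        return "tolerating"
    return "frustrated"


# Same number of non-satisfied samples, very different scores.
print(apdex(satisfied=5, tolerating=5, frustrated=0))  # 0.75
print(apdex(satisfied=5, tolerating=0, frustrated=5))  # 0.5
```

This is why the results need the tolerable and frustrated counts reported separately rather than as a single "not satisfied" number.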

latin-panda commented 7 months ago

I'd imagine we'll want to be able to have a snapshot of it saved with our tests so that it can be recreated even if we end up changing the config in the future

@michaelkohn Do you mean like the Apdex result so we can compare later?

latin-panda commented 7 months ago

@michaelkohn I have updated the info based on your comments. Do you think something is missing? If we are good, I will close the ticket as completed.

The Togo info will have to wait until next week, when the tests are ready.

michaelkohn commented 7 months ago

I'd imagine we'll want to be able to have a snapshot of it saved with our tests so that it can be recreated even if we end up changing the config in the future

@michaelkohn Do you mean like the Apdex result so we can compare later?

Nope, i meant keeping a snapshot of the actual config from when the tests were executed... I wouldn't expect it to change in the short term, but I can imagine in the future we may need to change the config to either accommodate some cht-core change or an important change to a project's config.

Do you think something is missing?

Performance baseline

I actually think we should remove this section since it is covered in https://github.com/medic/care-teams/issues/69.

Clickpath

Thinking about this some more, do you think it's useful to add details like "load a contacts list that has 100 hh's in it, select the 5th household"....

latin-panda commented 7 months ago

Nope, i meant keeping a snapshot of the actual config from when the tests were executed.

Yes, we have a snapshot of every config in the Fast UI folder.

I've applied the last feedback and closed this as complete.
