[HOLD for payment 2024-11-05] [HOLD for payment 2024-10-29] [Performance] Typing in search is very laggy for big accounts

hannojg commented 2 months ago

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!

What performance issue do we need to solve?

On a very big account typing in search for the first time is incredibly laggy, see recording. The user is continuously trying to type but the application is dropping so many frames that the text is only updated in jumps:

https://github.com/user-attachments/assets/ed7e76b6-fcce-4dd2-83ae-de5029baab09

What is the impact of this on end-users?

Very slow and frustrating search experience. Borderline functional.

List any benchmarks that show the severity of the issue

The customer shared a profile trace with us:

Firefox 2024-07-25 10.42 profile.json.gz

(note: the trace also contains other test cases as well)

The customer had ~15k reports loaded in onyx when these tests were conducted, although in focus mode only 6 chats were displayed.

Proposed solution (if any)

None yet, I will go through the profile and see what can be optimised, what exactly caused those lags.

List any benchmarks after implementing the changes to show impacts of the proposed solution (if any)

not available yet

Platforms:

Which of our officially supported platforms is this issue occurring on?

[ ] Android: Native
[ ] Android: mWeb Chrome
[ ] iOS: Native
[ ] iOS: mWeb Safari
[x] MacOS: Chrome / Safari
[ ] MacOS: Desktop

Version Number: v9.0.11-5 Reproducible in staging?: not tested Reproducible in production?: yes Email or phone of affected tester (no customers): customer Logs: See performance file Notes/Photos/Videos: See attached video Expensify/Expensify Issue URL: n/a Issue reported by:@hannojg Slack conversation:https://expensify.slack.com/archives/C05LX9D6E07/p1721919928992729

View all open jobs on Upwork

Issue Owner

Current Issue Owner: @sakluger

melvin-bot[bot] commented 2 months ago

Auto-assigning issues to engineers is no longer supported. If you think this issue should receive engineering attention, please raise it in #whatsnext.

hannojg commented 2 months ago

cc @sakluger (feel free to assign me as I (or someone from my team) will work on this ticket!)

hannojg commented 2 months ago

Screenshot 2024-08-05 at 13 17 46

Just started working on this issue. I can see that OptionListUtils.filterOptions is particularly slow. The reduceRight function (which is called as of filterOptions) took 10s to complete for the customer.

I can kinda reproduce the lag if I enable lower performance settings in chrome (basically bringing my Cpu closer to the one the customer had). Working on ideas for improving the performance here!

hannojg commented 2 months ago

Created a PR that adds performance timings, so we can track upcoming performance improvements better:

https://github.com/Expensify/App/pull/46807

hannojg commented 2 months ago

Okay, I have a proposal for a solution moving forward:

Highlevel solution: use a trie structure for search

Right now our search data structure + search algorithm aren't really efficient. We loop over every item performing multiple comparison operations.

We can make this way more efficient by using a trie structure and prefix search.

Performance gains

I ran a test with the customers onyx data, right now only focusing on searching through the personal details.

Device: Samsung S21
Personal detail count: 16 039
Only including personal details (will optimise report search later by same solution)
Running 10 test iterations

	Before: `OptionsListUtils.getSearchOptions`	✨ After: using trie structure
mean	299.97ms	0.49ms
stdev	6.5ms	0.36ms

So, right now this user on mobile will have a dead JS thread for ~300ms when typing in the search (dropping 18 frames), while with a trie, its just ~0.5ms, so no frames dropped really. Thats ~600x faster. For the customer where before this part of the function took 10s, this would only take ~16ms now approximately.

Tests were conducted here on this branch

Changes needed / roadmap

We need to implement a trie structure that supports searching for substrings as well (our current trie implementation doesn't support that)
We need to calculate the trie beforehand (it takes a bit time and space to calculate it)
We need to calculate two tries I think, one for reports, one for personal details
We might want to revisit our chain of getting search results, which currently is:
- const options = OptionsListUtils.createOptionList(personalDetails, reports)
- const searchOptions = OptionsListUtils.getSearchOptions(options, '', betas)
- const filteredOptions = OptionsListUtils.filterOptions(searchOptions, searchValue)
- ^ This can maybe be optimised into less steps, or getSearchOptions would be replaced with creating the search trie
Wire everything together

Caveats

Right now building the trie takes some time and a fair amount of memory. I want to experiment a bit more using other DS such as generalised suffix tree or suffix arrays, and see which one works best!

melvin-bot[bot] commented 2 months ago

Reviewing label has been removed, please complete the "BugZero Checklist".

melvin-bot[bot] commented 2 months ago

The solution for this issue has been :rocket: deployed to production :rocket: in version 9.0.17-2 and is now subject to a 7-day regression period :calendar:. Here is the list of pull requests that resolve this issue:

https://github.com/Expensify/App/pull/46807

If no regressions arise, payment will be issued on 2024-08-14. :confetti_ball:

For reference, here are some details about the assignees on this issue:

@hannojg does not require payment (Contractor)

hannojg commented 2 months ago

Started a discussion here how we want to move forward technically:

https://expensify.slack.com/archives/C05LX9D6E07/p1723209631007999

sakluger commented 2 months ago

No payment required for the above PR:

sakluger commented 2 months ago

We're still discussing the best way forward in that Slack thread.

hannojg commented 2 months ago

I just opened a PR hat shows a PoC for using web workers for running the search, which will unblock the main thread:

https://github.com/Expensify/App/pull/47432

I am now seeking approval on this new technical design here.

(Note: I am OOO for the rest of the week and will continue next week Monday 19.08)

sakluger commented 2 months ago

Moving to weekly to make Melvin happy, we can move it back next week.

sakluger commented 2 months ago

@hannojg any updates on this one?

hannojg commented 2 months ago

It has been discussed here that our first proposal to offload the work to a different thread has been postponed (as we want to stay single threaded as long as possible).

We are now working on a new proposal to use a more efficient data structure and a very fast search algorithm. Will post the proposal and a PoC today/trmrw.

sakluger commented 1 month ago

I asked for updates in the main tracking issue: https://github.com/Expensify/App/issues/46595

sakluger commented 1 month ago

Looks like we have a WIP PR that will speed up search by 2456x.

marcaaron commented 1 month ago

🤯 that's sick!

sakluger commented 1 month ago

Looks like the draft PR is coming along! I see commits and reviews.

hannojg commented 1 month ago

I am back from OOO and working on it - expect to have a reviewable PR by tmrw!

hannojg commented 1 month ago

Status update:

I made good progress on the PR, its cleaned up, has tests and documentation and is very close to be completed
There are two left todos:
- One reassure test fails currently, indicating i introduced more renders than previously - i believe this to be a mistake, have to investigate
- I found one edge case where the search function is currently returning not all results, investigating

Future Improvements:

Building the tree can take some time on lower end devices (taking in millisecond range here - will get definite numbers soon). As an optimisation we could store the search tree in mmkv using array buffers. I believe this to be a really good optimisation for lower end mobile devices - but i don't want to convolute the current PR and will open a follow up ticket. (Additionally i am trying to see how the construction of the tree can be further improved)

TL;DR: Very positive to have the PR ready and tested by TMRW!

hannojg commented 1 month ago

Status update:

We weren't happy with the build time performance of the tree on android devices and spend time optimising it
Main finding is that ArrayBuffers are way faster than regular arrays, so we migrated the tree structure to that (that would also enable us in the future to potentially share the array buffers from another thread if we want to go multi threaded)
Build time in heavy load scenario improved from ~600ms to ~90ms (which is work we only have to do once)
I am working on a revised version with array buffers here, will update the main PR tmrw and finish it off!

hannojg commented 1 month ago

PR is ready for review now!

Whats missing in the current PR is to save the created "FastSearch" (tree) instance to a context, so it can be reused (so that we don't need to recreate the tree from scratch every time our personal details / reports. change). I figured that this would add another 200-400 lines of code and I wanted to make the lives of people reviewing not too hard - I will open an issue and do that as a follow up.

sakluger commented 1 month ago

This is exciting, thanks for the updates @hannojg! I can't wait to see these improvements go live, it'll be a big win.

marcaaron commented 3 weeks ago

Nice. Just catching up after OOO, but looks really impressive overall!

sakluger commented 1 week ago

The PR has been merged!!!!!!

hannojg commented 1 week ago

Awesome! I will open up a few follow up issues, as the work with that PR isn't done yet. There are some optimisations which we are still missing

hannojg commented 1 week ago

One regression was reported which i am going to look at asap today:

https://github.com/Expensify/App/issues/51123

I have created a list of follow up issues to make the new search implementation rock solid:

melvin-bot[bot] commented 1 week ago

⚠️ Looks like this issue was linked to a Deploy Blocker here

If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.

If a regression has occurred and you are the assigned CM follow the instructions here.

If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.

ahmedGaber93 commented 1 week ago

@sakluger Can you assign me as C+ as I reviewed the PR?

ahmedGaber93 commented 1 week ago

The PR https://github.com/Expensify/App/pull/48652 is reverted here https://github.com/Expensify/App/pull/51177 due to this regression https://github.com/Expensify/App/issues/51175.

I think this not a big regression, and we intentionally skip whitespace chars in this role https://github.com/Expensify/App/pull/48652/files#diff-a919e8da171bd4fc998da5203188843b1e0587eb3fb2c809e9f235ce54cdf8cbR19, but I think we need to allow it as it affected on this case https://github.com/Expensify/App/issues/51175#issuecomment-2426694744

CC @hannojg

melvin-bot[bot] commented 1 week ago

Reviewing label has been removed, please complete the "BugZero Checklist".

melvin-bot[bot] commented 1 week ago

The solution for this issue has been :rocket: deployed to production :rocket: in version 9.0.51-4 and is now subject to a 7-day regression period :calendar:. Here is the list of pull requests that resolve this issue:

https://github.com/Expensify/App/pull/48652

If no regressions arise, payment will be issued on 2024-10-29. :confetti_ball:

For reference, here are some details about the assignees on this issue:

@hannojg does not require payment (Contractor)

hannojg commented 6 days ago

Update: It seems that the PR has been reverted.

This PR is the second attempt to merge:

https://github.com/Expensify/App/pull/51332

as the original PR has been reverted for this issue:

https://github.com/Expensify/App/issues/51175

I believe the revert to be a mistake, as the functionality to search chats by their participants had earlier been removed here (please correct me if i am wrong):

https://github.com/Expensify/App/pull/47229

marcaaron commented 5 days ago

Yes, I think that's correct. Something about that seemed weird to me...

There was also another deploy blocker, but seems like people had trouble reproducing that one.

marcaaron commented 5 days ago

Merged it.

melvin-bot[bot] commented 3 days ago

⚠️ Looks like this issue was linked to a Deploy Blocker here

If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.

If a regression has occurred and you are the assigned CM follow the instructions here.

If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.

hannojg commented 1 day ago

I don't think this PR caused the deploy blocker + i can't reproduce the linked issue.

melvin-bot[bot] commented 19 hours ago

⚠️ Looks like this issue was linked to a Deploy Blocker here

If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.

If a regression has occurred and you are the assigned CM follow the instructions here.

If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.

sakluger commented 14 hours ago

I'm a bit confused about the status of the PR as well as how to handle payment.

The original PR (https://github.com/Expensify/App/pull/48652) was merged, then reverted, then we reverted the revert. So that means that the changes are in fact live?

I see @ahmedGaber93 and @allgandalf reviewed the PR. Do they both require payment?

And for those payments, should we decrease payment because of the regressions? Or is it unclear what caused those regressions?

To summarize, I'd love it someone could help me figure out who needs to be paid and how much 😄

ahmedGaber93 commented 14 hours ago

Currently, the PR was reverted again due to this regression https://github.com/Expensify/App/issues/51584. No payment is needed into this moment.

melvin-bot[bot] commented 11 hours ago

⚠️ Looks like this issue was linked to a Deploy Blocker here

If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.

If a regression has occurred and you are the assigned CM follow the instructions here.

If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.

melvin-bot[bot] commented 5 hours ago

Reviewing label has been removed, please complete the "BugZero Checklist".

melvin-bot[bot] commented 5 hours ago

The solution for this issue has been :rocket: deployed to production :rocket: in version 9.0.54-11 and is now subject to a 7-day regression period :calendar:. Here is the list of pull requests that resolve this issue:

https://github.com/Expensify/App/pull/51332

If no regressions arise, payment will be issued on 2024-11-05. :confetti_ball:

For reference, here are some details about the assignees on this issue:

@hannojg does not require payment (Contractor)
@ahmedGaber93 requires payment (Needs manual offer from BZ)

Expensify / App