Open hannojg opened 2 months ago
Auto-assigning issues to engineers is no longer supported. If you think this issue should receive engineering attention, please raise it in #whatsnext.
cc @sakluger (feel free to assign me as I (or someone from my team) will work on this ticket!)
Just started working on this issue. I can see that OptionListUtils.filterOptions
is particularly slow. The reduceRight
function (which is called as of filterOptions
) took 10s to complete for the customer.
I can kinda reproduce the lag if I enable lower performance settings in chrome (basically bringing my Cpu closer to the one the customer had). Working on ideas for improving the performance here!
Created a PR that adds performance timings, so we can track upcoming performance improvements better:
Okay, I have a proposal for a solution moving forward:
Right now our search data structure + search algorithm aren't really efficient. We loop over every item performing multiple comparison operations.
We can make this way more efficient by using a trie structure and prefix search.
I ran a test with the customers onyx data, right now only focusing on searching through the personal details.
Before: OptionsListUtils.getSearchOptions |
✨ After: using trie structure | |
---|---|---|
mean | 299.97ms | 0.49ms |
stdev | 6.5ms | 0.36ms |
So, right now this user on mobile will have a dead JS thread for ~300ms when typing in the search (dropping 18 frames), while with a trie, its just ~0.5ms, so no frames dropped really. Thats ~600x faster. For the customer where before this part of the function took 10s, this would only take ~16ms now approximately.
Tests were conducted here on this branch
const options = OptionsListUtils.createOptionList(personalDetails, reports)
const searchOptions = OptionsListUtils.getSearchOptions(options, '', betas)
const filteredOptions = OptionsListUtils.filterOptions(searchOptions, searchValue)
getSearchOptions
would be replaced with creating the search trieRight now building the trie takes some time and a fair amount of memory. I want to experiment a bit more using other DS such as generalised suffix tree or suffix arrays, and see which one works best!
Reviewing
label has been removed, please complete the "BugZero Checklist".
The solution for this issue has been :rocket: deployed to production :rocket: in version 9.0.17-2 and is now subject to a 7-day regression period :calendar:. Here is the list of pull requests that resolve this issue:
If no regressions arise, payment will be issued on 2024-08-14. :confetti_ball:
For reference, here are some details about the assignees on this issue:
Started a discussion here how we want to move forward technically:
https://expensify.slack.com/archives/C05LX9D6E07/p1723209631007999
We're still discussing the best way forward in that Slack thread.
I just opened a PR hat shows a PoC for using web workers for running the search, which will unblock the main thread:
I am now seeking approval on this new technical design here.
(Note: I am OOO for the rest of the week and will continue next week Monday 19.08)
Moving to weekly to make Melvin happy, we can move it back next week.
@hannojg any updates on this one?
It has been discussed here that our first proposal to offload the work to a different thread has been postponed (as we want to stay single threaded as long as possible).
We are now working on a new proposal to use a more efficient data structure and a very fast search algorithm. Will post the proposal and a PoC today/trmrw.
I asked for updates in the main tracking issue: https://github.com/Expensify/App/issues/46595
Looks like we have a WIP PR that will speed up search by 2456x.
🤯 that's sick!
Looks like the draft PR is coming along! I see commits and reviews.
I am back from OOO and working on it - expect to have a reviewable PR by tmrw!
Status update:
Future Improvements:
Building the tree can take some time on lower end devices (taking in millisecond range here - will get definite numbers soon). As an optimisation we could store the search tree in mmkv using array buffers. I believe this to be a really good optimisation for lower end mobile devices - but i don't want to convolute the current PR and will open a follow up ticket. (Additionally i am trying to see how the construction of the tree can be further improved)
TL;DR: Very positive to have the PR ready and tested by TMRW!
Status update:
600ms
to ~90ms
(which is work we only have to do once)PR is ready for review now!
Whats missing in the current PR is to save the created "FastSearch" (tree) instance to a context, so it can be reused (so that we don't need to recreate the tree from scratch every time our personal details / reports. change). I figured that this would add another 200-400 lines of code and I wanted to make the lives of people reviewing not too hard - I will open an issue and do that as a follow up.
This is exciting, thanks for the updates @hannojg! I can't wait to see these improvements go live, it'll be a big win.
Nice. Just catching up after OOO, but looks really impressive overall!
The PR has been merged!!!!!!
Awesome! I will open up a few follow up issues, as the work with that PR isn't done yet. There are some optimisations which we are still missing
One regression was reported which i am going to look at asap today:
I have created a list of follow up issues to make the new search implementation rock solid:
⚠️ Looks like this issue was linked to a Deploy Blocker here
If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.
If a regression has occurred and you are the assigned CM follow the instructions here.
If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.
@sakluger Can you assign me as C+ as I reviewed the PR?
The PR https://github.com/Expensify/App/pull/48652 is reverted here https://github.com/Expensify/App/pull/51177 due to this regression https://github.com/Expensify/App/issues/51175.
I think this not a big regression, and we intentionally skip whitespace chars in this role https://github.com/Expensify/App/pull/48652/files#diff-a919e8da171bd4fc998da5203188843b1e0587eb3fb2c809e9f235ce54cdf8cbR19, but I think we need to allow it as it affected on this case https://github.com/Expensify/App/issues/51175#issuecomment-2426694744
CC @hannojg
Reviewing
label has been removed, please complete the "BugZero Checklist".
The solution for this issue has been :rocket: deployed to production :rocket: in version 9.0.51-4 and is now subject to a 7-day regression period :calendar:. Here is the list of pull requests that resolve this issue:
If no regressions arise, payment will be issued on 2024-10-29. :confetti_ball:
For reference, here are some details about the assignees on this issue:
Update: It seems that the PR has been reverted.
This PR is the second attempt to merge:
as the original PR has been reverted for this issue:
I believe the revert to be a mistake, as the functionality to search chats by their participants had earlier been removed here (please correct me if i am wrong):
Yes, I think that's correct. Something about that seemed weird to me...
There was also another deploy blocker, but seems like people had trouble reproducing that one.
Merged it.
⚠️ Looks like this issue was linked to a Deploy Blocker here
If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.
If a regression has occurred and you are the assigned CM follow the instructions here.
If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.
I don't think this PR caused the deploy blocker + i can't reproduce the linked issue.
⚠️ Looks like this issue was linked to a Deploy Blocker here
If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.
If a regression has occurred and you are the assigned CM follow the instructions here.
If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.
I'm a bit confused about the status of the PR as well as how to handle payment.
The original PR (https://github.com/Expensify/App/pull/48652) was merged, then reverted, then we reverted the revert. So that means that the changes are in fact live?
I see @ahmedGaber93 and @allgandalf reviewed the PR. Do they both require payment?
And for those payments, should we decrease payment because of the regressions? Or is it unclear what caused those regressions?
To summarize, I'd love it someone could help me figure out who needs to be paid and how much 😄
Currently, the PR was reverted again due to this regression https://github.com/Expensify/App/issues/51584. No payment is needed into this moment.
⚠️ Looks like this issue was linked to a Deploy Blocker here
If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.
If a regression has occurred and you are the assigned CM follow the instructions here.
If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.
Reviewing
label has been removed, please complete the "BugZero Checklist".
The solution for this issue has been :rocket: deployed to production :rocket: in version 9.0.54-11 and is now subject to a 7-day regression period :calendar:. Here is the list of pull requests that resolve this issue:
If no regressions arise, payment will be issued on 2024-11-05. :confetti_ball:
For reference, here are some details about the assignees on this issue:
If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!
What performance issue do we need to solve?
On a very big account typing in search for the first time is incredibly laggy, see recording. The user is continuously trying to type but the application is dropping so many frames that the text is only updated in jumps:
https://github.com/user-attachments/assets/ed7e76b6-fcce-4dd2-83ae-de5029baab09
What is the impact of this on end-users?
Very slow and frustrating search experience. Borderline functional.
List any benchmarks that show the severity of the issue
The customer shared a profile trace with us:
Firefox 2024-07-25 10.42 profile.json.gz
(note: the trace also contains other test cases as well)
The customer had ~15k reports loaded in onyx when these tests were conducted, although in focus mode only 6 chats were displayed.
Proposed solution (if any)
None yet, I will go through the profile and see what can be optimised, what exactly caused those lags.
List any benchmarks after implementing the changes to show impacts of the proposed solution (if any)
not available yet
Platforms:
Which of our officially supported platforms is this issue occurring on?
Version Number: v9.0.11-5 Reproducible in staging?: not tested Reproducible in production?: yes Email or phone of affected tester (no customers): customer Logs: See performance file Notes/Photos/Videos: See attached video Expensify/Expensify Issue URL: n/a Issue reported by:@hannojg Slack conversation:https://expensify.slack.com/archives/C05LX9D6E07/p1721919928992729
View all open jobs on Upwork
Issue Owner
Current Issue Owner: @sakluger