Expensify / App

Welcome to New Expensify: a complete re-imagination of financial collaboration, centered around chat. Help us build the next generation of Expensify by sharing feedback and contributing to the code.
https://new.expensify.com
MIT License
3.51k stars 2.86k forks source link

[Performance] Get&returnValueList=chatList takes a long time for users with many reports #7941

Closed marcaaron closed 2 years ago

marcaaron commented 2 years ago

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!


What performance issue do we need to solve?

This is a backend issue that disproportionately affects users with large numbers of reports.

What is the impact of this on end-users?

TTFB for Get&returnValueList=chatList is pretty bad which can lead to very slow initializing of the app.

List any benchmarks that show the severity of the issue

Log search:

David getting chatList - 1528ms - Logs Marc getting chatList - 4ms - Logs

Proposed solution (if any)

The query in question is here. My guess would be that David has a very large number of reports shared to them compared to myself and that searching through the sharedReports is what slows this down.

List any benchmarks after implementing the changes to show impacts of the proposed solution (if any)

We can time these queries after making any improvements.

Platform:

Where is this issue occurring?

Version Number: Logs: https://stackoverflow.com/c/expensify/questions/4856 Notes/Photos/Videos: Any additional supporting documentation Expensify/Expensify Issue URL:

View all open jobs on Upwork

MelvinBot commented 2 years ago

Triggered auto assignment to @nkuoch (Engineering), see https://stackoverflow.com/c/expensify/questions/4319 for more details.

mallenexpensify commented 2 years ago

I re-added Engineering to assign someone so that it didn't get lost in the pool @nkuoch feel free to return to the pool if ya need/want

nkuoch commented 2 years ago

Yes sorry, my plate is full, I don't have time for it.

thienlnam commented 2 years ago

This is an interesting one, though on worse case with David's amount of reports it takes ~1.5 seconds which is slow but shouldn't be causing a huge delay in load time.

I checked out the query and it doesn't seem like there is a ton we could do to speed it up besides maybe adding a view and another index. Another idea is maybe we can paginate these results?

quinthar commented 2 years ago

Can you link to the query here?

thienlnam commented 2 years ago

Can you link to the query here?

It's in the issue description but here it is

flodnv commented 2 years ago

We are discussing this here: https://expensify.slack.com/archives/C03TQ48KC/p1646931885173749

danieldoglas commented 2 years ago

@johnmlee101 sent an email to SQLite team to check one of the hypothesis we think might help.

danieldoglas commented 2 years ago

@marcaaron Just adding to the discussion that this might also be related to the slow while initializing the app.

marcaaron commented 2 years ago

Hey @danieldoglas, mind summarizing what aspect of that discussion relates to to slowness? Didn't quite understand, thanks!

quinthar commented 2 years ago

That sounds like a pretty serious amount of maintenance, what would you think of taking a server down, applying this change and then timing the queries? It seems like we can run both queries back to back a few times and verify

On Fri, Mar 11, 2022, 12:42 PM Marc Glasser @.***> wrote:

Hey @danieldoglas https://github.com/danieldoglas, mind summarizing what aspect of that discussion relates to to slowness? Didn't quite understand, thanks!

— Reply to this email directly, view it on GitHub https://github.com/Expensify/App/issues/7941#issuecomment-1065507266, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEMNUXXZBS2MDIPXI6KFCTU7OVZTANCNFSM5PSFCAOQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you commented.Message ID: @.***>

danieldoglas commented 2 years ago

@marcaaron , we're discussing that in the open-source channel. https://expensify.slack.com/archives/C01GTK53T8Q/p1647032346220759

But that's mainly from the App perspective, not from Auth only. I'll post a summary when we reach a mature place in the discussion.

danieldoglas commented 2 years ago

Will change this to monthly since we're treating it internally.

danieldoglas commented 2 years ago

@marcaaron , I think we should close this considering we have our plans for the new API?

marcaaron commented 2 years ago

Oh, yes? Which part of the new API plans gives us reason to close this?

danieldoglas commented 2 years ago

Correct me if I'm wrong, but we're going to rewrite the methods to try to keep it 1 JS call to 1 PHP to 1 Auth. That being considered, we will rework this whole method, right?

marcaaron commented 2 years ago

Ah yes, but this method is just calling Auth directly via PHP. All the logic is in Auth.

danieldoglas commented 2 years ago

The place where we use this method (chatList) will actually have to add the chat details and IOU with it, or else we won't fulfill the premise of 1:1 call to PHP. So this method will be fully rebuilt to fulfill the 1:1 premise.

If we decide that we will just move from JS to PHP to do the N calls we currently do in the app, then we won't actually change the Auth code for this method, but in this specific case I think that will not be the best option.

marcaaron commented 2 years ago

Ok I am getting it now, sorry 😄

restating:

If we are combining the various API requests we must fetch to populate the reports/sidebar in NewDot then it will be better to do some kind of complete refactor where we get rid of Get&rvl=chatList entirely and have it all happen in Auth. And we should do any optimization when we do that work.

Sound right?

marcaaron commented 2 years ago

Sounds kind of like what @cead22 is looking into so maybe this can be merged into that other issue

https://github.com/Expensify/Expensify/issues/206455#issuecomment-1112472297

danieldoglas commented 2 years ago

Sorry, I think I was not clear enough in the beginning of this. Yep, that restated version is perfect. :)

cead22 commented 2 years ago

Works for me, what I'm trying to do first is replace this code, with a call to a new API command Get_Chats that returns all the info necessary.

iwiznia commented 2 years ago

keep it 1 JS call to 1 PHP to 1 Auth

To clarify, we decided to only focus on 1:1 calls from JS to PHP. It remains to be seen if having 1:1 calls from PHP to auth is necessary or even desired (I think in several cases, it is not desired, especially for when we are doing a write following by one or more reads, but even in other scenarios, since we don't gain much from unifying calls to auth from PHP).

flodnv commented 2 years ago

I agree that on the API side, in general we should aim for:

  1. 1 PHP call = 1 Auth call
  2. If that's really challenging, 1 PHP call = 1 Write Auth call + N read calls
  3. Under no circumstance should we aim for 1 PHP call = N Write Auth calls
iwiznia commented 2 years ago

I agree with that. Having said that, not sure when/if we will re-write existing calls that do not follow that.

danieldoglas commented 2 years ago

So, that being considered, should we close this or leave it to work on it after the 1:1 api calls?

iwiznia commented 2 years ago

It's already here https://github.com/Expensify/Expensify/issues/211669 so maybe we close

marcaaron commented 2 years ago

Putting this issue on HOLD makes sense as it relates to performance. Refactoring the API to send the correct Onyx data will probably go through changes like:

  1. @cead22's nice refactoring around fetchAllReports() happens
  2. Next, we take the resulting command and do the Optimized API doc changes to get a new API command

After that's done we can look into improving the performance?

danieldoglas commented 2 years ago

This is on hold until after refactoring fetchAllReports.

danieldoglas commented 2 years ago

I think #fireroom-08-02-22 will also tackle this, since there was some discussion about pagination

danieldoglas commented 2 years ago

@cead22 PR for GetChats was merged and deployed to PROD 3 days ago. I think we can wait a little bit to come back to this based on our experience with the new methods.

@quinthar I think your feedback will be especially important in this

danieldoglas commented 2 years ago

Closing this so we can use the internal issue to continue to track this https://github.com/Expensify/Expensify/issues/201190

danieldoglas commented 2 years ago

Adding full comments here since more people are following this issue than the other one:

@cead22 has applied the GetChat commands, which have a similar processing time that chatList had.

Logs:


748c4c68ca8865fb-SJC | virt2.sjc | 2022-09-11 00:31:08 126 | dbarrett@expensify.com | Auth - AuthRequest #16 ~~ command: 'GetChats' jsonCode: '200' size: '743,242' duration: '1,422ms' host: 'db1.sjc'
-- | -- | -- | -- | --

[748c4c68ca8865fb-SJC](https://www.expensify.com/_devportal/tools/logSearch/#query=request_id:(%22748c4c68ca8865fb-SJC%22)+AND+timestamp:[2022-09-10T23:31:08.126Z+TO+2022-09-11T01:31:08.126Z]&index=logs_expensify-018512)    virt2.sjc   2022-09-11 00:31:08 126 [dbarrett@expensify.com](https://www.expensify.com/_devportal/tools/logSearch/#query=email:(%22dbarrett@expensify.com%22)+AND+timestamp:[2022-09-10T23:31:08.126Z+TO+2022-09-11T01:31:08.126Z]&index=logs_expensify-018512)  Auth - AuthRequest #16 ~~ command: 'GetChats' jsonCode: '200' size: '743,242' duration: '1,422ms' host: 'db1.sjc'

The fact we've removed the chaining API calls that existed in this flow (chatList > reportStuff > IOU), we're now seeing much better performance in the app. Thanks @cead22!

Also, considering we have changed the way we deal with our requests (now we receive everything from Pusher, from an Offline first perspective), those 2 seconds do not feel like a lot of time while using the app.

I still think 1.5s is a lot of time in the database query. We had several discussions here about how to make this better, but I don't think this is a priority anymore, considering the impact to the final user.

Closing this for now!