rienafairefr / pynYNAB

a python client for the new YNAB
MIT License

Slow for bigger accounts #37

Closed scottrobertson closed 2 years ago

scottrobertson commented 7 years ago

Hey

There probably isn't anything that can be done on this end, but I want to note that it's painfully slow to add a new transaction to a YNAB account that has a lot of data. I am running https://github.com/scottrobertson/monzo-to-ynab, which is hitting a 120-second timeout limit.

rienafairefr commented 7 years ago

I know the painful slowness! I'm currently trying to rewrite some parts of the Client logic to solve that :-)

The issue is that we have to download and sync all the data from the server each time in order to push a modification. I'm trying to have the library remember some data so that we don't have to re-download and re-sync when we connect to an account we've already used before (the data would be saved in a database; is that OK for your use case?)

rossdargan commented 7 years ago

Can the client be a long lived object (possibly kept in memory for months) if we just called sync every time before we added a transaction (a few times a day maybe)?

My guess is this would dramatically speed things up for us.

scottrobertson commented 7 years ago

The problem with using a database is that we are introducing dependencies. We need to keep this very easy to use and run.

We also need to ensure that if we store anything locally, the library can continue to work on immutable, read-only services like Heroku.

rienafairefr commented 7 years ago

I think my explanation was too short: "download and sync all the data from the server each time in order to push a modification" is not quite right! What I meant is that the client needs to download and incorporate all the data from the server when it is first created. After that it can push modifications. If the client is kept around in memory, you don't need to sync before pushing. Maybe the nYNAB server considers a very old client stale, but that's more about authentication (the authentication token does have an expiration date).

@scottrobertson Right now the client does have a dependency on a database, but it's internal (an in-memory database, sqlite:// in SQLAlchemy). Maybe I'll write a recipe somewhere on the wiki for how to "resurrect" a client and push directly without downloading all the user data.
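For context, an in-memory database sidesteps the read-only-filesystem concern raised above, since nothing ever touches disk. A minimal standard-library sketch of the same idea (the library itself goes through SQLAlchemy's `sqlite://` URL rather than `sqlite3` directly, and the `payees` table here is purely illustrative):

```python
import sqlite3

# ":memory:" is sqlite3's in-memory database: no file is created, so this
# works even on read-only or ephemeral filesystems like Heroku's dynos.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payees (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO payees (name) VALUES (?)", ("Monzo",))
(count,) = conn.execute("SELECT COUNT(*) FROM payees").fetchone()
```

The trade-off, of course, is that an in-memory database dies with the process, which is exactly why a "resurrect the client" recipe would be useful.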

@rossdargan if you keep the client in memory, that's OK; you don't need a periodic sync. You can push at any time, and the server will know what server state you are working from. Of course, if there is a consistency error (for example, if you push a transaction with a payee that has been deleted in the meantime), the push will fail. There shouldn't be any risk of introducing corruption into the server data.

scottrobertson commented 7 years ago

@rienafairefr if keeping the client in memory means it doesn't have to sync each time, then we are good here. No need for you to do anything on your end :)
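The pattern agreed on here (construct and sync the client once, then reuse it for the life of the process) can be sketched generically. `YnabClient` below is a hypothetical stand-in, not pynYNAB's real Client class; only the cache-once logic is the point:

```python
import threading

class YnabClient:
    """Hypothetical stand-in for the library's Client object."""
    def __init__(self, budget_name):
        self.budget_name = budget_name
        self.synced = False

    def sync(self):
        # In the real library this downloads the full user data: slow.
        self.synced = True

    def push_transaction(self, txn):
        # Pushing against an already-synced client needs no re-download.
        assert self.synced, "client must be synced once before pushing"
        return {"pushed": txn, "budget": self.budget_name}

_client = None
_lock = threading.Lock()

def get_client(budget_name):
    """Create and sync the client once; reuse it for every later push."""
    global _client
    with _lock:
        if _client is None:
            _client = YnabClient(budget_name)
            _client.sync()  # pay the expensive sync exactly once
        return _client
```

A webhook handler would then call `get_client("My Budget").push_transaction(...)` on every incoming event without re-syncing, paying the slow initial sync only at process start.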

rienafairefr commented 7 years ago

An idea to reduce the time spent on big accounts: in the sync method of the Client object, the library does:

self.catalogClient.sync()             # relatively short; doesn't depend on the size of the user data
self.select_budget(self.budget_name)
self.budgetClient.sync()              # potentially long; depends on the size of the user data

If you know that what you want to push doesn't depend on anything in the user data (an existing payee, things like that), just omit the third step; the client should be perfectly capable of pushing. You'd need to add some data yourself (like the entities_account_id for a transaction), but if budgetClient.sync is your bottleneck, skipping it may be feasible.

If you need some of the user data but not all of it, it's also feasible to look into RootObjClient.sync (catalogClient and budgetClient are both RootObjClient instances); the potentially long call is update_from_sync_data. You could monkey-patch RootObjClient.update_from_sync_data to omit certain fields in the fetched data. It depends on your use case and bottlenecks, I guess :-) You might hit some bugs with knowledge values out of sync or other fun stuff, but it might be worth trying.

scottrobertson commented 7 years ago

Thanks, @rienafairefr. Just to clarify the use case: we are sending transactions in real time to YNAB from our bank. Monzo (the bank) sends a webhook with all the data, and we forward that on to YNAB.

Like this library, we are trying our best not to introduce dependencies. For example, we really don't want to introduce background jobs, as that would make the app harder for people to use.

I will look into those ideas, but we need the latest payee info at a bare minimum, so I don't think we will be able to skip the sync.

rienafairefr commented 6 years ago

I just pushed a new beta version, 0.6.0b (commit e52ce071abd149817bee585e84c90cd3ebf57e8c). It divides the time to merge entities by a factor of 3, so the time for client.sync() may also be decreased 👍 Check it out.