Closed: Mandaryn closed this issue 7 years ago.
I think that's a concern of bookingsync-api's auto-pagination, then: https://github.com/BookingSync/bookingsync-api/blob/e82d15815871f6c6017700ad9d5322796c1db61b/lib/bookingsync/api/client.rb#L186
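To illustrate why auto-pagination is the memory concern here, the sketch below contrasts accumulating every page with yielding pages to a block. This is an illustration only, not the actual bookingsync-api client code; `PAGES`, `fetch_all`, and `fetch_each_page` are made-up stand-ins.

```ruby
# Illustration only, not the real client: the difference between
# auto-pagination (accumulate everything) and a block form (yield
# page by page). PAGES simulates three paginated API responses.
PAGES = [[1, 2], [3, 4], [5, 6]]

# Auto-pagination: every page stays in memory until the whole
# collection has been fetched.
def fetch_all
  result = []
  PAGES.each { |page| result.concat(page) }
  result
end

# Block form: each page is yielded to the caller and can be garbage
# collected once processed, so peak memory is roughly one page.
def fetch_each_page
  PAGES.each { |page| yield page }
end

fetch_all                         # whole collection held in memory at once
fetch_each_page { |page| p page } # one page at a time
```

With thousands of entries per endpoint, the first style keeps the entire collection alive; the second bounds memory to a single page.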
The question then is: should we run this in a transaction or not? Having a transaction would ensure that we save all or nothing, which helps to keep the state sane; however, it would still have the same memory issue.
@Mandaryn What would be your approach to fetching and saving in chunks while making sure we don't end up in an intermediate state, which could cause issues with updated_since, for example?
I think scoping everything in a transaction while dealing with each page separately would allow for better memory usage: each page would be handled separately in Ruby and could be GC'd if necessary, and only the transaction would be large. The only possible drawback I see is that the sync transaction would now be longer (spanning multiple requests to Core instead of just a single big write) and could cause deadlock issues more often (but I think transactions in Rails don't lock tables for inserts).
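A minimal runnable sketch of that idea, with `within_transaction` standing in for `ActiveRecord::Base.transaction` and `fetch_page` standing in for one HTTP request per page; none of these names are the real synced internals.

```ruby
# Sketch: one outer transaction spanning all pages, but each page is
# fetched and persisted separately so Ruby can release it afterwards.
API_PAGES = [[1, 2], [3, 4], [5]] # simulated responses, one per request

def fetch_page(number)
  API_PAGES[number] # in reality: one HTTP request to Core per page
end

def within_transaction
  yield # stand-in for ActiveRecord::Base.transaction { ... }
end

persisted = []
within_transaction do
  number = 0
  while (page = fetch_page(number))
    persisted.concat(page) # stand-in for saving this page's records
    number += 1
    # `page` is reassigned on the next iteration, so the previous
    # page's objects become eligible for GC while the transaction
    # is still open.
  end
end
```

All pages are written under one transaction, so a failure rolls everything back, but only one page's worth of records needs to be live in Ruby at a time.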
spanning multiple requests to core instead of just single big write
It's already doing multiple requests, so that won't take any longer; we'd just be preparing the transaction in small blocks.
About the deadlocks, I don't really know the impact of one solution versus the other here. @Azdaroth, do you know more about this?
I wouldn't really go with making transactions longer; there are already problems with that, and deadlocks happen quite often. Generally speaking, long transactions are not a good thing. Maybe we could run one transaction for each page separately? The synced_all_at would then be assigned at the end, after fetching data from all pages and persisting them.
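A hedged sketch of that variant: one short transaction per page, with the synced_all_at timestamp written only once every page has been persisted. `DB` and `transaction` are toy stand-ins for the real models and `ActiveRecord::Base.transaction`.

```ruby
# Sketch: one transaction per page, synced_all_at stamped at the end.
API_PAGES = [%w[a b], %w[c]]             # simulated paginated responses
DB = { records: [], synced_all_at: nil } # toy stand-in for the database

def transaction
  yield # stand-in for ActiveRecord::Base.transaction { ... }
end

# One short transaction per page: both the deadlock window and the
# memory footprint are limited to a single page.
API_PAGES.each do |page|
  transaction { DB[:records].concat(page) }
end

# synced_all_at is stamped only after every page was persisted, so a
# failure mid-sync leaves it unset and the next run re-fetches.
transaction { DB[:synced_all_at] = Time.now }
```

The trade-off is that a crash between pages leaves some pages persisted and others not, so error handling between pages needs care; the late synced_all_at write at least guarantees the scope is never marked fully synced prematurely.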
@ZenCocoon Basically, the longer the transaction is, the higher the possibility of a deadlock. On the other hand, we could prevent syncing the same models at the same time, which would make deadlocks practically impossible; at least the ones caused by syncing, which are 100% synced issues in bsa-apps, I think.
OK, then one transaction per page seems wiser. It's a little harder to handle errors between pages, but it seems the best way to reduce both deadlocks and memory usage.
Wrapping it all in a transaction should not be a problem; Rails transactions don't lock tables, as far as I know. It would probably only fail if two syncs for the same scope were run in parallel, but that would fail anyway now.
Two syncs for the same scope are exactly the problem now. Sounds like we will need this one: https://github.com/BookingSync/synced/issues/29
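One hedged way to keep two syncs of the same scope from running in parallel, sketched with an in-process `Mutex` keyed by scope. This is just an illustration of the idea; issue #29 may call for something cross-process, such as a Postgres advisory lock.

```ruby
# Toy sketch: one lock per sync scope; a second sync of the same scope
# bails out instead of running concurrently. In-process only.
LOCKS = Hash.new { |hash, key| hash[key] = Mutex.new }

def sync_scope(scope)
  # try_lock does not block: if another sync of this scope holds the
  # lock, we give up immediately instead of waiting.
  return :already_running unless LOCKS[scope].try_lock
  begin
    yield
    :synced
  ensure
    LOCKS[scope].unlock
  end
end

sync_scope("account:1") do
  # While this block runs, a concurrent sync_scope("account:1") from
  # another thread returns :already_running immediately.
end
```

With per-scope exclusion in place, the deadlocks caused by two syncs writing the same models at once cannot occur, regardless of how long each per-page transaction is.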
I've reached a point where I fetch collections of thousands of entries from one endpoint per account. With the current code (https://github.com/BookingSync/synced/blob/master/lib/synced/strategies/full.rb#L172), everything is first fetched into memory and only then saved to the database, which is bound to cause trouble.
We need to be able to save each page to the database separately, to avoid bloating memory usage sky-high (and running out of memory on larger endpoints).
@Azdaroth @wijet @Fredar @p55 @ZenCocoon wdyt?