adiom-data / dsync

Database synchronization tool
https://adiom.io
GNU Affero General Public License v3.0

Best effort attempt to batch writes. #69

Closed. adiom-mark closed this 4 days ago

adiom-mark commented 1 week ago

Attempts to batch writes on a best-effort basis. A batch is broken up whenever the assumptions behind it don't hold.

  1. Parallel Writer: Attempts to batch up to the configured max, but breaks a batch if the db/collection changes or if an insert batch type is seen. It also breaks a batch upon encountering a barrier and when the channel is empty.

  2. ProcessDataMessages: Assumes all messages share the same db/collection. Breaks a batch if it sees an insert batch type (it does not assume the insert batch is at the end, nor that there are at most the configured max number of messages). It also tracks the latest update for each document within a batch so that only the latest update is kept. Falls back to the original code for processing an insert batch type.
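
To make the two sets of rules above concrete, here is a minimal Go sketch. It is not the code from this PR: `Kind`, `Msg`, `splitBatches`, and `keepLatest` are hypothetical names, the message shape is simplified, and it assumes that barriers and insert batches are each emitted as a batch of one and that a later update to a document fully supersedes an earlier one.

```go
package main

import "fmt"

// Kind is a hypothetical message type tag; dsync's real types may differ.
type Kind int

const (
	Update Kind = iota
	InsertBatch
	Barrier
)

// Msg is a hypothetical, simplified data message.
type Msg struct {
	Kind  Kind
	NS    string // "db.collection"
	ID    string // document id (unused for InsertBatch and Barrier)
	Value string
}

// splitBatches groups msgs into batches of at most max messages. A batch ends
// when the namespace changes, when the max is reached, or when an insert batch
// or barrier is seen. Assumption: insert batches and barriers are emitted as
// batches of one so that they can be handled by a separate code path.
func splitBatches(msgs []Msg, max int) [][]Msg {
	var out [][]Msg
	var cur []Msg
	flush := func() {
		if len(cur) > 0 {
			out = append(out, cur)
			cur = nil
		}
	}
	for _, m := range msgs {
		if m.Kind == InsertBatch || m.Kind == Barrier {
			flush()
			out = append(out, []Msg{m})
			continue
		}
		if len(cur) > 0 && (cur[0].NS != m.NS || len(cur) >= max) {
			flush()
		}
		cur = append(cur, m)
	}
	flush()
	return out
}

// keepLatest keeps only the last update for each document ID in a batch,
// preserving the order in which the IDs were first seen.
func keepLatest(batch []Msg) []Msg {
	idx := make(map[string]int, len(batch))
	var out []Msg
	for _, m := range batch {
		if i, ok := idx[m.ID]; ok {
			out[i] = m // overwrite the earlier update for this document
			continue
		}
		idx[m.ID] = len(out)
		out = append(out, m)
	}
	return out
}

func main() {
	msgs := []Msg{
		{Update, "db.a", "1", "v1"},
		{Update, "db.a", "1", "v2"},
		{Update, "db.b", "2", "v1"},
		{InsertBatch, "db.b", "", ""},
		{Update, "db.b", "2", "v2"},
	}
	for _, b := range splitBatches(msgs, 10) {
		fmt.Println(len(b), "->", len(keepLatest(b)))
	}
}
```

Splitting before de-duplicating means `keepLatest` never has to reason about an insert batch or barrier sitting between two updates to the same document, which is roughly why the description breaks batches at those points.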

Q: Does it make sense to break the batch upon encountering any barrier, only if it is the "last" one, or something else?

adiom-mark commented 6 days ago

Hmm, I typically wouldn't document this any more than the surrounding comments already do; beyond that, one can just read the code or look up the description of this PR. Will keep it in mind in the future when there are more substantial changes, though.

adiom-mark commented 5 days ago

Just noting that this is not ready to submit:

  1. Need to eliminate the busy wait (probably by factoring and nesting a select; let me know if there are better ideas).
  2. Going to focus on getting better testability first and revisit this after that.
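
One possible reading of "nesting a select" for the busy wait, sketched in Go: block on the channel for the first message, then drain whatever is already queued with a non-blocking `select`, and flush as soon as the channel is empty. The `collect` helper and the `int` payload are hypothetical stand-ins, not code from this PR.

```go
package main

import "fmt"

// collect blocks until at least one item is available, then drains whatever
// is already queued without blocking, up to max items. ok is false once the
// channel is closed and nothing more will arrive.
func collect(ch <-chan int, max int) (batch []int, ok bool) {
	// Outer, blocking receive: the goroutine sleeps instead of spinning
	// while the channel is empty.
	v, open := <-ch
	if !open {
		return nil, false
	}
	batch = append(batch, v)
	for len(batch) < max {
		// Inner, non-blocking select: take what is ready, else end the batch.
		select {
		case v, open := <-ch:
			if !open {
				return batch, true // closed: flush what we have
			}
			batch = append(batch, v)
		default:
			return batch, true // channel is empty: flush what we have
		}
	}
	return batch, true
}

func main() {
	ch := make(chan int, 8)
	for i := 1; i <= 5; i++ {
		ch <- i
	}
	close(ch)
	for {
		batch, ok := collect(ch, 3)
		if !ok {
			break
		}
		fmt.Println(batch) // prints [1 2 3], then [4 5]
	}
}
```

A real writer would presumably turn the outer receive into a `select` that also watches a context's `Done()` channel so the loop can be cancelled; that is left out here to keep the sketch short.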