We've seen the pubsub streaming transformer can occasionally produce duplicates, especially around startup/shutdown.
Previously, we were using the MoreExecutors.directExecutor to execute Subscriber callbacks. It was a trick we invented so that we could safely turn off FlowControl without PubSub sending us too many events and causing us to run out of memory.
However, because those callbacks block the thread, the Subscriber had no thread free for admin tasks such as acking and extending ack lease times.
We can continue to block the subscriber, but we need to make sure the subscriber still has a thread available for the admin tasks. I'm hoping that if the subscriber has this dedicated thread for acking and ack extensions, then we should get fewer duplicates in the warehouse.
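To illustrate the idea, here is a minimal stdlib-only sketch, not the code in this change. The class and method names are hypothetical, the direct executor is written as `Runnable::run` (which behaves like Guava's `MoreExecutors.directExecutor`), and the scheduled counter stands in for the subscriber's acking and lease-extension work. In the real Java client I would expect this to map onto `Subscriber.Builder#setExecutorProvider` and `#setSystemExecutorProvider`, but treat that mapping as an assumption.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class DedicatedAdminThreadSketch {

    /** Runs one blocking callback and returns how many admin ticks fired while it was blocked. */
    public static int adminTicksDuringBlockedCallback() {
        // Runs the task on the calling thread, like MoreExecutors.directExecutor().
        Executor callbackExecutor = Runnable::run;

        // Hypothetical stand-in for the subscriber's dedicated admin thread
        // (acking and ack lease extensions in the real subscriber).
        ScheduledExecutorService adminExecutor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger adminTicks = new AtomicInteger();
        adminExecutor.scheduleAtFixedRate(adminTicks::incrementAndGet, 0, 10, TimeUnit.MILLISECONDS);

        try {
            // The callback blocks the calling thread, as our backpressure trick does.
            callbackExecutor.execute(() -> {
                try {
                    Thread.sleep(200);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Admin work kept running on its own thread during the block.
            return adminTicks.get();
        } finally {
            adminExecutor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("admin ticks while callback was blocked: "
                + adminTicksDuringBlockedCallback());
    }
}
```

The point of the sketch is only that blocking the callback thread does not have to stall admin work, provided that work has its own thread.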