pollMessages() in PostgresQueueDAO does not break when popMessages() returns empty message list.

Describe the bug

In the pollMessages implementation, two issues need to be addressed when the messagesSlices object is empty.

1. The loop keeps running when there are no eligible messages in the queue. It re-queries the database until the timeout is reached, and the short 100ms sleep between retries makes this worse. Together, these two factors cause a high number of queries per second at scale.
2. When fewer messages are available than the count specified, the thread blocks and holds the messages it has already fetched in memory until the timeout expires or the count is reached.
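The two points above can be illustrated with a small simulation. This is only a sketch of the loop as described in this issue, not the actual PostgresQueueDAO source: `FakeQueue`, `pop` and `pollWithRetry` are hypothetical names, messages are plain strings, and time is simulated rather than slept, assuming each query takes no time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class PollLoopSketch {
    // Hypothetical in-memory stand-in for the queue table; counts simulated queries.
    static class FakeQueue {
        final Deque<String> ids = new ArrayDeque<>();
        int queries = 0;

        FakeQueue(String... messageIds) {
            for (String id : messageIds) ids.add(id);
        }

        // Stand-in for popMessages(): one call models one database round trip.
        List<String> pop(int limit) {
            queries++;
            List<String> slice = new ArrayList<>();
            while (slice.size() < limit && !ids.isEmpty()) slice.add(ids.poll());
            return slice;
        }
    }

    // Simplified model of the retry loop described above; time is simulated, not slept.
    static List<String> pollWithRetry(FakeQueue queue, int count, long timeoutMs, long sleepMs) {
        long elapsedMs = 0;
        List<String> messages = new ArrayList<>();
        while (true) {
            messages.addAll(queue.pop(count - messages.size()));
            // The only exits are "count reached" or "timeout exceeded".
            if (messages.size() >= count || elapsedMs > timeoutMs) {
                return messages;
            }
            elapsedMs += sleepMs; // models the 100ms sleep between retries
        }
    }

    public static void main(String[] args) {
        FakeQueue empty = new FakeQueue();
        pollWithRetry(empty, 5, 2000, 100);
        System.out.println("empty queue: queries=" + empty.queries);

        FakeQueue partial = new FakeQueue("m1", "m2", "m3");
        List<String> held = pollWithRetry(partial, 5, 2000, 100);
        System.out.println("partial queue: messages=" + held.size() + ", queries=" + partial.queries);
    }
}
```

Under these assumptions, a single poll with a 2000ms timeout and a 100ms sleep issues roughly timeout / sleep queries, about 20, whether the queue is empty or only partially filled. In the partial case the three fetched messages are held for the whole timeout before being returned.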

Please share your thoughts on this. If we agree, I can raise a merge request with the fix.

Details

Conductor version: any
Persistence implementation: Postgres
Queue implementation: Postgres

To Reproduce

Steps to reproduce the behavior:

1. Run Conductor with Postgres as the persistence layer and as the queue implementation.
2. Monitor QPS in Postgres using pg_stat_statements.

Note: Here we are only seeing the effect of pollMessages called from the WorkflowReconciler, where the timeout is 2000ms and the count depends on the number of CPU cores.

Expected behavior

Return the messages that are present instead of waiting and retrying until the timeout expires or the count is reached.
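One possible shape of this behavior is sketched below: the loop returns as soon as a pop comes back empty. This is a minimal illustration under stated assumptions, not a patch against PostgresQueueDAO. `FakeQueue`, `pop` and `pollUntilEmpty` are hypothetical names, messages are plain strings, and time is simulated rather than slept.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class PollFixSketch {
    // Hypothetical in-memory stand-in for the queue table; counts simulated queries.
    static class FakeQueue {
        final Deque<String> ids = new ArrayDeque<>();
        int queries = 0;

        FakeQueue(String... messageIds) {
            for (String id : messageIds) ids.add(id);
        }

        // Stand-in for popMessages(): one call models one database round trip.
        List<String> pop(int limit) {
            queries++;
            List<String> slice = new ArrayList<>();
            while (slice.size() < limit && !ids.isEmpty()) slice.add(ids.poll());
            return slice;
        }
    }

    // Same loop shape as the retry loop, plus an early return on an empty slice.
    static List<String> pollUntilEmpty(FakeQueue queue, int count, long timeoutMs, long sleepMs) {
        long elapsedMs = 0;
        List<String> messages = new ArrayList<>();
        while (true) {
            List<String> slice = queue.pop(count - messages.size());
            if (slice.isEmpty()) {
                return messages; // nothing eligible right now: hand back what we have
            }
            messages.addAll(slice);
            if (messages.size() >= count || elapsedMs > timeoutMs) {
                return messages;
            }
            elapsedMs += sleepMs; // models the sleep between retries
        }
    }

    public static void main(String[] args) {
        FakeQueue empty = new FakeQueue();
        List<String> none = pollUntilEmpty(empty, 5, 2000, 100);
        System.out.println("empty queue: messages=" + none.size() + ", queries=" + empty.queries);

        FakeQueue partial = new FakeQueue("m1", "m2", "m3");
        List<String> some = pollUntilEmpty(partial, 5, 2000, 100);
        System.out.println("partial queue: messages=" + some.size() + ", queries=" + partial.queries);
    }
}
```

In this sketch an empty queue costs a single query, and a partially filled queue costs one extra query to confirm that nothing more is eligible. Whether to retry at all after a non-empty slice is a design choice left open here.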

Screenshots

I started the service and then queried pg_stat_statements. Below is the number of queries executed per minute from one instance, for the WorkflowReconciler alone. The QPS grows further with each additional instance and when workers poll for activity.

[Screenshot 2023-06-20 at 3 55 16 PM]

Note: Here we are only seeing the effect of pollMessages called from the WorkflowReconciler. pollMessages is invoked from other places as well.

Additional context

The MySQL implementation of the poll does not have this while(true) loop.

Netflix / conductor-community

pollMessages() in PostgresQueueDAO does not break when popMessages() returns empty message list. #236