We need to design a mechanism for dispatching and running ML model training tasks in TrainDB. One reasonable solution is to use a message queue.
Message queues provide an asynchronous communications protocol, meaning that the sender (e.g. TrainDB main) and receiver (e.g. TrainDB-ML in K8s) of the message do not need to synchronously interact with the message queue at the same time.
A common use case for jobs is to process work from a work queue. In this scenario, some task creates a number of work items and publishes them to a work queue. A worker job can be run to process each work item until the work queue is empty.
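The work queue scenario above can be sketched roughly as follows. This is only an in-process illustration using Python's standard `queue` and `threading` modules as a stand-in for a real broker; `train_model`, `run_worker`, and `process_all` are hypothetical names, not part of TrainDB.

```python
import queue
import threading


def train_model(work_item):
    # Hypothetical placeholder for the real training call.
    return f"trained:{work_item}"


def run_worker(work_queue, results):
    # Process work items until the queue is empty, then exit.
    while True:
        try:
            item = work_queue.get_nowait()
        except queue.Empty:
            return
        results.append(train_model(item))
        work_queue.task_done()


def process_all(work_items, num_workers=2):
    # Producer side: publish every work item before starting the workers.
    work_queue = queue.Queue()
    for item in work_items:
        work_queue.put(item)

    # Consumer side: each worker drains the shared queue.
    results = []
    workers = [
        threading.Thread(target=run_worker, args=(work_queue, results))
        for _ in range(num_workers)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

In an actual deployment the queue would live in an external broker, and each worker would be a separate job (for example, a TrainDB-ML pod in K8s) rather than a thread.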
This method will provide the following features:
Guaranteed Delivery - Messages are delivered at least once, and most messages are delivered exactly once.
Long Polling - Consumers can wait until a message becomes available in the queue.
Ack All Queue Messages - Any client can mark all messages in a queue as discarded, after which they are no longer available for consumption.
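To make the three features concrete, here is a minimal in-memory sketch. It assumes that at-least-once delivery is implemented with a visibility timeout (an unacknowledged message is redelivered once the timeout expires), which is one common approach and not something this issue has decided. The class `SketchQueue` and all of its method names are hypothetical.

```python
import threading
import time
import uuid
from collections import deque


class SketchQueue:
    def __init__(self, visibility_timeout=30.0):
        self._ready = deque()
        self._in_flight = {}  # receipt -> (body, redelivery deadline)
        self._cond = threading.Condition()
        self._visibility_timeout = visibility_timeout

    def send(self, body):
        with self._cond:
            self._ready.append(body)
            self._cond.notify()

    def _requeue_expired(self, now):
        # At-least-once: unacknowledged messages become visible again.
        expired = [r for r, (_, d) in self._in_flight.items() if d <= now]
        for receipt in expired:
            body, _ = self._in_flight.pop(receipt)
            self._ready.append(body)

    def receive(self, wait_seconds=0.0):
        # Long polling: block up to wait_seconds for a message.
        end = time.monotonic() + wait_seconds
        with self._cond:
            while True:
                now = time.monotonic()
                self._requeue_expired(now)
                if self._ready:
                    body = self._ready.popleft()
                    receipt = uuid.uuid4().hex
                    deadline = now + self._visibility_timeout
                    self._in_flight[receipt] = (body, deadline)
                    return receipt, body
                remaining = end - now
                if remaining <= 0:
                    return None
                # Wake up periodically to re-check expired messages.
                self._cond.wait(timeout=min(remaining, 0.05))

    def ack(self, receipt):
        # Acknowledge one message; returns False for an unknown receipt.
        with self._cond:
            return self._in_flight.pop(receipt, None) is not None

    def ack_all(self):
        # Discard every message, whether waiting or in flight.
        with self._cond:
            discarded = len(self._ready) + len(self._in_flight)
            self._ready.clear()
            self._in_flight.clear()
            return discarded
```

Because a message can be redelivered after a timeout, a consumer built on this model should make training requests idempotent or deduplicate them by task ID.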
We could consider several patterns, such as a FIFO-based message queue pattern and a publish-subscribe pattern, to handle asynchronous training requests and responses.
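The difference between the two patterns can be illustrated with a small in-memory sketch: in a FIFO queue each message is consumed by exactly one consumer, in arrival order, while in publish-subscribe every subscriber receives its own copy. `FifoQueue` and `PubSubTopic` are hypothetical names used only for this illustration.

```python
from collections import deque


class FifoQueue:
    """Each message is handed to exactly one consumer, oldest first."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def consume(self):
        # Returns None when the queue is empty.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Each message is copied to the inbox of every current subscriber."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, name):
        inbox = deque()
        self._subscribers[name] = inbox
        return inbox

    def publish(self, message):
        for inbox in self._subscribers.values():
            inbox.append(message)
```

Under this reading, a FIFO queue fits training requests (one worker should pick up each task), while publish-subscribe fits status responses that several components may want to observe.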
feat: Work Queues
Priority-3