ZacAttack commented 6 months ago

I'm trying to model a system which uses a Kafka producer.

A Kafka producer works with a queue size and a linger time. It sends messages to a broker when either of two criteria is met: there are messages in the producer's queue and the linger time has passed, or the number of messages in the queue has exceeded some threshold.
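To make the dual trigger concrete, here is a minimal sketch of the rule as described above. It is plain Python, not Ciw or Kafka code; the function name and all parameter names are hypothetical and chosen only for illustration.

```python
def should_send(queue_length, oldest_wait, linger_time, max_queue):
    """Return True if the producer, as described above, would send now.

    queue_length: number of messages currently queued on the producer
    oldest_wait:  how long the oldest queued message has been waiting
    linger_time:  the configured linger time
    max_queue:    queue size that, once exceeded, forces a send
    """
    if queue_length == 0:
        # Nothing to send, regardless of how much time has passed.
        return False
    if queue_length > max_queue:
        # Size trigger: send immediately, ignoring the linger time.
        return True
    # Time trigger: send once the linger time has passed.
    return oldest_wait >= linger_time


print(should_send(queue_length=2, oldest_wait=1.0, linger_time=5.0, max_queue=3))
print(should_send(queue_length=4, oldest_wait=0.0, linger_time=5.0, max_queue=3))
```

The question is then how to get a Ciw node to apply both branches of this rule.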

Thinking through this in ciw terms, it seems like a slotted service that is capacitated would give me the behavior of linger time. However, I need to forgo this behavior if the queue size gets large enough. I'm not sure I'm seeing a clear way from the docs to simulate either the queue size behavior or the dual behavior of the serving node.

Any suggestions are welcome.

galenseilis commented 6 months ago

A few options come to mind for batched servicing.

Option 1: Multiple servers

In a sense, batch processing means there are multiple abstract servers: a single real-life thing that can complete several jobs simultaneously may be represented by multiple servers in Ciw. In Ciw this is done by passing a value to the number_of_servers parameter of ciw.create_network.

Option 2: Sequential distribution

This approach uses ciw.dists.Sequential. The idea is to precompute the service times as a sequence.

With a single server, a sequence of seq = [1, 0, 0, 0, 3, 0] would mean that 4 messages are serviced at t=1, then 2 messages are serviced at t=4. The sequence cycles, so if you do not want any cycling to occur, you must make the sequence cover more time than the simulation runs for.
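The arithmetic behind that example can be checked with a short sketch. This is plain Python rather than Ciw code, and it assumes a single server with all jobs already waiting at t=0, so the server is never idle; the helper name completion_times is hypothetical.

```python
from collections import Counter
from itertools import accumulate, cycle, islice


def completion_times(seq, n_jobs):
    """Completion times for n_jobs served back to back by one server.

    Service times are drawn from seq in order, cycling when it runs out.
    Assumes every job is already queued at t=0.
    """
    service_times = islice(cycle(seq), n_jobs)
    # Each job finishes when the previous one finished plus its own service time.
    return list(accumulate(service_times))


seq = [1, 0, 0, 0, 3, 0]
times = completion_times(seq, 6)
print(times)           # completion time of each job
print(Counter(times))  # batch size at each completion time

# With more jobs than entries, the sequence starts over:
print(completion_times(seq, 8))
```

Zero service times are what produce the batch: every job following a non-zero entry completes at the same instant as the job before it.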

Option 3: Custom distribution

If you need something like option 2 but you would rather sample the service times during the simulation than precompute them, then you can design your own custom distribution as per the docs.

galenseilis commented 6 months ago

I might have to think more about service lingering time.

One simple option is to fold the lingering time into the Ciw node's service time. That would be fine if you're trying to get the rate of flow of messages right, but wouldn't help if you wanted to estimate something like the difference between the service time and the lingering time.

Otherwise I think you'll have to look at patching Ciw so that the server is set to be off duty for some amount of time right after the completion of a service.

geraintpalmer commented 2 months ago

Hi @ZacAttack I think you can do this with custom service disciplines.

Please see the test case test_custom_service_discipline on line 1243 of the file Ciw/ciw/tests/test_simulation.py for an example of this: https://github.com/CiwPython/Ciw/blob/22536cc200ed80d97c0e7b6d3e231ea3996eb144/ciw/tests/test_simulation.py#L1243C9-L1243C39

It should be simple to add another check that says: if the number of customers is above a threshold, just take the customer at the head of the queue, regardless of their current waiting time.

One drawback of this method is that customers are only chosen when the choose_next_customer method is called. So this is really a "linger for at least" method, e.g. "linger for at least 3 time units, and next time choose_next_customer is called, take the first customer who has waited for at least 3 time units". So some customers might wait longer than that. The choose_next_customer method is called when a customer arrives, leaves, or a shift change occurs.
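A rough sketch of the selection logic described here, written as plain Python so it runs without Ciw. It assumes the discipline is a function that receives the waiting individuals and the current time; the Waiting class is a hypothetical stand-in for a Ciw individual, the names linger and threshold are illustrative, and returning None when nobody is eligible is an assumption of this sketch. The linked test is the reference for the signature and return value Ciw actually expects.

```python
from dataclasses import dataclass


@dataclass
class Waiting:
    """Hypothetical stand-in for a waiting individual (not a Ciw class)."""
    id_number: int
    arrival_date: float


def linger_or_threshold(individuals, t, linger=3.0, threshold=5):
    """Pick the next individual to serve, or None if nobody is eligible yet.

    individuals: waiting individuals in queue order (head first)
    t:           current simulation time
    """
    if not individuals:
        return None
    if len(individuals) > threshold:
        # Queue is too long: take the head regardless of waiting time.
        return individuals[0]
    for ind in individuals:
        # Otherwise take the first individual who has lingered long enough.
        if t - ind.arrival_date >= linger:
            return ind
    return None  # assumption: signals "keep lingering"


queue = [Waiting(1, 0.0), Waiting(2, 2.0)]
print(linger_or_threshold(queue, t=1.0))  # nobody has waited 3 time units yet
print(linger_or_threshold(queue, t=3.5))  # individual 1 has now lingered enough
```

Because the function is only evaluated when it is called, an individual who becomes eligible at t=3 is not picked until the next call, which is the "linger for at least" behaviour noted above.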

CiwPython / Ciw

[Feature Request] Service linger time and batch serving #239
