CiwPython / Ciw

Ciw is a simulation library for open queueing networks.
http://ciw.readthedocs.io
MIT License
153 stars 42 forks source link

Deterministic arrival and service queue not producing expected number of events #260

Open matthew-machado opened 1 month ago

matthew-machado commented 1 month ago

Hello there πŸ‘‹

I'm trying to understand how ciw handles deterministic arrivals and services in a simple queue simulation. Specifically, I'm seeing fewer arrivals than expected, and I'd appreciate some guidance on whether this is a bug or expected behavior. I am working on simulating a very simple queue where:

Since arrivals are deterministic, I would like to see the theoretical number of arrivals match what ciw returns after the simulation, and there appears to be a discrepancy (which may totally be user error on my side πŸ˜… ). Curious if anyone can help out and opine on some of the issues I'm having.

In this dummy experiment, I would expect 24 arrivals in the system. However, I'm producing 22 arrivals in this simulation:

N = ciw.create_network(
    arrival_distributions=[ciw.dists.Deterministic(value=1)], 
    service_distributions=[ciw.dists.Deterministic(value=1)],
    number_of_servers=[1]
)

ciw.seed(1)
Q = ciw.Simulation(N)

sim_time = 24
Q.simulate_until_max_time(sim_time)
recs = pd.DataFrame(Q.get_all_records())

# there is a gap here.. I think some of this might be clipping at the upper limit (but not sure why there aren't 23 tickets then..)
print('Theoretical Arrivals: ', sim_time)
print('Actual Arrivals: ', recs.shape[0])

Interestingly, there also appears to be some influence of the number of servers. Suppose I have a the following shift schedule, which maps to EST business hours in UTC times. I intend any arrivals outside of business hours to join the waiting queue and then be serviceable in business hours. This produces 20 arrivals and similarly I would expect 24 arrivals (1 per hour for 24 hours).

# create network
N = ciw.create_network(
    arrival_distributions=[ciw.dists.Deterministic(value=1)],
    service_distributions=[ciw.dists.Deterministic(value=1)],
    number_of_servers=[
        ciw.Schedule(
            numbers_of_servers=[0, 5, 0],
            shift_end_dates=[13, 20, 24]
        )],
)

# calculate metrics
ciw.seed(2)
Q = ciw.Simulation(N)

sim_time = 24
Q.simulate_until_max_time(sim_time)
recs = pd.DataFrame(Q.get_all_records())

# there is more of a gap here, not sure why servers would impact the arrivals process (it shouldn't)
print('Theoretical Arrivals: ', sim_time)
print('Actual Arrivals: ', recs.shape[0])

Finally, if I introduce customer classes, this effect is compounded even further. Suppose I have an experiment where:

For this simulation, I would expect num_customer * 24 arrivals (240 in this case), but am seeing 22 arrivals.

num_customers = 10
arrivals_distributions = {f'customer_{i}': [ciw.dists.Deterministic(value=1)] for i in range(0, num_customers)}
services_distributions = {f'customer_{i}': [ciw.dists.Deterministic(value=1)] for i in range(0, num_customers)}

# create network
N = ciw.create_network(
    arrival_distributions=arrivals_distributions,
    service_distributions=services_distributions,
    number_of_servers=[1]
)

# calculate metrics
ciw.seed(2)
Q = ciw.Simulation(N)

sim_time = 24
Q.simulate_until_max_time(sim_time)
recs = pd.DataFrame(Q.get_all_records())
print('Theoretical Arrivals: ', num_customers * sim_time)
print('Actual arrivals: ', recs.shape[0])

# tickets per customer
recs.groupby('customer_class').agg(num_tickets=('id_number', 'count'))

Any advice would be really appreciated! Thanks all

galenseilis commented 1 month ago

Hi! I'm just going to address the first example (due to time constraints).

@geraintpalmer would be the main person to weigh in, but I'll add my comments here. Correct me where I am wrong.

The first thing that came to my mind when I saw the example is that Ciw only shows 'completed' records. That does not mean there are not individuals in the systems.

Added to your example:

>>> print(Q.nodes[1].individuals)
[[Individual 23]]
>>> print(Q.nodes[1].individuals[0][0].arrival_date)
23

To get that 24th individual to show up in the records we need to run the simulation long enough for them to actually complete their service. This would suggest that you should simulate until t=25, but I think you are right that there is a boundary effect. Simulating 'slightly longer' (e.g. 25.01) solves that issue.

import ciw
import pandas as pd

N = ciw.create_network(
    arrival_distributions=[ciw.dists.Deterministic(value=1)], 
    service_distributions=[ciw.dists.Deterministic(value=1)],
    number_of_servers=[1]
)

ciw.seed(1)
Q = ciw.Simulation(N)

sim_time = 25.01
Q.simulate_until_max_time(sim_time)
recs = pd.DataFrame(Q.get_all_records())

print('Theoretical Arrivals: ', 24)
print('Actual Arrivals: ', recs.shape[0])
matthew-machado commented 1 month ago

Thanks @galenseilis! Appreciate the insights here

It sounds like Q.get_all_records() isn't doing exactly what I'd expect. If we stop the clock at 24 hours, but there are still individuals in the service node, Q.get_all_records() only returns individuals who hit the exit node. Is there a way to return all records, not just the records of individuals who hit the departure node?

If we look at making metrics like queue lengths, wait times, etc. it seems like it's being censored if we only consider what's in the departure node.

galenseilis commented 1 month ago

Thanks @galenseilis! Appreciate the insights here

It sounds like Q.get_all_records() isn't doing exactly what I'd expect. If we stop the clock at 24 hours, but there are still individuals in the service node, Q.get_all_records() only returns individuals who hit the exit node. Is there a way to return all records, not just the records of individuals who hit the departure node?

If we look at making metrics like queue lengths, wait times, etc. it seems like it's being censored if we only consider what's in the departure node.

Sorry for being terse, but short on time. Briefly:

Q.get_all_records() does not return individuals. It returns records, which I expect are being returned at end of service.

I agree, this is censoring in the statistical sense of the word. What I've done to collect these incomplete records is loop over each non-exit node, and for each node loop over the individuals in that node to collect some of their attributes.

You can take a look at what is in hciw.results.summarize_individuals for inspiration. It should show you where you need to look to access individuals as there are in the current state of the simulation. I think you can pip install hciw, but I should caution that I am not consistently maintaining HCiw. Perhaps best to take what you can learn from the example and adapt to your use case.

IMO having all the records, including incomplete ones, should be the default. Maybe that will get put in as a feature at some point.

geraintpalmer commented 1 month ago

Hi @matthew-machado

Q.get_all_individuals() will get all the individuals for you, even if they are still in service.

Ciw isn't necessariliy censoring unfinished individuals. It's just that a data record is only created once service is complete.

The gap you are seeing is due to two reasons: 1) The first individual doesn't arrive until time 1. So at say time 24.5, there have been 24 customers arrived to the system, but only 23 of them have completed service. 2) You have three simultaneous events going on at $t=24$:

I hope that explains what you are seeing.

matthew-machado commented 1 month ago

thanks a ton @galenseilis and @geraintpalmer! this is helpful for me to see that the arrivals were generated, but weren't making it to the recs table because there wasn't a completed record for these individuals.

You can take a look at what is in hciw.results.summarize_individuals for inspiration.

Thanks for sharing! Had no idea about this project, excited to see what you've built here

IMO having all the records, including incomplete ones, should be the default. Maybe that will get put in as a feature at some point.

Would love it if this was the default! I think it's helpful to make sure we're not biasing our metrics. These examples are extreme, but considering my third example, if we measure the queue length at the exit node by using Q.get_all_records(), it vastly underestimates the waiting dynamics by considering only 20 tickets instead of 240 which don't have a "record" because they're not completed yet.

And perhaps a bit pedantic, but I think the naming of the method is misleading and obscures the fact that there's censoring here (but I see here you do call out that it's just reporting on the exit node). Perhaps it can be something like Q.get_all_completed_records() or something of that flavor?