Closed fernandomaura97 closed 4 days ago
There's a lot to unpack here and queueing theory is not my specialty, so I may need a bit more time to fully understand where the bottleneck is.
However, I do have one immediate question that may help pinpoint the problem: if the simulated time remains the same, are there more scheduled calls in the second version of `deque_schedule_service()`? If not, then I think this is a problem with the implementation of the loops rather than anything to do with the scheduler. If yes, can you estimate (rough order of magnitude only) the number of scheduler calls per 1 s of simulated time in both variants?
Here are some answers to your questions and some extra random pieces of information that may help:
Packet generation by `PoissonSource` can still happen in parallel.

`println!` is indeed terribly slow, but I don't see any explicit call to `println!` in the above code. The `tracing` crate may help here; at the very least, you could filter what gets printed and see if it makes a difference.

I see unnecessary `clone` calls (cloning should be avoided if you could just move the value); in particular, the following seems suspicious to me:

```rust
self.output_port.send(self.aux_ampdu_serviced.clone()).await;
self.aux_ampdu_serviced.reset();
```

It seems that you are unnecessarily cloning a structure containing a vector of other structures on every call.
Also, you are allocating memory on every call (`let mut packets_to_remove = Vec::new();`) and then deallocating it. For filtering I would use the `retain` method (and a closure with side effects to collect data).
This assignment, `service_duration = Duration::from_secs_f64(resulting_delays.service_delay);`, seems suspicious: shouldn't it be `+=`?
Anyway, it seems to me that the performance issue is not related to self-scheduling; to be sure, I would execute the relevant code outside of the simulator and check how much time it takes.
Thanks for the early response! I tried tracing the code in different ways, and by adding simulation metrics and comparing the C++ and Rust versions, it seems I made an oopsie by misconfiguring the source, so it sent packets 1000× more frequently than it should have compared to C++, which definitely explains the "performance decrease". Sorry about the code dump; it is a bit convoluted, and I deleted the debug prints for brevity.
Right now it seems the C++ and Asynchronix implementations have similar performance for the same scenarios, so I'm not complaining anymore :) Thank you so much for the feedback, I will take some of these points into account and re-check my code. I'm looking forward to building a nice DES for networking for my future work :D
No problem, you are welcome :)
Yes, it's worth looking into the potential issues highlighted by @jauhien as well (the `=` vs `+=` does look suspicious to me too).
Good luck with your project!
Hi, I found this project recently and was looking to translate a C++ discrete event simulator into memory-safe Rust via this library, and I eventually ran into some performance issues when adding functionality to my model. Self-scheduling functions are widely used in the original C++ project, so I expected them to perform similarly here.
The structure of my project is the following: it replicates an M/M/1/K queueing system, to which I eventually added aggregation of `MpduPacket`s into a single AMPDU block that can hold 64 packets.

The way the `PoissonSource` generates packets is by generating a first packet with exponentially distributed length and selecting the time of the next generation according to a random distribution.
The queue has an input function that schedules service and adds packets to the `VecDeque<MpduPacket>`. It also has a self-scheduling function to dequeue and schedule packets with service time proportional to their length. This version has no aggregation and just sends the `MpduPacket` to the next block, which works great and fast (1 s of real time = 100 s of simulated time):

Where I found some problems is when I wanted to add functionality to the `QueueModule`, in particular traversing the queue to put `MpduPacket`s together into an `AmpduPacket` holding a maximum of 64 `MpduPacket`s inside a `Vec`. This has a severe performance hit on the system (1 s of real time = 1 s of simulated time), making it unfeasible to use this library for some scenarios I was running in my C++ DES.

I tried separating the service into a separate function and tried different ways of traversing the queue, without success in improving performance. I couldn't find a good reason for this, although I acknowledge I'm a beginner at async and might be blocking the program for some periods of time (e.g. when using `println!`).
The problematic change is when I add the following functionality into the `deque_schedule_service` function and make it deliver the `AmpduPacket` dequeued in the last iteration:

My main concerns at the moment are: