Open joft-mle opened 2 weeks ago
Thanks for the catch! The original Homa paper and the OMNeT++ simulator don't dive deep into the protocol's retransmission logic, so the implementation of that logic in this ns-3 version is not thoroughly tested. You can refer to the Linux kernel module implementation of the protocol for a complete view of how it behaves in such situations.
Hi @serhatarslan-hub,
just for the sake of completeness and as a follow-up to the solution by @marvin71 for issue #7: on the side, we also ran the default test case (effective duration of 0.5, assumption of 0.1 seconds of saturation, 4 independent runs in parallel, e75c4de489eb) without the `--disableRtx` option.
To our surprise, and that's the reason for opening this issue, in contrast to previous, older runs without `--disableRtx`, this time both parts (load 0.8 and load 0.5) simply did not finish after the usual amount of run time. Instead, the 4 ns-3 instances just sat there at 100% CPU each, no longer writing anything to stdout or the trace files.
For example, the end of the output for the run w/ load 0.5 looks like the following lines (2 excerpts; the output of all 4 instances might be mixed!). The run w/ load 0.8 shows similar effects.
So, it looks like the retransmission logic causes some kind of desynchronization between sender(s) and receivers. For the above, MsgTraces-SlowdownAnalysis.ipynb reports 6 incomplete messages, so not too much is "missing" to complete the simulation. According to the time stamps, simulation time did advance up to ~3.6s, which is more or less what is expected for the test parameters, of course.
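For anyone who wants to cross-check the incomplete-message count outside the notebook, here is a minimal sketch. It assumes a trace layout that may not match the real files: one whitespace-separated record per line, starting with `+` (message started) or `-` (message finished), followed by timestamp, size, sender, receiver and message ID. The function name `find_incomplete_messages` and the field order are hypothetical, so adjust them to the actual trace format.

```python
from typing import Iterable, Set, Tuple

# Hypothetical key: (sender, receiver, message id). The real trace
# columns may differ; this layout is an assumption for illustration.
MsgKey = Tuple[str, str, str]


def find_incomplete_messages(lines: Iterable[str]) -> Set[MsgKey]:
    """Return messages that have a '+' (start) record but no '-' (finish)."""
    pending: Set[MsgKey] = set()
    for line in lines:
        fields = line.split()
        # Skip anything that does not look like a start/finish record.
        if len(fields) < 6 or fields[0] not in ("+", "-"):
            continue
        key = (fields[3], fields[4], fields[5])
        if fields[0] == "+":
            pending.add(key)
        else:
            pending.discard(key)
    return pending
```

Feeding it the lines of one trace file (e.g. `find_incomplete_messages(open(path))`) gives the set of messages that never completed; its size should correspond to the number the notebook reports, provided the assumed format is right.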