Hi,
Thank you for your interest in using this repository. I am happy to help you as much as possible.
The original figure (MsgComletionSlowdown_W5_load-80p.png) was created with an earlier version of the Homa implementation, tagged as homa-v1.0.
There have been bug fixes since then, and the current version is homa-v1.1. The changes between the two versions are displayed here; they are mainly related to the retransmission logic of the protocol.
I don't expect the retransmission logic to affect the performance in the reproduction simulation, because the original Homa paper was evaluated without retransmission logic in the OMNeT++ simulator. To create the same effect, one would need to disable the retransmission logic of Homa in NS3 or use very large timeout limits. Could you share the command you used to reproduce the figure?
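For reference, the "very large timeout" option just means overriding the protocol's retransmission-timeout attribute before the run starts; a minimal sketch of the idea, where the attribute path is only a placeholder:

#include "ns3/core-module.h"

using namespace ns3;

// Sketch only: push the retransmission timeout far beyond the simulated window
// so that timeouts never fire. The attribute name below is a placeholder, not
// necessarily the one exposed by the Homa implementation.
static void
EffectivelyDisableRetransmissions ()
{
  Config::SetDefault ("ns3::HomaL4Protocol::RtxTimeout", TimeValue (Seconds (1000.0)));
}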
Regards,
Hi @serhatarslan-hub ,
thank you for your reply.
Regarding the command(s) to produce the figure I posted:
$ outputs/homa-paper-reproduction/run-parallel-sim.sh 4 0.3
run-parallel-sim.sh uses --disableRtx, so I assume that retransmissions are actually disabled by effectively using those huge timeout values. Other than that, just the executable name needed adjustment (scratch/HomaL4Protocol-paper-reproduction). The runs produced .tr files; one set for load 50% and one set for load 80%. I then ran MsgTraces-SlowdownAnalysis.ipynb to let it generate the posted figure MsgComletionSlowdown_W5_load-80p.png. I just commented out the "pfabric" plotting, since the initial goal is to compare with your results. Also, I changed the captions to say "Pxx" instead of "xx%".

I also repeated the above with a modified run-parallel-sim.sh that does not include the --disableRtx argument. That should then mean that the timeouts for retransmissions are NOT set to large values, so retransmissions can actually be carried out. This time I noticed the difference in duration compared to my previous run and used 0.5 seconds. Not surprisingly (?), the slowdown plot for 80% load with retransmissions not disabled looks rather similar to the one I posted initially (with retransmissions disabled):
The fact that there is no big difference between my two plots would kind of match your expectation that actually doing retransmissions should not affect performance, right? I don't think I ever saw the .ipynb code report a "Number of uncompleted messages" greater than 0, so there would be no reason for retransmissions anyway? Hmmm, no, that statement/question makes no sense: if retransmissions do occur, successfully, then an affected request would have been completed - successfully.
I guess, to be really sure, I'll re-run a simulation with --disableRtx (unmodified run-parallel-sim.sh) and 0.5 seconds of duration.
However, at least regarding the beginning of these simulations, I got the impression that there is no difference when comparing the .tr files of two runs of the same type of simulation - i.e., there is no real random component involved that would cause a different sequence of events, at least given the same seed values (via --simIdx=). I'm not saying that there has to be one, but I'm aware that this is possible even when using the same seed - it depends on the model.
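For context, ns-3's randomness is driven by a global seed and run number; a minimal sketch of how a per-run index is typically wired into them (whether run-parallel-sim.sh maps --simIdx exactly like this is my assumption, not something I verified):

#include "ns3/core-module.h"

using namespace ns3;

int
main (int argc, char *argv[])
{
  uint32_t simIdx = 0;
  CommandLine cmd;
  cmd.AddValue ("simIdx", "Per-run index (assumed to select the RNG run number)", simIdx);
  cmd.Parse (argc, argv);

  // Same seed and same run number => identical pseudo-random streams,
  // hence an identical sequence of events and identical .tr files.
  RngSeedManager::SetSeed (1);
  RngSeedManager::SetRun (simIdx);

  Simulator::Run ();
  Simulator::Destroy ();
  return 0;
}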
Yes, it looks like the only difference between your simulations and the presented results in the repo is the simulation duration. Unfortunately, I don't remember the durations I used, but 0.5 seconds per simulation sounds about right. Note that MsgTraces-SlowdownAnalysis.ipynb has a variable called saturationTime, and messages that started before this time are ignored. This is used to measure the performance of the messages that were active after Homa stabilized in the network. The default value is 0.1 seconds (the simulation starts at t=3, and all messages started before t=3.1 are ignored for performance measurements). Since your simulations are 0.3 seconds long, you are only considering 0.2 seconds' worth of traffic. This might be a factor in the difference.
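In other words, the analysis keeps only messages whose start time lies past simulationStart + saturationTime. A rough sketch of that filter (in C++ for concreteness and with illustrative names; the notebook itself is Python):

#include <cstdint>
#include <vector>

struct MsgRecord
{
  double startTime;        // absolute simulation time in seconds
  double completionTime;   // absolute simulation time in seconds
  uint64_t msgSizeBytes;
};

// Keep only messages that started after the network is assumed to have saturated.
std::vector<MsgRecord>
FilterSaturatedMsgs (const std::vector<MsgRecord> &msgs,
                     double simStartTime = 3.0,    // simulations start at t = 3 s
                     double saturationTime = 0.1)  // default used by the notebook
{
  std::vector<MsgRecord> kept;
  for (const auto &m : msgs)
    {
      if (m.startTime >= simStartTime + saturationTime)
        {
          kept.push_back (m);
        }
    }
  return kept;
}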
Please let me know if things change when you run homa-v1.0 with 4 parallel simulations that are 0.5 seconds each (RTX disabled).
Hi @serhatarslan-hub,
the 4 simulations in parallel, using tag homa-v1.0 (commit 5308948) for network load 80% with --disableRtx for 0.5 seconds, resulted in the following slowdown plot:

To me, the results "optically" look rather identical to the slowdown plot from the 4 simulations in parallel using commit 3cac711 (essentially homa-v1.1) for network load 80% with --disableRtx for (now also) 0.5 seconds:
And doing an md5sum on the .tr files indeed shows that the sequence of events is completely identical:
df97b493e816292853261aa3d587df1d my-4-parallel-sims-noRtx_g3cac7119bfdc/MsgTraces_W5_load-80p_0.tr
318cb9eed7aec00826fa863d4d004910 my-4-parallel-sims-noRtx_g3cac7119bfdc/MsgTraces_W5_load-80p_1.tr
e6500a28570fecb403d0fe74902cd152 my-4-parallel-sims-noRtx_g3cac7119bfdc/MsgTraces_W5_load-80p_2.tr
204c8ed94255cb912fbcbd2578fe10d5 my-4-parallel-sims-noRtx_g3cac7119bfdc/MsgTraces_W5_load-80p_3.tr
df97b493e816292853261aa3d587df1d my-4-parallel-sims-noRtx_homa-v1.0/MsgTraces_W5_load-80p_0.tr
318cb9eed7aec00826fa863d4d004910 my-4-parallel-sims-noRtx_homa-v1.0/MsgTraces_W5_load-80p_1.tr
e6500a28570fecb403d0fe74902cd152 my-4-parallel-sims-noRtx_homa-v1.0/MsgTraces_W5_load-80p_2.tr
204c8ed94255cb912fbcbd2578fe10d5 my-4-parallel-sims-noRtx_homa-v1.0/MsgTraces_W5_load-80p_3.tr
I also checked the .tr files and plots for 50% load ... same story there: homa-v1.1 and homa-v1.0 results are identical for a simulated duration of 0.5 seconds.
I don't know what I am missing.
I think I figured out the issue here.
Take a look at the figure for the total number of active messages throughout the simulation I ran to obtain the results above.
The x-axis is the time. Note that the experiment continues until 3.5 seconds (it starts at t=3 seconds); the messages/flows complete after that. The same figure reveals that the number of active messages does not completely saturate within 3.5 seconds, so I would recommend simulating longer, e.g., ~3-5 seconds, to measure the saturated performance.
Let me know if this helps.
Hi @serhatarslan-hub ,
for reference and completeness, here are the two graphs generated by MsgTraces-ActiveMsgCntAnalysis.ipynb from my 4 simulations in parallel using commit 5308948 (homa-v1.0) for network load 80%, with --disableRtx and a --duration of 0.5 seconds (titles modified to not include the network load in percent):
As can be seen, the saturation behavior (and the behavior in general) is also different, but not "extremely" different, compared with the graph you mentioned.
I agree that saturation is not reached after 0.1 seconds of MsgGeneratorApp activity - neither in the graph you mentioned nor in mine. And I agree that it thus makes sense to try running the simulation for a longer period of time.
However, I still do not understand how all of this can explain the clear difference between your slowdown graph and mine, given the similarity of the TotNActiveMsgs graphs, the same --duration (0.5 seconds), and assuming the graphs in the repository come from the same raw data, resulting from executing the same code.
Do you think there is a chance that the included MsgComletionSlowdown graph resulted from different raw data than the TotNActiveMsgs graph? The .png metadata just suggests that they were generated using the same matplotlib version.
I understand your concern. Unfortunately, I cannot quite remember the exact simulation duration I used to generate the slowdown figures. Yes, the commit names and default scripts suggest 0.5 seconds. However, we cannot find another reason for the difference. I vaguely remember a discussion I had with my teammates about how duration can change the simulation results. In fact, this was why I created the "number of active messages" figure in the first place.
Hi @joft-mle and @serhatarslan-hub,
I also ran into the same problem about a month ago, but I saw this issue only now. I was actually able to figure out the problem and reproduce the graphs included in the repository.
The problem is how the priorities of incoming packets are retrieved in the Homa queue-disc.
The current implementation looks for the SocketIpTosTag, which should contain the priority of the message as set by the Homa protocol. The problem is that no packet carries this tag when it arrives at the queue-disc. When a packet is sent, it goes through the ipv4 stack, and the Send function of Ipv4L3Protocol removes the SocketIpTosTag (if there is one) and sets the TOS field in the ipv4 header accordingly.
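To illustrate, the tag-based lookup fails roughly like this (a sketch of the pattern described above, not the exact code in the repository; item is the QueueDiscItem handed to the queue-disc):

// By the time the packet reaches the queue-disc, Ipv4L3Protocol::Send has
// already removed the SocketIpTosTag, so the lookup below never succeeds
// and the priority silently stays 0.
SocketIpTosTag tosTag;
uint8_t priority = 0;
if (item->GetPacket ()->PeekPacketTag (tosTag))
  {
    priority = tosTag.GetTos ();
  }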
To fix this, we have to use the TOS field of the ipv4 header instead of the SocketIpTosTag in the Homa queue-disc:
uint8_t priority = 0;
auto ipv4_item = DynamicCast<Ipv4QueueDiscItem>(item);
if (ipv4_item)
  {
    priority = ipv4_item->GetHeader().GetTos();
  }
I hope this helps.
Thank you @marvin71 for sharing your fix. It is certainly great to see a community around this project. Would you be willing to send a Pull Request for your fix?
Yes, I just opened a Pull Request (#8) containing the fix.
Maybe for context: I am currently trying to explore the Homa transport protocol using SimBricks. In particular, I am looking into comparing the behavior of the ns-3 Homa implementation to the Linux implementation in SimBricks.
Thank you @marvin71 once again. I have merged the PR. Also, it is exciting that you are comparing the behaviors of ns-3 and Linux implementations. Please feel free to share your findings with us.
@joft-mle would you be able to re-run your experiments and see if the issue has been solved? I believe we can close this GitHub issue when you confirm.
Indeed, after applying the change @marvin71 provided, a first re-run of the default experiment (effective duration of 0.5 seconds, assuming 0.1 seconds until saturation, with --disableRtx, 4 independent runs in parallel) resulted in almost identical graphs compared to what's in the repository.
For reference, here are my usual, slightly modified (no "pFabric" numbers) graphs for 80% load:
The same is true for the other case, 50% load.
Thank you very much, @marvin71 for debugging and @serhatarslan-hub for merging!
Hi @serhatarslan-hub,
I would like to ask which version of the code in this repository was used to produce the results shown in MsgComletionSlowdown_W5_load-80p.png and the like?
Following the instructions to build and run scratch/HomaL4Protocol-paper-reproduction.cc results in substantially different graphs for us - especially regarding the "slowdown". The graphs in this repo show a slowdown over message size that more or less closely "follows" the included OMNeT++ results. When trying to reproduce this (80% load, P99), we get a more or less constant slowdown of about 7-8 for all message sizes until shortly past ~4.866M. For larger message sizes, it then "follows" the OMNeT++ data:
However, the characteristic "jitter" around message size 150k does exist, for example.
Any idea what we are missing?