PlatformLab / Homa

Low-Latency Data Center Network Transport
ISC License

Some problems with this implementation of Homa #2

Closed btlcmr0702 closed 5 years ago

btlcmr0702 commented 6 years ago

Hi, I read the SIGCOMM 2018 paper on Homa, and I have also skimmed the code in this repo. I have several questions about this implementation. For example, in Homa/src/Protocol.h there is nothing about priority in the Grant or Data packet headers. In Homa/src/Sender.cc, I find that in the Sender::sendMessage function the packets are sent without any consideration of priority (the priority is just set to 0 on line 119). Likewise, in Homa/src/Receiver.cc there seems to be no priority handling when GRANT packets are sent.

So is this implementation far from what is described in the SIGCOMM 2018 paper, or am I looking in the wrong places? Is this implementation the final version used in the paper?

Looking forward to your reply. Thanks!

yilongli commented 6 years ago

This repository contains a new userspace standalone implementation of Homa that is currently a work in progress. The implementation used in the SIGCOMM'18 paper is embedded inside RAMCloud: https://github.com/PlatformLab/RAMCloud/blob/master/src/HomaTransport.h. If you are interested in reproducing the evaluation results reported in the paper, the instructions are here: https://github.com/PlatformLab/homa-paper-artifact.

behnamm commented 6 years ago

In addition to the implementation of Homa in RAMCloud that Yilong mentioned, there's also a separate repository for the Homa simulation code. That repository is located at https://github.com/PlatformLab/HomaSimulation/ and the code is built on top of the OMNeT++ simulator. Instructions for installing the simulation code are provided here: https://github.com/PlatformLab/HomaSimulation/tree/omnet_simulations/RpcTransportDesign/OMNeT%2B%2BSimulation

Ping me if you are interested in running the network simulation code and reproducing the simulation results in the paper.

btlcmr0702 commented 6 years ago

Hi @behnamm, thanks for your reply. I also found the repo PlatformLab/HomaModule, which seems to be yet another implementation of Homa, this time in the Linux kernel. I wonder why there are three implementations of Homa (this repo, the kernel version, and the one embedded inside RAMCloud)?

behnamm commented 6 years ago

Hi @btlcmr0702. We first started by implementing Homa in the RAMCloud code base because RAMCloud has a flexible transport management system that allowed us to plug Homa in easily. This way, RAMCloud could be used as an application on top of the Homa transport, which simplified our initial development and performance debugging of Homa. Furthermore, with RAMCloud as the workload generator, we could easily evaluate Homa's performance against InfiniBand and TCP, since RAMCloud can be configured to use any of these transports.

However, the implementation of Homa in RAMCloud is tied to RAMCloud's RPC system and communication plumbing, so it won't be usable for other applications. We therefore started two separate projects to implement Homa as standalone packages that developers can use in their own applications:

1) The user-space (i.e., kernel-bypass) implementation here, https://github.com/PlatformLab/Homa, depends on Intel's DPDK and requires the network admins to enable priorities (i.e., QoS levels) in the network switches. This project is aimed at developers who want the lowest possible latency and the best performance Homa can provide. These developers should be willing to bypass the kernel and, to some degree, sacrifice the isolation, multi-tenancy, and security that the Linux kernel provides; more importantly, they should be willing to build or change their applications to conform to the API that Homa provides. In return, they get a blazingly fast transport and RPC system.

2) The kernel implementation of Homa here, https://github.com/PlatformLab/HomaModule, doesn't depend on any third-party technology beyond requiring network priorities to be enabled in the fabric, and, as the name suggests, it runs in the kernel. Our goal with this implementation, more than anything, is ease of adoption for Homa in production datacenters; our hope is to have it eventually comply with the Linux socket interface. We predict that this implementation won't be as fast as the user-space one, but it should still be orders of magnitude faster than TCP, InfiniBand, and other existing transports. We'd like developers to enjoy most of Homa's benefits without changing much in their applications, so using this implementation should be as easy as loading a kernel module in Linux, while keeping all the benefits the Linux kernel provides.

That said, these two projects are still ongoing work, and we can't predict when they will be complete.

Hope this helps.

btlcmr0702 commented 6 years ago

@behnamm Thanks a lot! 👍 Now I finally understand the history behind the Homa implementations. One more small question: in section 4 of the paper, you write

The RAMCloud implementation of Homa includes all of the features described in this paper except that it does not yet measure incoming message lengths on the fly (the priorities were precomputed based on knowledge of the benchmark workload).

So I wonder: if I don't know the workload in advance, how should I set the priorities?

behnamm commented 6 years ago

@btlcmr0702 For workloads W3, W4, and W5 from the Homa paper, we have precomputed the priority cutoffs; they are set in this script: https://github.com/PlatformLab/RAMCloud/blob/master/benchmarks/homa/scripts/run_workload.sh

So, if your goal is to reproduce the results of the paper, the correct priority cutoffs under the Homa scheme are available in that script.

However, if you want to run Homa in RAMCloud with an arbitrary workload, I'm afraid the implementation doesn't have the code to automatically measure the workload and set priorities; that's a current limitation of the Homa implementation in RAMCloud. At the moment, for arbitrary workloads with RAMCloud, you'll need to hand-compute the priority cutoffs and set them in the script. (Read the code in the script to get an idea of how to add new workloads and set priority cutoffs.)
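To give a rough idea of what hand-computing the cutoffs involves: the paper's approach is to choose cutoffs so that each unscheduled priority level carries roughly an equal share of the workload's bytes. The sketch below is illustrative only; `computeCutoffs` and the `byteCdf` representation are made-up names for this example, not code from RAMCloud or the script.

```cpp
// Minimal sketch: derive priority cutoffs from a measured message-size
// distribution. Illustrative only; names are invented for this example.
#include <cstdint>
#include <map>
#include <vector>

// byteCdf maps message size -> cumulative fraction of bytes carried by
// messages of that size or smaller (weighted by bytes, not message count).
std::vector<uint32_t> computeCutoffs(const std::map<uint32_t, double>& byteCdf,
                                     int numUnschedPrios)
{
    std::vector<uint32_t> cutoffs;  // cutoffs[i] = largest size using prio i
    double step = 1.0 / numUnschedPrios;
    double nextBoundary = step;
    for (const auto& [size, cumFraction] : byteCdf) {
        // Close off a priority level whenever the CDF crosses the next
        // equal-share boundary.
        while (cumFraction >= nextBoundary &&
               cutoffs.size() + 1 < static_cast<size_t>(numUnschedPrios)) {
            cutoffs.push_back(size);
            nextBoundary += step;
        }
    }
    cutoffs.push_back(UINT32_MAX);  // all larger sizes use the lowest level
    return cutoffs;
}
```

This is only the general idea; see the script for the actual cutoff values used for W3, W4, and W5.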

That said, for the two ongoing standalone implementations I mentioned above, we plan to implement automatic computation and assignment of priority cutoffs, so hopefully this won't be an issue in the new implementations.

Cheers,

btlcmr0702 commented 6 years ago

Hi, I have a question from reading the Homa code. In the function HomaTransport::dataPacketArrive in RAMCloud/src/HomaTransport.cc, I see that each time the receiver receives a data packet, it finds the first active message that could use a GRANT and computes the priority to use for it. So how does Homa implement the overcommitment mechanism (from the SIGCOMM paper) if it only grants to the first active message at a time? In the paper, it says

There is no way for a receiver to know whether a particular sender will respond to grants, so the only way to keep the downlink fully utilized is to overcommit: a receiver must grant to more than one sender at a time, even though its downlink can only support one of the transmissions at a time.

yilongli commented 6 years ago

@btlcmr0702 Each incoming message can have at most one RTT's worth of extra bytes that are granted but not yet received. Therefore, when the first active message reaches this limit, the grant goes to the next active message.
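Roughly, the logic looks like this (a hypothetical sketch, not the actual code in HomaTransport.cc; all names are illustrative):

```cpp
// Hypothetical sketch of the receiver-side granting logic described above.
#include <algorithm>
#include <cstdint>
#include <vector>

struct ActiveMessage {
    uint32_t receivedBytes;  // contiguous bytes received so far
    uint32_t grantedBytes;   // highest byte offset granted so far
    uint32_t totalLength;    // total length of the message
};

static void sendGrant(ActiveMessage*, uint32_t /*offset*/, int /*priority*/)
{
    // Placeholder: build and transmit a real GRANT packet here.
}

// Called when a data packet arrives. activeBySRPT is sorted with the
// fewest remaining bytes first (SRPT order).
void grantToActiveMessages(std::vector<ActiveMessage*>& activeBySRPT,
                           uint32_t rttBytes, int overcommitDegree)
{
    int considered = 0;
    for (ActiveMessage* m : activeBySRPT) {
        if (considered++ == overcommitDegree)
            break;  // grant to at most overcommitDegree senders at a time
        // A message may have at most one RTT of granted-but-unreceived
        // bytes outstanding; if it is at the limit, move to the next one.
        if (m->grantedBytes - m->receivedBytes >= rttBytes)
            continue;
        uint32_t newOffset =
            std::min(m->receivedBytes + rttBytes, m->totalLength);
        if (newOffset > m->grantedBytes) {
            m->grantedBytes = newOffset;
            sendGrant(m, newOffset, /*priority=*/considered - 1);
        }
    }
}
```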

btlcmr0702 commented 6 years ago

@yilongli Oh, I see. Thanks a lot! Another small question: how do I set the degree of overcommitment? Do I just set it to the number of priority levels for scheduled packets?

yilongli commented 6 years ago

@btlcmr0702 The degree of overcommitment can be larger than the number of priority levels for scheduled packets. In general, you need to set it large enough that Homa can sustain a high network load.

btlcmr0702 commented 6 years ago

@yilongli Thanks for your reply. So the degree of overcommitment is a static value rather than a dynamic one? I just need to tune it to make sure Homa can sustain a high network load.

Besides that, I find that in your simulation each grant invites only one data packet, while in the implementation, because of the offset parameter in the grant packet, a single grant can invite several data packets. How should the receiver decide the offset parameter in each grant packet for the different messages from clients?

yilongli commented 6 years ago

@btlcmr0702 I think we have only experimented with a static degree of overcommitment so far. I am not sure I understand your question about the grant offset correctly. A receiver maintains the grant offsets for all of its incoming messages. The fact that one grant packet may correspond to several data packets is merely an implementation optimization to reduce the number of outgoing grant packets. Does that answer your question?

btlcmr0702 commented 6 years ago

@yilongli Sorry for the late reply. I asked mainly because of the definition of grant packets I saw in the paper. [figure: grant packet definition from the paper] Usually one "token" (grant) packet corresponds to exactly one data packet in other work such as pHost, so I wonder whether the same holds in Homa, or whether a grant packet can correspond to several data packets. If so, how should the offset value be set: to RTTbytes, or to one MTU's worth of bytes?

yilongli commented 6 years ago

@btlcmr0702 Yes, in theory one grant packet corresponds to one data packet in Homa. For example, if the n-th grant for some incoming message has offset 10000, then the (n+1)-th grant will have offset 10000+MTU and the (n+2)-th grant will have offset 10000+2*MTU. However, in the implementation we may collapse multiple grant packets for the same incoming message into one for efficiency. For example, if we were about to send out the 3 grant packets described above, it would be sufficient to send only the last one.

Update: the first grant packet has offset RTTbytes + MTU.
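In other words (illustrative arithmetic only, not RAMCloud code; the helper name is made up):

```cpp
// With rttBytes = 10000 and mtu = 1000, successive grants for one message
// carry offsets 11000, 12000, 13000, ...
#include <cstdint>

uint32_t nthGrantOffset(uint32_t rttBytes, uint32_t mtu, uint32_t n)
{
    // n = 1 for the first grant; each grant extends the sender's transmit
    // limit by one MTU beyond the unscheduled (RTT-bytes) region.
    return rttBytes + n * mtu;
}

// Collapsing: if grants with offsets 11000, 12000, and 13000 would all go
// out back to back, sending only the one with offset 13000 is enough,
// since a grant authorizes the sender to transmit everything up to its
// offset.
```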

btlcmr0702 commented 6 years ago

@yilongli Hi, sorry to bother you again; I'm still a little confused about grant packets. For example, suppose the sender first sends 6 unscheduled data packets (assume the message is large enough). Will the receiver then send 6 grant packets, one corresponding to each data packet? If so, those 6 grants will be held at the sender, which can use them to send the next 6 scheduled packets? Or does the receiver send just one grant for the first scheduled data packet (the 7th) after the 6 unscheduled packets, and only after that 7th scheduled packet arrives does it send the next grant?

yilongli commented 6 years ago

@btlcmr0702 Your first interpretation is correct. The receiver sends out the first grant packet when it receives the first data packet.
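To make the pipeline concrete, here is a hypothetical trace for the 6-unscheduled-packet example (one grant issued per arriving data packet):

```
sender:   data[1..6] sent immediately (unscheduled, about one RTT of bytes)
receiver: data[1] arrives -> grant for packet 7
          data[2] arrives -> grant for packet 8
          ...
          data[6] arrives -> grant for packet 12
sender:   data[7] goes out as soon as its grant arrives; granting then
          continues packet by packet, keeping roughly one RTT of data
          in flight for the message.
```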

mustansarsaeed commented 5 years ago

Hi @behnamm, I have reserved the chassis as mentioned in the Homa artifact repository and ran the command you mentioned in https://github.com/PlatformLab/Homa/issues/2#issuecomment-433672774, but it does not produce the results from the paper for any of the workloads. I am trying to reproduce the results for workload W3, but the slowdown script fails to produce the graphs; please see the attached file slowdownImpl.pdf. I have also filed an issue on the Homa artifact repository. Can you please help me with this?