bpatient78 / IETF121-HP-WAN-BoF

HP-WAN mailing list discussion

congestion control algorithms for HP-WAN #5

Open bpatient78 opened 3 months ago

bpatient78 commented 3 months ago
xiongquan1230 commented 3 months ago

Question 1: What is the special, novel, IETF-specific problem for congestion control algorithms in HP-WAN?

Some points expressed during discussion so far.

a) BBR should be less susceptible to packet loss. Why not use, e.g., BBR and window scaling? Is there even a problem that needs solving? BBR is not specific to HP-WAN. https://mailarchive.ietf.org/arch/msg/hp-wan/RtycnJAW_nTq2CzuFSCLiOKYY44/ https://mailarchive.ietf.org/arch/msg/hp-wan/KMFuaiPyzViNQvRgEj-BFaJa2Bc/ https://mailarchive.ietf.org/arch/msg/hp-wan/780fDjI0iMOgttC8Jc_CwK3tUjc/

b) What is mostly needed here is congestion control that works over the WAN. DCQCN would not tolerate such a long feedback loop. Would HPCC tolerate it? (I believe not.) BBR is strictly outside the HPC discussion. CC here means DCQCN/HPCC (DCTCP for old installations). https://mailarchive.ietf.org/arch/msg/hp-wan/eOz6gCqjzCb5DWQn_tgyxhIw6iQ/ https://mailarchive.ietf.org/arch/msg/hp-wan/cmDyEaaITIfCzeEkiOT-ukuSb9k/

c) The IETF is very relevant for enterprise WAN, campus, or DC. The IETF is not just for "the Internet". CC depends about 70% on the CCA/host side and about 30% on the AQM/router side. Integrability is therefore a valid point and an IETF job: coordination is required between hosts and routers. https://mailarchive.ietf.org/arch/msg/hp-wan/DRBfmI956w_JHmT1c7ak6lzPVQw/ https://mailarchive.ietf.org/arch/msg/hp-wan/BKycAqPwZ56GxZRMnUtqYYuUeso/

xiongquan1230 commented 3 months ago

Question 2: What is the applicability of RDMA techniques used across the WAN to carry massive data, negating the need for TCP or UDP modifications and the subsequent (L2/L3) overhead?

Some points expressed during discussion so far.

a) RoCE encapsulates InfiniBand, but refactoring it is just very expensive. iWARP would be much better because it assumes normal TCP/IP (and so better tolerates the WAN reality). Emulating InfiniBand over the Internet is impossible. It is out of IETF scope. https://mailarchive.ietf.org/arch/msg/hp-wan/zWXzv7Tx64coRnKOUyT8zSOvQnA/ from Vasilenko Eduard https://mailarchive.ietf.org/arch/msg/hp-wan/6VLCV_pBR5GCsqmDLecUHwcOTWA/ from Brian E Carpenter

b) Using RoCE for distributed AI training https://mailarchive.ietf.org/arch/msg/hp-wan/9N7KoMpOfKr4_bDro_tWyIzv-Kg/ from Hesham ElBakoury

xiongquan1230 commented 3 months ago

Question 3: What about modifying TCP window sizes or using a modified TCP or UDP stack? Several initiatives seem to exist for managing super-large data flows using TE technologies, QUIC, and other techniques.

This question was raised by King, Daniel. The mailing list link is https://mailarchive.ietf.org/arch/msg/hp-wan/L3Sq0FoTemJMFdmWgwH3mxr4uVE/
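
For a rough sense of scale behind the window-size question, here is a minimal sketch that computes the bandwidth-delay product (BDP) of a path and the smallest TCP window scale shift that would cover it. The link rates and RTTs in the usage note are illustrative assumptions, not figures from the mailing list; the function names are hypothetical. The 16-bit window field and the maximum shift of 14 come from the TCP window scale option (RFC 7323).

```python
from typing import Optional

TCP_MAX_UNSCALED_WINDOW = 65_535  # largest value of the 16-bit TCP window field
TCP_MAX_WINDOW_SHIFT = 14         # largest shift allowed by the window scale option


def bdp_bytes(link_bps: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to fill the path (rounded up).

    link_bps and rtt_ms are integers so the arithmetic stays exact.
    """
    bits_times_ms = link_bps * rtt_ms
    return -(-bits_times_ms // 8000)  # ceiling division: /8 for bytes, /1000 for ms


def min_window_shift(window_bytes: int) -> Optional[int]:
    """Smallest window scale shift whose maximum window covers window_bytes.

    Returns None if even the maximum shift is not enough.
    """
    for shift in range(TCP_MAX_WINDOW_SHIFT + 1):
        if (TCP_MAX_UNSCALED_WINDOW << shift) >= window_bytes:
            return shift
    return None
```

Under these assumptions, a 10 Gb/s path with a 50 ms RTT has a BDP of 62,500,000 bytes, which a shift of 10 covers. A hypothetical 100 Gb/s path with a 100 ms RTT has a BDP of 1,250,000,000 bytes, which exceeds the largest scaled window (65,535 << 14, about 1 GiB), so min_window_shift returns None for a single connection.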

gorryfair commented 1 month ago

Before proposing work on RoCE, you may need to confirm whether open standards development for RoCE is within the scope of the IETF.

bpatient78 commented 1 month ago

> Before proposing work on RoCE, you may need to confirm whether open standards development for RoCE is within the scope of the IETF.

Hi Gorry, thanks for the kind reminder. To clarify, the RoCE-related work is only background information. I think only iWARP-related work or new open RDMA standards can be proposed in the IETF.