Open bpatient78 opened 3 months ago
Question 1: What is the special, novel, IETF-specific problem for congestion control algorithms in HP-WAN?
Some points expressed during discussion so far.
a) BBR should be less susceptible to packet loss. Why not use, e.g., BBR and window scaling? Is there even a problem that needs solving? BBR is not specific to HP-WAN. https://mailarchive.ietf.org/arch/msg/hp-wan/RtycnJAW_nTq2CzuFSCLiOKYY44/ https://mailarchive.ietf.org/arch/msg/hp-wan/KMFuaiPyzViNQvRgEj-BFaJa2Bc/ https://mailarchive.ietf.org/arch/msg/hp-wan/780fDjI0iMOgttC8Jc_CwK3tUjc/
b) What is mostly needed here is congestion control that works over the WAN. DCQCN would not tolerate such a long feedback loop. Would HPCC tolerate it? (I believe not.) BBR is strictly outside the HPC discussion; CC here means DCQCN/HPCC (DCTCP for old installations). https://mailarchive.ietf.org/arch/msg/hp-wan/eOz6gCqjzCb5DWQn_tgyxhIw6iQ/ https://mailarchive.ietf.org/arch/msg/hp-wan/cmDyEaaITIfCzeEkiOT-ukuSb9k/
c) The IETF is very relevant for enterprise WAN, campus, or DC; it is not just for "the Internet". CC depends about 70% on the CCA/host side and about 30% on the AQM/router side. This is a valid point for integrability, and it is the IETF's job: coordination is required between hosts and routers. https://mailarchive.ietf.org/arch/msg/hp-wan/DRBfmI956w_JHmT1c7ak6lzPVQw/ https://mailarchive.ietf.org/arch/msg/hp-wan/BKycAqPwZ56GxZRMnUtqYYuUeso/
Question 2: What is the applicability of RDMA techniques being used across the WAN to carry massive data, negating the need for TCP or UDP modifications and subsequent (L2/L3) overhead?
Some points expressed during discussion so far.
a) RoCE encapsulates InfiniBand, but it is very expensive to refactor. iWARP would be much better because it assumes normal TCP/IP (and so better tolerates the WAN reality). Emulating InfiniBand over the Internet is impossible. It is out of IETF scope. https://mailarchive.ietf.org/arch/msg/hp-wan/zWXzv7Tx64coRnKOUyT8zSOvQnA/ from Vasilenko Eduard https://mailarchive.ietf.org/arch/msg/hp-wan/6VLCV_pBR5GCsqmDLecUHwcOTWA/ from Brian E Carpenter
b) Using RoCE for distributed AI training https://mailarchive.ietf.org/arch/msg/hp-wan/9N7KoMpOfKr4_bDro_tWyIzv-Kg/ from Hesham ElBakoury
Question 3: What about modifying TCP window sizes or using a modified TCP or UDP stack? Several initiatives seem to be for managing super-large data flows using TE technologies, QUIC, and other techniques.
This question was raised by King, Daniel. The mailing list link is https://mailarchive.ietf.org/arch/msg/hp-wan/L3Sq0FoTemJMFdmWgwH3mxr4uVE/
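To make the window-size question concrete, here is a rough sketch of the arithmetic behind it: the bandwidth-delay product (BDP) of a path and the TCP window scale shift a single connection would need to fill it. The link rates and RTTs are illustrative assumptions, not figures from the mailing list, and the function names are hypothetical. The 16-bit window field and the maximum shift of 14 come from RFC 7323.

```python
MAX_UNSCALED_WINDOW = 65535  # bytes; the TCP header window field is 16 bits
MAX_WINDOW_SHIFT = 14        # upper bound on the window scale shift (RFC 7323)


def bdp_bytes(rate_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes, rounded up (integer arithmetic)."""
    bits_times_ms = rate_bps * rtt_ms
    return -(-bits_times_ms // 8000)  # divide by 8 bits/byte and 1000 ms/s


def window_shift_needed(window_bytes: int):
    """Smallest scale shift whose maximum window covers window_bytes.

    Returns None when even the maximum shift is not enough, i.e. a single
    TCP connection cannot advertise a window that large.
    """
    for shift in range(MAX_WINDOW_SHIFT + 1):
        if (MAX_UNSCALED_WINDOW << shift) >= window_bytes:
            return shift
    return None


def unscaled_throughput_bps(rtt_ms: int) -> int:
    """Throughput ceiling (bits/s) of one connection without window scaling."""
    return MAX_UNSCALED_WINDOW * 8 * 1000 // rtt_ms


# Assumed example paths: 10 Gbit/s at 50 ms RTT, and 100 Gbit/s at 100 ms RTT.
bdp_10g = bdp_bytes(10_000_000_000, 50)
bdp_100g = bdp_bytes(100_000_000_000, 100)
print(bdp_10g, window_shift_needed(bdp_10g))
print(bdp_100g, window_shift_needed(bdp_100g))
print(unscaled_throughput_bps(50))
```

Under these assumptions the 10 Gbit/s path fits within a scaled window, while the 100 Gbit/s path exceeds what one connection can advertise even at the maximum shift, which is one way to frame why parallel flows or a modified stack come up in this question.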
Before proposing work on RoCE, you may need to confirm whether open standards development for RoCE is within the scope of the IETF.
Hi, Gorry, thanks for the kind reminder. I want to clarify that the RoCE-related work is just background information. I think only iWARP-related work or new open RDMA standards can be proposed in the IETF.
BBR
https://datatracker.ietf.org/meeting/104/materials/slides-104-iccrg-an-update-on-bbr-00
https://datatracker.ietf.org/meeting/105/materials/slides-105-iccrg-bbr-v2-a-model-based-congestion-control-00
https://www.ietf.org/proceedings/106/slides/slides-106-iccrg-update-on-bbrv2-00
slides-109-iccrg-update-on-bbrv2-00 (ietf.org)
slides-110-iccrg-bbr-updates-00.pdf (ietf.org)
https://datatracker.ietf.org/meeting/112/materials/slides-112-iccrg-bbrv2-update-00
BBR: IETF 117 (ICCRG) [Jul 2023]
BBRv3 in the public Internet: a boon or a bane? https://dl.acm.org/doi/10.1145/3673422.3674889
BBR in Linux https://www.techrepublic.com/article/how-to-enable-tcp-bbr-to-improve-network-speed-on-linux/
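As a complement to the system-wide setting described in the article above, an application on Linux can also request a congestion control algorithm per socket through the `TCP_CONGESTION` socket option. The following is a minimal sketch, assuming a Linux host where the `bbr` module is available; the helper name `try_set_congestion_control` is hypothetical. On other platforms, or when the request is refused, the helper simply reports whatever the kernel is using.

```python
import socket


def try_set_congestion_control(sock, name="bbr"):
    """Ask the kernel to use `name` on this socket; return the algorithm in use.

    Returns None when TCP_CONGESTION is not exposed on this platform or
    cannot be queried. If the requested algorithm is unavailable, the
    socket keeps its current one and that name is returned instead.
    """
    opt = getattr(socket, "TCP_CONGESTION", None)  # Linux-only constant
    if opt is None:
        return None
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, name.encode())
    except OSError:
        pass  # e.g. module not loaded or not permitted; keep the default
    try:
        raw = sock.getsockopt(socket.IPPROTO_TCP, opt, 16)
    except OSError:
        return None
    return raw.split(b"\x00", 1)[0].decode()


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
in_use = try_set_congestion_control(sock, "bbr")
sock.close()
print("congestion control in use:", in_use)
```

Reading the option back after setting it is deliberate: it shows which algorithm the connection will actually run, rather than assuming the request succeeded.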