aliyun / SimAI

Simulation asserts failed when using HPCC #10

Closed HeRaNO closed 1 week ago

HeRaNO commented 2 weeks ago

Reproduce

  1. Use the same setup as in #6, with the potential fix mentioned there applied.
  2. Turn on NS3_ASSERT (https://github.com/aliyun/ns-3-alibabacloud/blob/master/simulation/CMakeLists.txt#L32) and NS3_LOG (https://github.com/aliyun/ns-3-alibabacloud/blob/master/simulation/CMakeLists.txt#L35)
  3. Change CC_MODE to 3 (https://github.com/aliyun/SimAI/blob/master/astra-sim-alibabacloud/inputs/config/SimAI.conf#L14)
  4. Run the simulation

Logs

assert failed. cond="ih.nhop <= IntHeader::maxHop", +0.000559210s 116 file=/root/SimAI/astra-sim-alibabacloud/extern/network_backend/ns3-interface/simulation/src/point-to-point/model/rdma-hw.cc, line=954

Printing ih.nhop shows:

nhop: 31245

If the number of parallel threads is changed from 3 to 1, ih.nhop changes to:

nhop: 23053

Potential Fix

https://github.com/aliyun/SimAI/blob/master/astra-sim-alibabacloud/astra-sim/network_frontend/ns3/common.h#L685

Add an else before this line.
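
To illustrate why a single missing else can matter, here is a minimal C++ sketch. It assumes the line in question sits inside an if chain that picks a per-mode setting from CC_MODE. The enum, the function names, and every mode number other than 3 are hypothetical and are not copied from common.h.

```cpp
#include <cassert>

// Hypothetical stand-in for a per-mode header setting (not the real ns-3 type).
enum class HeaderMode { None, Normal, Pint };

// Without the else: the second `if` starts a new chain, so its trailing
// `else` also runs when cc_mode == 3 and overwrites the earlier choice.
HeaderMode select_mode_missing_else(int cc_mode) {
    HeaderMode mode = HeaderMode::None;
    if (cc_mode == 3)
        mode = HeaderMode::Normal;
    if (cc_mode == 10)  // hypothetical second mode; `else` is missing here
        mode = HeaderMode::Pint;
    else
        mode = HeaderMode::None;
    return mode;
}

// With the else: the branches form one chain and exactly one of them runs.
HeaderMode select_mode_fixed(int cc_mode) {
    HeaderMode mode = HeaderMode::None;
    if (cc_mode == 3)
        mode = HeaderMode::Normal;
    else if (cc_mode == 10)
        mode = HeaderMode::Pint;
    else
        mode = HeaderMode::None;
    return mode;
}
```

In this sketch, mode 3 silently ends up with the fallback setting in the first function and with its intended setting in the second. Whether this is exactly what happens at the linked line is an assumption.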