sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[Soaking Test] In long term test with DMAC mobility, the memory runs out. #4357

Open tim-rj opened 4 years ago

tim-rj commented 4 years ago

Description

Steps to reproduce the issue:

  1. Port 1 and Port 2 are members of Vlan2, and each port is connected to the tester.
  2. Send the same traffic, with 2k DMACs, to both ports.

Describe the results you received: Memory continues to grow until memory runs out.

Describe the results you expected: Memory shouldn't run out.

Additional information you deem important (e.g. issue happens only occasionally):

**Output of `show version`:**

```
SONiC Software Version: SONiC.201811_rj.0-85995e0-20200307.004601
```

From a dump of memory information, the cause is that syncd puts MAC address change messages in a message queue and starts a thread that retrieves messages from the queue to update the Redis database. When Redis is busy, the database updates are slow, so messages back up in the queue and memory usage increases. If the interface is shut down or the stream is stopped, memory usage slowly decreases to its normal level.

Solution: This problem can be solved by neutralizing MAC addition and aging messages.
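One possible reading of "neutralizing" is to coalesce pending messages per MAC, so that a newer addition or aging message replaces an older one that the consumer thread has not processed yet. The sketch below is only an illustration of that idea in Python, not syncd code: the class `CoalescingFdbQueue`, the event names, and the `(vlan, mac)` key are all hypothetical. It assumes that writing only the latest state per MAC to Redis is acceptable and that deleting a key that was never written is harmless.

```python
from collections import OrderedDict

# Hypothetical event names, for illustration only.
LEARNED = "learned"
AGED = "aged"


class CoalescingFdbQueue:
    """Keeps at most one pending event per (vlan, mac) key.

    A newer event for a key replaces the older pending one, so the
    backlog is bounded by the number of distinct MACs rather than by
    the number of MAC change notifications.
    """

    def __init__(self):
        # Maps (vlan, mac) -> (event, port), ordered by latest update.
        self._pending = OrderedDict()

    def push(self, vlan, mac, event, port=None):
        key = (vlan, mac)
        # Drop any stale pending event for this key, then queue the new one.
        self._pending.pop(key, None)
        self._pending[key] = (event, port)

    def pop(self):
        # Return the oldest pending entry as (vlan, mac, event, port).
        key, (event, port) = self._pending.popitem(last=False)
        return key[0], key[1], event, port

    def __len__(self):
        return len(self._pending)


if __name__ == "__main__":
    q = CoalescingFdbQueue()
    # The same MAC moves between two ports and then ages out.
    q.push(2, "00:00:00:00:00:01", LEARNED, "Ethernet0")
    q.push(2, "00:00:00:00:00:01", LEARNED, "Ethernet4")
    q.push(2, "00:00:00:00:00:01", AGED)
    print(len(q), q.pop())
```

With this approach, 2k MACs moving back and forth between two ports leave at most 2k pending entries, however slowly the consumer drains the queue. The trade-off is that intermediate states never reach the database.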

xinliu-seattle commented 4 years ago

Need help with scaling DMAC. @tim-rj, will you be able to contribute a fix for this? Thanks.

tim-rj commented 4 years ago

Does scaling DMAC mean a large number of DMACs? After communicating with the developer team, we learned that they used jemalloc to resolve this problem. We want to confirm which solution you need. If you need the solution posted in this issue, we will work on it. BTW, it seems we are unable to submit a PR to the 201803 release. Is this branch locked? Thanks a lot.