apache / incubator-uniffle

Uniffle is a high performance, general purpose Remote Shuffle Service.
https://uniffle.apache.org/
Apache License 2.0

[Bug] SUnreclaim memory usage is too high and the memory is not released #1479

Open lifeSo opened 7 months ago

lifeSo commented 7 months ago

Describe the bug

We launched 8 Shuffle Server machines, and memory usage on one of them is very high. After diagnosing, we found that SUnreclaim is too high according to /proc/meminfo:

image

Even though the RSS process is down, the memory is not released.

image
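
For anyone who wants to check the same numbers without a screenshot, here is a minimal sketch that pulls the slab fields out of /proc/meminfo-style text. The function names `parse_meminfo` and `slab_summary` are made up for illustration and are not part of Uniffle. SUnreclaim is kernel slab memory, so it sits outside the JVM heap.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: numeric value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_summary(fields):
    """Return the slab-related fields and SUnreclaim as a share of MemTotal."""
    total = fields.get("MemTotal", 0)
    unreclaim = fields.get("SUnreclaim", 0)
    return {
        "Slab": fields.get("Slab", 0),
        "SReclaimable": fields.get("SReclaimable", 0),
        "SUnreclaim": unreclaim,
        "unreclaim_share": unreclaim / total if total else 0.0,
    }


if __name__ == "__main__":
    # Only meaningful on Linux, where /proc/meminfo exists.
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            print(slab_summary(parse_meminfo(f.read())))
```

Running this periodically and logging the output would show whether SUnreclaim keeps growing while the Shuffle Server is up, and whether it drops after the process exits.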

Affects Version(s)

0.7.0

Uniffle Server Log Output

No response

Uniffle Engine Log Output

No response

Uniffle Server Configurations

No response

Uniffle Engine Configurations

No response

Additional context

No response

rickyma commented 7 months ago

Can you run jmap -histo:live while the RSS server process is still alive, to show what occupies the memory? I think it will probably be io.netty.buffer.ReadOnlyByteBufferBuf from org.apache.uniffle.common.ShufflePartitionedBlock.
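
If the histogram is long, a small script can rank the classes by bytes. This is only a sketch: it assumes the output of jmap -histo:live was saved to a text file, and `top_classes` is a hypothetical helper name. Note that jmap reports objects on the JVM heap only.

```python
import re

# A histogram row looks like: "   1:        100       3200  some.ClassName"
# (newer JDKs append a module name in parentheses, which is ignored here).
_ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")


def top_classes(histo_text, limit=10):
    """Return up to `limit` (class name, instances, bytes) tuples, largest bytes first."""
    rows = []
    for line in histo_text.splitlines():
        match = _ROW.match(line)
        if match:  # header, separator, and "Total" lines do not match
            instances = int(match.group(1))
            size = int(match.group(2))
            rows.append((match.group(3), instances, size))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]
```

For example, after saving the output to a file (the file name histo.txt is an assumption), `top_classes(open("histo.txt").read())` lists the ten largest classes.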

lifeSo commented 7 months ago

@rickyma We didn't run jmap -histo:live, because we think the RSS memory usage is fine: it uses only about half of the memory, as shown in the picture above. The machine was restarted later. If the problem shows up again, I think we could run slabtop.
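
As a rough alternative to slabtop, the sketch below ranks slab caches from /proc/slabinfo-style text. It assumes the slabinfo 2.1 column layout (name, active_objs, num_objs, objsize, ...) and estimates each cache as num_objs * objsize, which is an approximation and not necessarily the figure slabtop reports. `largest_slab_caches` is a hypothetical name, and reading /proc/slabinfo usually requires root.

```python
def largest_slab_caches(slabinfo_text, limit=5):
    """Return up to `limit` (cache name, approx bytes) tuples, largest first.

    Approximate size = num_objs * objsize, taken from columns 3 and 4.
    """
    caches = []
    for line in slabinfo_text.splitlines():
        # Skip the version banner, the column header, and blank lines.
        if not line.strip() or line.startswith(("#", "slabinfo")):
            continue
        parts = line.split()
        if len(parts) < 4:
            continue
        try:
            num_objs = int(parts[2])
            objsize = int(parts[3])
        except ValueError:
            continue  # unexpected layout; ignore the row
        caches.append((parts[0], num_objs * objsize))
    caches.sort(key=lambda cache: cache[1], reverse=True)
    return caches[:limit]
```

The cache names at the top of this list would give a hint about which kernel subsystem is holding the unreclaimable memory.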