pingcap / ng-monitoring


Abnormal memory usage #267

Open mornyx opened 2 months ago

mornyx commented 2 months ago

Phenomenon

We found that ng-monitoring sometimes uses an unexpectedly large amount of memory, up to tens of GiB (on a cluster with a medium number of nodes). This may even lead to OOM.

Troubleshooting

Observing the buffer/cache portion of memory usage shows that a large share of it comes from badger. Notably, high memory usage is usually accompanied by abnormal disk space usage in the docdb directory, which is maintained by badger. In particular, whenever the problem occurs, the docdb directory contains a large number of .sst files (hundreds to thousands), which suggests that badger's LSM-tree compaction is not working as expected. Combined with the buffer/cache observation, we infer that the abnormal memory usage is caused by these abnormal on-disk files.

Purpose

  1. Fix the excessive number of .sst files in badger, and the abnormal memory usage it causes, through an upgrade or other means.
  2. Try using another storage engine, such as SQLite. (#116)

mornyx commented 2 months ago

/assign