MaterializeInc / materialize

The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.
https://materialize.com

feature-benchmark: Measure peak memory usage using various statistics (possibly including jemalloc, getrusage, etc.) #23524

Open def- opened 9 months ago

def- commented 9 months ago

Feature request

The feature benchmark currently measures memory consumption at the end of the test. I think peak usage makes more sense, so rather than introducing an additional measure, we should probably replace the existing memory consumption measure with it.

philip-stoev commented 9 months ago

I think what needs to be measured is the maximum reported across all sources: jemalloc, /proc/, docker stats, you name it. This ensures proper operation in case any of those tends to under-count or does not reflect actual or peak usage at all.
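A minimal sketch of the "maximum across all sources" idea, not the benchmark's actual code. The helper names (getrusage_peak_bytes, proc_peak_bytes, peak_memory_bytes) are hypothetical, and only two sources are wired in: getrusage and the VmHWM field of /proc/self/status. jemalloc and docker stats are left out because reading them depends on the deployment. It measures the calling process, so a real harness would need equivalent readers for the processes under test.

```python
import resource
import sys
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple


def getrusage_peak_bytes() -> Optional[int]:
    """Peak RSS of this process according to getrusage."""
    maxrss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    return maxrss if sys.platform == "darwin" else maxrss * 1024


def proc_peak_bytes() -> Optional[int]:
    """Peak RSS from the VmHWM field of /proc/self/status, if available."""
    status = Path("/proc/self/status")
    if not status.exists():
        return None
    for line in status.read_text().splitlines():
        if line.startswith("VmHWM:"):
            # Line looks like "VmHWM:     12345 kB".
            return int(line.split()[1]) * 1024
    return None


def peak_memory_bytes(
    sources: Dict[str, Callable[[], Optional[int]]]
) -> Tuple[str, int]:
    """Return (source name, value) for the largest reading; skip sources that return None."""
    readings = {name: read() for name, read in sources.items()}
    available = {name: value for name, value in readings.items() if value is not None}
    if not available:
        raise RuntimeError("no memory source produced a reading")
    winner = max(available, key=available.get)
    return winner, available[winner]


# Hypothetical source table; further readers could be added here.
SOURCES = {
    "getrusage": getrusage_peak_bytes,
    "proc_status": proc_peak_bytes,
}

if __name__ == "__main__":
    name, value = peak_memory_bytes(SOURCES)
    print(f"peak memory: {value} bytes (reported by {name})")
```

Taking the maximum guards against under-counting sources, but it also means a single over-counting source dominates the result.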

antiguru commented 9 months ago

If you have access to metrics, mz_metrics_libc_ru_maxrss on a clusterd reports getrusage's maxrss attribute.
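As a rough illustration, assuming the metric is exposed in the Prometheus text exposition format, the value could be pulled out of a scraped metrics page like this. The function name metric_values and the sample text are made up for the example; the labels, help text, and units of the real metric are not confirmed here, and how the page is fetched is left out.

```python
from typing import List


def metric_values(exposition: str, name: str) -> List[float]:
    """Collect every sample value for `name` from Prometheus-style text."""
    values = []
    for line in exposition.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comments.
        if not line or line.startswith("#"):
            continue
        if not line.startswith(name):
            continue
        rest = line[len(name):]
        if rest.startswith("{"):
            # Drop the label set, e.g. {replica="r1"}.
            rest = rest[rest.rfind("}") + 1:]
        elif not rest[:1].isspace():
            # A different metric that merely shares the prefix.
            continue
        fields = rest.split()
        if fields:
            # The first field is the value; an optional timestamp may follow.
            values.append(float(fields[0]))
    return values


# Hypothetical sample, not real output from clusterd.
SAMPLE = """\
# TYPE mz_metrics_libc_ru_maxrss gauge
mz_metrics_libc_ru_maxrss 204800
mz_metrics_libc_ru_maxrss_other 1
"""

if __name__ == "__main__":
    print(max(metric_values(SAMPLE, "mz_metrics_libc_ru_maxrss")))
```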

umanwizard commented 9 months ago

I don't think we should use jemalloc stats at all, as they can overcount. For example, a very large allocation will count as allocated according to jemalloc but won't result in actual physical memory usage until individual pages are touched. If possible, we should use getrusage as @antiguru suggests.
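The difference between reserved and touched memory can be observed with a small experiment. This is a sketch using an anonymous mmap in Python rather than jemalloc, so it only illustrates the page-touching behaviour, not jemalloc's accounting; the function name measure and the 64 MiB size are arbitrary choices for the example. On Linux, ru_maxrss is typically unchanged right after the mapping is created and grows only once the pages are written.

```python
import mmap
import resource

SIZE = 64 * 1024 * 1024  # 64 MiB; arbitrary size for the demonstration


def maxrss() -> int:
    """Peak RSS so far (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure():
    baseline = maxrss()
    # Anonymous mapping: address space is reserved, but no page is touched yet.
    buf = mmap.mmap(-1, SIZE)
    after_map = maxrss()
    # Write one byte per page so the kernel has to back each page with memory.
    for offset in range(0, SIZE, mmap.PAGESIZE):
        buf[offset] = 1
    after_touch = maxrss()
    buf.close()
    return baseline, after_map, after_touch


if __name__ == "__main__":
    baseline, after_map, after_touch = measure()
    print(f"baseline={baseline} after_map={after_map} after_touch={after_touch}")
```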