Open sarum9in opened 11 years ago
I can reproduce this issue.
Looking at memory.txt, I currently don't see a way to exclude the cache accounting from the high-water value (memory.max_usage_in_bytes).
At the very least, I'll change the program output/README so that this value is no longer reported as plain RSS.
Output/README was changed in:
https://github.com/gsauthof/cgmemtime/commit/8098f9bdce1af4e85cbf5d7f3d2a06083e1bb3ed
Any updates on this? Any hope of improvement with cgroup-v2?
This issue affects the stability of our performance measurements. When the OS has been under heavy stress and the caches have been dropped, we get completely inconsistent measurements (e.g. 10x above normal).
It can be reproduced with any simple program that reads data from disk, by measuring it before and after dropping the caches with "sync; echo 3 > /proc/sys/vm/drop_caches".
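A minimal sketch of such a "simple program that reads data from disk", written in Python. The function name `read_all` and the chunk size are made up for illustration and are not part of cgmemtime. The program itself keeps only one chunk in memory at a time, so any large difference in the reported high-water value between a warm-cache run and a run right after dropping the caches would come from page-cache accounting rather than from the program's own allocations.

```python
import sys


def read_all(path, chunk_size=1 << 20):
    """Read the file at `path` chunk by chunk and return the number of bytes read.

    Only one chunk is held at a time, so the process's own memory use stays small.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total


if __name__ == "__main__":
    # Hypothetical usage: pass the path of a large file as the only argument.
    if len(sys.argv) == 2:
        print(read_all(sys.argv[1]))
```

Measuring this script once with the file already cached and once directly after the drop_caches command above should be enough to compare the two high-water values.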
@beatmax, no, there is no update so far. Unfortunately, the cgroup-v2 interface doesn't provide any high-water information (cf. Section 5.2, Memory). That means it doesn't even provide the high-water information that cgroup-v1 does.
See https://www.kernel.org/doc/Documentation/cgroups/memory.txt, section 5.2, for definitions.
You will find that usage_in_bytes reports the RSS+CACHE value, which is not the resident set size: according to the documentation above (see the second note in section 5.2), the resident set size is rss + file_mapped from memory.stat.
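To make the difference between the two quantities concrete, here is a small Python sketch that parses the text of a cgroup-v1 memory.stat file and computes both sums. It assumes the field names `rss`, `cache` and `mapped_file` (the name I believe memory.stat uses for what the note calls file_mapped). The helper names and the sample numbers are invented for illustration only.

```python
def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup-v1 memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def resident_set_size(stats):
    """rss + mapped_file: the resident set size per the note in section 5.2."""
    return stats["rss"] + stats["mapped_file"]


def rss_plus_cache(stats):
    """rss + cache: the kind of value that usage_in_bytes reflects."""
    return stats["rss"] + stats["cache"]


if __name__ == "__main__":
    # Invented sample values, not taken from a real measurement.
    sample = "cache 536870912\nrss 1048576\nmapped_file 4096\n"
    stats = parse_memory_stat(sample)
    print("resident set size:", resident_set_size(stats))
    print("rss + cache:      ", rss_plus_cache(stats))
```

With a real cgroup, the text would be read from that group's memory.stat file instead of the sample string. Note that this only gives the current values, not a high-water mark.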
Example: consider a file "test" of about 512 MiB.
As you can see, the recursive RSS value (generated from max_usage_in_bytes) is affected by the file cache. In the first example, cache accounting was avoided by sending the output to a pipe, which is not affected by the cache; the last dd was not accounted to the cgroup. In the second example, dd wrote the file itself, so the cache was accounted, which gives an incorrect value.
P.S. I am currently working on a memory accounting problem with the help of cgroups. If you have any ideas on how to implement real RSS high-water accounting without CACHE, please contact me or post them in a reply.