awslabs / damo

DAMON user-space tool
https://damonitor.github.io/
GNU General Public License v2.0

Question about damos_quota at reclaim.c #82

Open honggyukim opened 8 months ago

honggyukim commented 8 months ago

Hi SeongJae,

I use the quotas feature in my implementation, but I don't think I fully understand the meaning of each field.

Could you please explain the damos_quota setting in mm/damon/reclaim.c?

static struct damos_quota damon_reclaim_quota = {
        /* use up to 10 ms time, reclaim up to 128 MiB per 1 sec by default */
        .ms = 10,
        .sz = 128 * 1024 * 1024,
        .reset_interval = 1000,
        /* Within the quota, page out older regions first. */
        .weight_sz = 0,
        .weight_nr_accesses = 0,
        .weight_age = 1
};
DEFINE_DAMON_MODULES_DAMOS_QUOTAS(damon_reclaim_quota);
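
To make the three weight fields concrete, here is a minimal sketch of "prioritize regions within the quota". It is not the kernel's scoring code: the `Region` class, the normalized weighted sum in `score()`, and the greedy selection in `pick_within_quota()` are all hypothetical simplifications, and the kernel's actual scoring and selection differ in detail. It only illustrates why weights of 0, 0, 1 mean "older regions first".

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    sz: int           # region size in bytes
    nr_accesses: int  # observed access frequency
    age: int          # how long the region has kept its access pattern


def score(region, weights, maxima):
    """Hypothetical weighted score in [0, 1]; higher means 'apply first'."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    s = 0.0
    for field, weight in weights.items():
        # Normalize each field by the largest value seen among the regions.
        s += weight * getattr(region, field) / max(maxima[field], 1)
    return s / total


def pick_within_quota(regions, weights, sz_quota):
    """Pick regions in priority order until the size quota is used up."""
    if not regions:
        return []
    maxima = {f: max(getattr(r, f) for r in regions) for f in weights}
    ordered = sorted(regions, key=lambda r: score(r, weights, maxima),
                     reverse=True)
    chosen, used = [], 0
    for r in ordered:
        if used + r.sz > sz_quota:
            continue  # does not fit in the remaining quota
        chosen.append(r.name)
        used += r.sz
    return chosen


MiB = 1024 * 1024
regions = [
    Region("a", 64 * MiB, nr_accesses=0, age=3),
    Region("b", 64 * MiB, nr_accesses=0, age=10),
    Region("c", 64 * MiB, nr_accesses=0, age=5),
]
# Same weights as damon_reclaim_quota above: only age counts.
weights = {"sz": 0, "nr_accesses": 0, "age": 1}
print(pick_within_quota(regions, weights, 128 * MiB))
```

With a 128 MiB quota and three 64 MiB regions, only the two oldest regions are selected in this model.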

Since I use a 100 ms sampling interval and a 2-second aggregation interval, both 20 times the default settings, I multiplied each of the values above by 20 to keep the proportions, but I'm not sure whether the resulting numbers are sensible.

It would help me tune my damos_quota if I could better understand existing use cases such as reclaim.c.

Thanks!

sj-aws commented 8 months ago

Hello Honggyu,

DAMOS charges each scheme's quota usage and resets the charge every reset_interval. Meanwhile, the current mainline implementation of DAMOS applies actions once per aggregation interval.

In your case, since the aggregation interval is 2 seconds, DAMON_RECLAIM will in effect reclaim up to 128 MiB per 2 seconds. Depending on the use case, that might not make sense. This kind of issue makes large aggregation intervals difficult to use.
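
The "128 MiB per 2 seconds" figure can be reproduced with a small back-of-the-envelope model. This is only a sketch: it assumes the scheme is applied once per apply interval (the aggregation interval in current mainline, as stated above) and can use at most `sz` bytes between two quota resets. The function and parameter names are made up for illustration and are not part of DAMON or damo.

```python
MiB = 1024 * 1024


def effective_bytes_per_sec(sz_bytes, reset_interval_ms, apply_interval_ms):
    """Upper bound on throughput in a simplified model.

    If the scheme runs less often than the quota resets, every run starts
    with a fresh quota, so the apply interval becomes the limiting window.
    """
    window_ms = max(reset_interval_ms, apply_interval_ms)
    return sz_bytes * 1000 / window_ms


# DAMON_RECLAIM defaults: 128 MiB per 1000 ms.
default_aggr = effective_bytes_per_sec(128 * MiB, 1000, 100)
long_aggr = effective_bytes_per_sec(128 * MiB, 1000, 2000)

print(f"100 ms aggregation: {default_aggr / MiB:.0f} MiB/s")
print(f"2 s aggregation:    {long_aggr / MiB:.0f} MiB/s")
```

In this model the 2-second aggregation interval halves the achievable rate to 64 MiB/s even though the quota fields are unchanged.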

Fortunately, we recently found a similar issue. We believe it comes from the unnecessary alignment of the DAMOS action apply interval with the access sampling results aggregation interval. The alignment was needed because DAMON could provide a good-quality snapshot of the monitoring results only once per aggregation interval.

Hence we made DAMON provide reasonable monitoring results regardless of the aggregation interval [1], and made the DAMOS action apply interval independent of the aggregation interval [2].

The patches are currently merged in the damon/next and mm trees. damo has also been updated to support the features. Could you please check if those can help your case?

[1] https://lore.kernel.org/damon/20230915025251.72816-1-sj@kernel.org/
[2] https://lore.kernel.org/damon/20230916020945.47296-1-sj@kernel.org/

honggyukim commented 8 months ago

Hi SeongJae, thanks for the comment. I will see if those can enhance our workload.

honggyukim commented 8 months ago

Hi SeongJae,

Here is the setting that I have tested.

                            "quotas": {
                                "time_ms": "1 s",
                                "sz_bytes": "50 GiB",
                                "reset_interval_ms": "20 s",
                                "weights": {
                                    "sz_permil": "0 %",
                                    "nr_accesses_permil": "0 %",
                                    "age_permil": "1 %"
                                }
                            },

I expected this to apply DAMOS actions aggressively, because it allows up to 50 GiB within 1 second. But I observed that it applies the DAMOS action more conservatively, not more aggressively. I might have misunderstood the relation between time_ms, sz_bytes and reset_interval_ms.
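
One way to read the three fields, going by the comment in reclaim.c quoted at the top of this thread ("use up to 10 ms time, reclaim up to 128 MiB per 1 sec"), is that both limits are budgets per reset interval rather than "sz_bytes within time_ms". The sketch below only does that arithmetic; `quota_budget()` is a hypothetical helper, not a damo or kernel function.

```python
GiB = 1024 ** 3


def quota_budget(time_ms, sz_bytes, reset_interval_ms):
    """Translate per-reset-interval limits into per-second figures."""
    return {
        # Fraction of wall-clock time the scheme may spend applying actions.
        "max_time_share": time_ms / reset_interval_ms,
        # Size limit averaged over the reset interval.
        "max_bytes_per_sec": sz_bytes * 1000 / reset_interval_ms,
    }


# The setting above: 1 s of time and 50 GiB per 20 s.
budget = quota_budget(time_ms=1000, sz_bytes=50 * GiB, reset_interval_ms=20_000)
print(f"time share: {budget['max_time_share']:.0%}")
print(f"size rate:  {budget['max_bytes_per_sec'] / GiB:.1f} GiB/s")
```

Under this reading the setting allows at most 5% of one CPU's time and an average of 2.5 GiB/s, whichever limit is hit first, which may be consistent with the roughly 5% CPU usage reported further down in this thread.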

I have looked at the related kernel code for reset_interval at https://github.com/torvalds/linux/blob/v6.4/mm/damon/core.c#L967-L1012, but the logic isn't straightforward to me.

Could you please explain more about the meaning of each field? If possible, it'd be great if the kernel code had more descriptive comments for other code readers as well.
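
For readers following along, here is a simplified Python model of how the time quota and the size quota appear to combine into one effective size quota in that part of core.c. It is written from a reading of the v6.4 source, not copied from it: the function name, the assumed 4 KiB page size, and the initial throughput guess of `PAGE_SIZE * 1024` bytes per millisecond should all be treated as assumptions and checked against the kernel version in use.

```python
PAGE_SIZE = 4096  # assumed page size in bytes
GiB = 1024 ** 3


def effective_size_quota(ms, sz, total_charged_sz, total_charged_ns):
    """Model of the effective size quota for one charge window.

    ms, sz:            the configured time (ms) and size (bytes) quotas
    total_charged_sz:  bytes the scheme has been applied to so far
    total_charged_ns:  time spent applying the scheme so far
    """
    if ms == 0:
        return sz  # no time quota: the size quota is used as is
    if total_charged_ns:
        # Estimated throughput in bytes per millisecond, from past windows.
        throughput = total_charged_sz * 1_000_000 // total_charged_ns
    else:
        throughput = PAGE_SIZE * 1024  # initial guess before any history
    esz = throughput * ms  # the time quota converted into bytes
    if sz and sz < esz:
        esz = sz  # the smaller of the two limits wins
    return esz


# First window with time_ms = 1 s and sz_bytes = 50 GiB, no history yet.
first = effective_size_quota(1000, 50 * GiB, 0, 0)
print(f"first window: {first / GiB:.2f} GiB")

# Later window, after 1 GiB was processed in 1 second of scheme time.
later = effective_size_quota(1000, 50 * GiB, 1 * GiB, 1_000_000_000)
print(f"later window: {later / GiB:.2f} GiB")
```

In this model the 50 GiB size quota is never the binding limit: the 1-second time quota converts to a much smaller byte budget per 20-second window, which would make the scheme look conservative.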

Thanks!

honggyukim commented 8 months ago

I have observed that this quota setting dropped the CPU usage from roughly 30% to 5%. I would like to control this quota with a better understanding.

This might be a contrary view, but I also think that CPU usage doesn't have to be seriously considered on modern data center servers, because in many cases they have more than enough cores. I guess using 1 or 2 of those cores shouldn't be a big problem if the given DAMOS action manages the resource more efficiently.

But I do want to use the quota to adjust the aggressiveness of DAMOS action application rather than to keep CPU usage low. I see that this cannot be controlled by just increasing "age" without using a quota.

honggyukim commented 8 months ago

I see quota-related changes in the following commits.

  1. https://github.com/torvalds/linux/commit/2b8a248d5873343aa16f6c5ede30517693995f13: mm/damon/schemes: implement size quota for schemes application speed control
  2. https://github.com/torvalds/linux/commit/1cd2430300594a230dba9178ac9e286d868d9da2: mm/damon/schemes: implement time quota
  3. https://github.com/torvalds/linux/commit/38683e003153f7abfa612d7b7fe147efa4624af2: mm/damon/schemes: prioritize regions within the quotas

sj-aws commented 8 months ago

> Could you please explain more about the meaning of each field?

The document is available at https://docs.kernel.org/mm/damon/api.html#c.damos_quota, and I understand that neither the code nor the document is of good quality. If you could point out which specific parts are hard to understand, I may be able to give you a better answer. That would also help make the code and the document easier to read for everyone.

FYI, you could use DAMOS stats to better understand how your quotas are applied.
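
As a rough illustration of checking those stats by hand, the sketch below reads the per-scheme stat files from a DAMON sysfs `stats` directory. The file names follow the DAMON sysfs interface documentation as far as I know; the helper name and the example path are assumptions, so adjust them to your kdamond, context, and scheme indices. On a live system the stats typically need to be refreshed first by writing `update_schemes_stats` to the kdamond's `state` file.

```python
from pathlib import Path

# Stat files expected under .../schemes/<N>/stats/ (assumed from the docs).
STAT_FILES = ("nr_tried", "sz_tried", "nr_applied", "sz_applied", "qt_exceeds")


def read_scheme_stats(stats_dir):
    """Return the DAMOS stats found in stats_dir as a dict of integers."""
    stats = {}
    for name in STAT_FILES:
        path = Path(stats_dir) / name
        if path.exists():
            stats[name] = int(path.read_text().strip())
    return stats


def applied_ratio(stats):
    """Fraction of tried bytes the action was actually applied to."""
    tried = stats.get("sz_tried", 0)
    return stats.get("sz_applied", 0) / tried if tried else 0.0
```

For example, `read_scheme_stats("/sys/kernel/mm/damon/admin/kdamonds/0/contexts/0/schemes/0/stats")` would return the counters for the first scheme, assuming that path exists on your system. A growing `qt_exceeds` value suggests the quota, rather than the access pattern, is what limits the scheme.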

> I would like to control this quota with a better understanding. [...] I see that this cannot be controlled by just increasing "age" without using a quota.

Glad to hear you want to use it better. I also believe the quota is very important for in-production DAMOS uses, but using it well is not that simple. Stay tuned for more planned improvements to it and for conceivable use cases including tiered memory management, which will be shared at KernelSummit (https://lpc.events/event/17/contributions/1624/) or on the DAMON mailing list just before or after the event.

> CPU usage doesn't have to be seriously considered on modern data center servers

Agreed. For such cases, users could spawn multiple kdamonds to use multiple CPUs.