Closed honggyukim closed 3 months ago
Hi SeongJae, this is just to give you a rough idea so didn't complete the patch yet. I will update it if you don't have objection on this.
Example usage is as follows. Please note that the config.yaml
has some useful comments.
$ cat config.yaml
kdamonds:
- state: null
pid: null
contexts:
- ops: paddr
targets:
- pid: null
regions:
- start: 4,294,967,296 # '0x100000000'
end: 19,025,362,943 # '0x46dffffff'
intervals:
sample_us: 100 ms # 20 times of default sampling interval
aggr_us: 2 s # 20 times of default aggregation interval
ops_update_us: 20 s # This can be removed as paddr doesn't use it,
# but cannot be removed for now due to it has
# to be higher than aggr_us anyhow.
nr_regions:
min: '10'
max: 1,000
schemes: []
This config.yaml
can simply be passed to damo start
.
$ sudo ./damo.py start config.yaml
The damo status
shows it is correctly configured as follows.
$ sudo ./damo.py status
kdamond 0
state: on, pid: 25786
context 0
ops: paddr
target 0
pid: 0
region [4,294,967,296, 19,025,362,943) (13.719 GiB)
intervals: sample 100 ms, aggr 2 s, update 20 s
nr_regions: [10, 1,000]
The same config file in json format is longer than yaml format as follows.
{
"kdamonds": [
{
"state": null,
"pid": null,
"contexts": [
{
"ops": "paddr",
"targets": [
{
"pid": null,
"regions": [
{
"start": "4,294,967,296",
"end": "19,025,362,943"
}
]
}
],
"intervals": {
"sample_us": "100 ms",
"aggr_us": "2 s",
"ops_update_us": "20 s"
},
"nr_regions": {
"min": "10",
"max": "1,000"
},
"schemes": []
}
]
}
]
}
Thank you for this PR, Honggyu!
To my perspective, this PR is making two changes:
I think the first change will be very useful, as you nicely explained, and would like to merge it as soon as you drop RFC tag.
However, I'm not pretty sure if the second change is really needed, since same thing can be achieved by converting fmt_json
output to a yaml file. Also, we might want to support yet another format of DAMON parameters in future. In long term, I think having a new command for general converting of DAMON parameters in multiple formats, and eventually deprecating fmt_json
might be a better way. The command may supporting conversion of DAMON parameters in command line options, json, or yaml formats to different formats. E.g.,
$ damo convert_damon_parameters --damos_action stat --convert_output_format json
[...]
$ damo convert_damon_parameters ./damon_parameters.json --convert_output_format yaml
[...]
And I don't mind merging the first change now, and implementing the long term solution later.
May I ask your opinion, Honggyu?
Hi SeongJae,
Thanks for the useful feedback.
To my perspective, this PR is making two changes:
- Supporting yaml format DAMON parameter inputs, and
Do you mean supporting yaml format file input such as running damo start config.yaml
?
- Supporting command line options based DAMON parameters to yaml format DAMON parameters input conversion.
I think the first change will be very useful, as you nicely explained, and would like to merge it as soon as you drop RFC tag.
However, I'm not pretty sure if the second change is really needed, since same thing can be achieved by converting
fmt_json
output to a yaml file.
Do you mean you're not sure about adding fmt_yaml
command?
Also, we might want to support yet another format of DAMON parameters in future. In long term, I think having a new command for general converting of DAMON parameters in multiple formats, and eventually deprecating
fmt_json
might be a better way.
Yeah, I actually thought that the name fmt_json
isn't very good. I think having damo config
command might be better and providing --format=[json|yaml]
option can be useful.
The command may supporting conversion of DAMON parameters in command line options, json, or yaml formats to different formats. E.g.,
$ damo convert_damon_parameters --damos_action stat --convert_output_format json [...] $ damo convert_damon_parameters ./damon_parameters.json --convert_output_format yaml [...]
I feel the command name convert_damon_parameters
looks too long and verbose and it even requires another long option --convert_output_format
, which isn't good for general users as they can make typo errors very easily.
And I don't mind merging the first change now, and implementing the long term solution later.
That might be also useful as a first step for yaml support.
Thanks!
Do you mean supporting yaml format file input such as running damo start config.yaml? [...] Do you mean you're not sure about adding fmt_yaml command?
Right.
I think having
damo config
command might be better and providing--format=[json|yaml]
option can be useful.
I concern if config
might be too short and unclear. I think at least gen_config
or mkconfig
might be better name, but still I'm unsure if it is clearly explaining what it does (e.g., what config?).
Yeah, I actually thought that the name fmt_json isn't very good.
Cannot agree more. I indeed always wanted to change the name.
[...] I feel the command name
convert_damon_parameters
looks too long and verbose and it even requires another long option--convert_output_format
, which isn't good for general users as they can make typo errors very easily.
Again, agree. The names are only for example at the moment. One another idea off the top of my head is format damon_params --format {json,yaml,cmdline}
, but I'd like to take more time on this. Naming is always the most challenging and time-consuming real works of programmers :wink:
But I'm wondering what you think about the idea of the functionaility.
That might be also useful as a first step for yaml support.
Glad to get your confirm :)
Hi SeongJae,
I think having
damo config
command might be better and providing--format=[json|yaml]
option can be useful.I concern if
config
might be too short and unclear. I think at leastgen_config
ormkconfig
might be better name, but still I'm unsure if it is clearly explaining what it does (e.g., what config?).
Yeah, damo config
might be unclear, but just wanted to make it short. I feel gen_config
and mkconfig
both sound good, but I chose to use gen_config.py when I created a user tool in hmsdk.
Yeah, I actually thought that the name fmt_json isn't very good.
Cannot agree more. I indeed always wanted to change the name.
[...] I feel the command name
convert_damon_parameters
looks too long and verbose and it even requires another long option--convert_output_format
, which isn't good for general users as they can make typo errors very easily.Again, agree. The names are only for example at the moment. One another idea off the top of my head is
format damon_params --format {json,yaml,cmdline}
, but I'd like to take more time on this.
Yeah, let's take some time and think about it more.
Naming is always the most challenging and time-consuming real works of programmers 😉 But I'm wondering what you think about the idea of the functionaility.
Let me think more about the user interface.
And I don't mind merging the first change now, and implementing the long term solution later.
That might be also useful as a first step for yaml support.
Glad to get your confirm :)
I've just updated the patch removing fmt_yaml
support as you asked. Please read the commit message. Thanks.
I've removed fmt_yaml
support, but I would like to keep the code here for later reference. Please ignore it for now.
$ cat src/damo_fmt_yaml.py
# SPDX-License-Identifier: GPL-2.0
"""
Convert args to DAMON yaml input.
"""
from collections import OrderedDict
import yaml
import _damon
import _damon_args
class OrderedDumper(yaml.SafeDumper):
pass
def dict_representer(dumper, data):
return dumper.represent_dict(data.items())
OrderedDumper.add_representer(OrderedDict, dict_representer)
def yaml_dump(data):
return yaml.dump(data, Dumper=OrderedDumper, default_flow_style=False, sort_keys=False)
def main(args):
_damon.ensure_root_permission()
kdamonds, err = _damon_args.kdamonds_for(args)
if err:
print('invalid arguments (%s)' % err)
exit(1)
for k in kdamonds:
for c in k.contexts:
for s in c.schemes:
s.stats = None
s.tried_regions = None
if args.schemes_only:
schemes = []
for kdamond in kdamonds:
for ctx in kdamond.contexts:
schemes += ctx.schemes
print(yaml_dump([s.to_kvpairs(args.raw) for s in schemes]))
return
print(yaml_dump({'kdamonds': [k.to_kvpairs(args.raw) for k in kdamonds]}))
def set_argparser(parser):
_damon_args.set_argparser(parser, add_record_options=False)
parser.add_argument('--schemes_only', action='store_true',
help='print schemes part only')
parser.add_argument('--raw', action='store_true',
help='print numbers in machine friendly raw form')
I'm also thinking about adding a convenience script json_to_yaml.py
somewhere as follows.
$ cat json_to_yaml.py
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0
import sys
import json
import yaml
def json_to_yaml(json_file_path):
with open(json_file_path, 'r') as json_file:
json_data = json.load(json_file)
return yaml.dump(json_data, default_flow_style=False,
sort_keys=False, allow_unicode=True)
if __name__ == "__main__":
if len(sys.argv) != 2:
print("Usage: ./json_to_yaml.py <json_file_path>")
sys.exit(1)
json_file_path = sys.argv[1]
yaml_str = json_to_yaml(json_file_path)
print(yaml_str)
The updated commit looks good to me. Thank you Honggyu!
The current damo accepts json format for configuration, but json has some limitation as follows.
- cannot add comment
- verbose due to inherent open-close pair of braces
I already merged this PR, but let me clarify my thoughts about the above argument.
json
doesn't support commenting, but you could have similar ones by adding fake key/value pairs in maps. For example,
{
"kdamonds": [
{
"comment": ["this kdamond is for ..."],
"state": null,
"pid": null,
"contexts": [
{
"comment": ["this context monitors physical address space"],
"ops": "paddr",
[...]
The trick doesn't work for lists, and the strict syntax of json makes it further tedious, though.
To my understanding, the new lines are not really required, so you can collapse the things. E.g.,
{
"kdamonds": [{
"state": null, "pid": null,
"contexts": [{"ops": "paddr",
"targets": [{"pid": null,
"regions": [{"start": "4,294,967,296", "end": "19,025,362,943"}]}],
[...]
Of course, that requires manual works.
As also mentioned above, those are tricks that having their own limitation and require additional works. So I believe supporting yaml is a nice idea.
Hi SeongJae, thanks for merging this patch.
I already merged this PR, but let me clarify my thoughts about the above argument.
- Commenting on json can be done with some tricks.
json
doesn't support commenting, but you could have similar ones by adding fake key/value pairs in maps. For example,
Interesting trick. But I think it requires many users to be clever enough for this.
To my understanding, the new lines are not really required, so you can collapse the things. E.g.,
Yeah, that's true, but I wish we could have a way to make the json in that format automatically. I guess it's not widely used format.
- I believe supporting yaml is nice.
As also mentioned above, those are tricks that having their own limitation and require additional works. So I believe supporting yaml is a nice idea.
Thanks. We can think more about the user interface for generating yaml config just like damo fmt_json
does.
FYI, I'm planning to integrate every damo command for creating some sort of outputs including fmt_json
into one (probably show
). That may help supporting yaml-format DAMON parameter formatting (https://github.com/awslabs/damo/issues/105#issuecomment-2226357097)
Thanks. That would be useful!
FYI, the yaml
format support is added to a new command, args damon
(https://github.com/awslabs/damo/issues/105#issuecomment-2241282173).
Thanks for the update. But it looks the output is incorrect.
$ sudo ./damo args damon --format yaml
kdamonds:
- !!python/object/apply:collections.OrderedDict
- - - state
- null
- - pid
- null
- - contexts
- - !!python/object/apply:collections.OrderedDict
- - - ops
- paddr
- - targets
- - !!python/object/apply:collections.OrderedDict
- - - pid
- null
- - regions
- - !!python/object/apply:collections.OrderedDict
- - - start
- 4,294,967,296
- - end
- 825,707,462,655
- - intervals
- !!python/object/apply:collections.OrderedDict
- - - sample_us
- 5 ms
- - aggr_us
- 100 ms
- - ops_update_us
- 1 s
- - nr_regions
- !!python/object/apply:collections.OrderedDict
- - - min
- '10'
- - max
- 1,000
- - schemes
- []
I added OrderedDumper
to hide OrderedDict
in the yaml output before. Please see https://github.com/awslabs/damo/pull/97#issuecomment-2156630128.
Thank you for the kind comment, Honggyu. I think that part would better to be removed for simpler output. But I'm not sure if it is incorrect. Can the OrderedDict
part can make parsing of the output fail? If not, I think we can call it "too verbose" or "ugly", but "incorrect" is not technically correct?
Anyway I'll try to learn from the code you kindly shared. If you don't mind, please feel free to upload a PR instead of waiting for me, though.
The generated yaml file fails to be used for damo start
as follows.
$ sudo ./damo args damon --format yaml > test.yaml
$ sudo ./damo start test.yaml
could not turn on damon (cannot create kdamonds from args (target 'test.yaml' is not supported))
Unlike yaml format, json format works fine as expected.
$ sudo ./damo args damon --format json > test.json
$ sudo ./damo start test.json
$ sudo ./damo status
kdamond 0
state: on, pid: 308881
context 0
ops: paddr
target 0
pid: 0
region [4,294,967,296, 551,903,297,535) (510.000 GiB)
intervals: sample 5 ms, aggr 100 ms, update 1 s
nr_regions: [10, 1,000]
Interesting. Thank you for reporting this. I will take a look later. For tracking, opened an issue: https://github.com/awslabs/damo/issues/107
Thanks for creating the issue!
The fix for the issue is just pushed. Just for a clarification.
Can the OrderedDict part can make parsing of the output fail? If not, I think we can call it "too verbose" or "ugly", but "incorrect" is not technically correct?
It makes parsing fail if the parsing part uses yaml.safe_load()
. It can be parsed with custom loaders, though. So technically speaking, it is not "incorrect". Please refer to the fix commit for more details.
Thanks for the explanation. I have tried a bit with custom loader for yaml.load()
for the previously generated yaml output, but couldn't successfully load it after some trials.
Anyway, that might be because of my lack of knowledge of custom loader so I won't say that was incorrect and I will say it was too verbose. Thanks for making yaml support complete on generation side.
The current damo accepts json format for configuration, but json has some limitation as follows.
This patch introduces configuration with yaml format and its benefit is as follows.
The simple usage is as follows.