Open dliappis opened 1 year ago
Hi @dliappis, I'm interested in working on this issue, can you please assign me this. Thank you
@MadhuMPandurangi thank you for your interest. I've assigned it to you.
@dliappis I'm also interested. How about this way? I feel like I may reinvent the wheel somewhere else, though.
# erally/const.py
from collections import UserDict
class Node(UserDict):
def __init__(self):
super().__init__(self._iterattrs())
def _iterattrs(self):
attributes = vars(type(self))
for k in attributes:
if k.startswith('_'):
continue
yield k, attributes[k]
class Section(Node):
def _iterattrs(self):
for _, v in super()._iterattrs():
yield v[1], v
class SectionA(Section):
_section = "section_a"
snake_case_key = (_section, "snake_case_key")
dot_notation_key = (_section, "dot.notation.key")
class Root(Node):
section_a: SectionA = SectionA()
CONFIG_PATH = Root()
With the above, (section, key)
will be accessible like this:
from esrally.const import CONFIG_PATH as CFG
CFG.section_a.snake_case_key
CFG.section_a.dot_notation_key
CFG.section_a["dot.notation.key"]
So, the example code will be refactored in the following with some bonuses, such as nice code completion, pprint capability thanks to an inheritance from dict
, the same line length between before and after the change, etc. In addition to those, if you prefer, we can capitalize sections and keys (or inject more magics for auto-conversion) to be able to write like *CFG.SYSTEM.LIST_CONFIG_OPTION
.
cfg.add(config.Scope.applicationOverride, *CFG.system.list_config_option, args.configuration)
cfg.add(config.Scope.applicationOverride, *CFG.system.list_max_results, args.limit)
cfg.add(config.Scope.applicationOverride, *CFG.system.admin_track, args.track)
cfg.add(config.Scope.applicationOverride, *CFG.system.list_races_benchmark_name, args.benchmark_name)
cfg.add(config.Scope.applicationOverride, *CFG.system.list_from_date, args.from_date)
cfg.add(config.Scope.applicationOverride, *CFG.system.list_races_to_date, args.to_date)
--- a.py 2023-10-26 19:08:49.359047597 +0900
+++ b.py 2023-10-26 19:09:01.960758334 +0900
@@ -1 +1 @@
- cfg.add(config.Scope.applicationOverride, "system", "list.config.option", args.configuration)
+ cfg.add(config.Scope.applicationOverride, *CFG.system.list_config_option, args.configuration)
There'd be other ways, like using dataclass
or some non-standard libraries, but I thought this is probably good enough not to overkill.
cc @elastic/es-perf
@sakurai-youhei Many thanks for your interest in making Rally better, feel free to assign this issue to yourself and contribute.
I've looked at the proposal based on UserDict
. My understanding is the benefit of UserDict
is the following usage: CFG.section_a["dot.notation.key"]
but is this a requirement at all? I think all we need is to refer to config keys as class attributes, so perhaps there's a room for making it simpler? To be honest it took me a while to figure out what your code is doing.
I like the idea of returning tuples and unpacking them in cfg
API calls. I don't like the _section
repetition in the section class code.
I was thinking about something as simple as this:
class Section:
def __init__(self, name):
self._name = name
def __getattribute__(self, item):
# object.__getattribute__ required instead of self.__getattribute__ to prevent an infinite loop
return object.__getattribute__(self, "_name"), object.__getattribute__(self, item)
class System(Section):
def __init__(self):
super().__init__("system")
# description ...
list_config_option = "list.config.option"
# description ...
list_max_results = "list.max_results"
class Root:
system: System = System()
CFG = Root()
print(CFG.system.list_config_option)
print(CFG.system.list_max_results)
#print(CFG.system.does_not_exist) <--- error
with a simple path to .
removal (breaking change with config file version increment):
class Section:
def __init__(self, name):
self._name = name
def __getattribute__(self, item):
# object.__getattribute__ required instead of self.__getattribute__ to prevent an infinite loop
return object.__getattribute__(self, "_name"), item
class System(Section):
def __init__(self):
super().__init__("system")
# description ...
list_config_option = None
# description ...
list_max_results = None
I'm also curious what was your idea with the dataclass
?
To move forward I would suggest to start working on a single section (the smallest, the better) to see how it goes.
Edit: Changed super()
to object
in __getattribute__
and added a comment.
@gbanasiak Thank you for your comment.
I have played my idea, your idea, dataclass
, etc., more to see what would be the best, and I find enum
can be used to express dot-notation straightforwardly like the PR https://github.com/elastic/rally/pull/1795.
I hope you like it, as I have also incorporated the concept of avoiding repetition and giving descriptions.
P.S. I have come to the conclusion that the literal check like this PR #1798 should come first before DRY, refactoring, auto-completion, etc.
https://github.com/elastic/rally/actions/runs/6773039463/job/18406946647?pr=1798#step:5:202
@sakurai-youhei I like this approach. I'll leave more detailed comments in each PR.
The motivation of this issue is to prevent bugs like #1645.
The Config object uses various keys for storing configuration properties. Currently many of those get populated during argument parsing (e.g. https://github.com/elastic/rally/blob/4c7141aac0179c38c08c134c25f7a23e6f07d2d9/esrally/rally.py#L1024-L1029), and the same keys are referenced, again as strings, in other code locations e.g. in metrics.py.
This approach is brittle and error prone. We should DRY things up and instead reference the config options via meaningful variable names in some module (e.g. config.py) such that a typo becomes a real error.