Open omry opened 4 years ago
+1. Out of curiosity, why not use the default behavior of pythons operands for any types? I.e. don't check the type (let python return its errors if need be) and just call + , - , / , * . For example do string concatenation by default with +.
String concatenation is already supported directly:
url: http://${host}:${port}:${path}
It might still be worth supporting it, but some other things does not make immediate sense, for example:
a:
bb: 10
cb: 20
b:
xx: 30
yy: 40
c: foo
d1: ${a} + ${b}
d2: ${a} + ${c}
While it's plausible to come with interpretations for both d1 and maybe even d2, I suspect the surface area from supporting those interpretation is very large. I would like to keep operator support well defined and limited to the most impactful use cases.
Ok, I would personally just use python way of doing by returning
def interpolate_binary(a,b,op):
return op(OmegaConf.to_container(a, resolve=True),
OmegaConf.to_container(b, resolve=True))
In which case both d1 and d2 would raise a TypeError. That seems like the most intuitive and easiest way of maintaining things. But maybe it's naive, I didn't actually look through OmegaConf codebase much ...
Supporting + for arbitrary values may be incompatible with existing strings:
foo: FOO
bar: BAR
key: ${foo}+${bar}
current interpretation is FOO+BAR
, new interpretation will be FOOBAR.
limiting to numerical types reduces the incompatibility impact.
There should be a way to prevent operators from executing for cases like this.
In any case, when I try to actually implement it I will think about it more seriously.
@odelalleau,
I was thinking about introducing an eval resolver later to do arithmetic operations.
It could also be using a different syntax, like #{expression}
.
two : #{1 * 2}
five : #{1 + ${two} * 2}
# or using a regular resolver:
eight : ${eval:4 + ${two} * 2}
I kinda like #{} though (although I just realized it's a yaml comment so it will have unfortunate side effects. some other symbol?
%
or ^
?).
I was thinking about introducing an eval resolver later to do arithmetic operations.
Personally I would find it more intuitive to do it the way you initially wrote at the top of this issue, rather than introduce a new resolver or syntax. Did you have any major concern with your initial proposition? (I think it should remain limited to floats & integers, which should cover most common use cases, and people can still write custom resolvers for more advanced computations).
The main concern is backward compatibility. Interpreting all configs as expressions means that you can't have a config value that is an expression:
x: 10 * 2
Quoting it will not help because at YAML level it's already a string. An interpolation/syntax allow you to differentiate the two cases.
# "10 * 2"
x: 10 * 2
# 20
y: ${eval:10 * 2}
It is possible to only process expressions if they contain interpolations, but feels like a hack. This is useful expression that will not be handled:
# 10MB
y: 10 * 1024 * 1024
I guess we could go with the new decode resolver to support the string form:
y: ${oc.decode:"10 * 1024 * 1024"} # should be 10 * 1024 * 1024
Follow-up thoughts on this (tl;dr: I think it can be implemented at the level of resolver arguments, but this would make it a bit more cumbersome to use arithmetic operators in strings especially on Hydra's CLI, so maybe a unique syntax like #{..}
is better)
rating: 5*
an incorrect arithmetic expression, or a string?)/-+*
characters that are currently allowed in unquoted strings, so for instance one should replace ${foo:a-b+c}
with ${foo:"a-b+c"}
to keep the same behavior as before.oc.decode
resolver is needed here. The way I've implemented this resolver is by evaluating its input string as if it was a resolver argument. If we actually add this feature to regular resolver arguments, then ${oc.decode:"10 * 1024 * 1024"}
simply becomes equivalent to ${identity: 10 * 1024 * 1024}
, where identity
is a resolver that just returns its input.${:10 * 1024 * 1024}
. It could also be a different syntax for convenience, like the #{..}
you suggested. Also, I'm not sure what would be the timeline for this, but personally I won't have time to implement it in 2.1. What I can easily do, however, is trigger a deprecation warning for people who are using arithmetic operators in unquoted strings (/-+*
, and also probably %
), to tell them to use quoted strings instead. This will make it easier to introduce these operations in the grammar later.language=c++
would need to be quoted, which is always ugly in the shell: language="'c++'"
. So maybe we really want a separate parsing logic, which would motivate a new syntax like #{..}
.Makes sense. I don't think we need to deprecate anything at this time (since we are not sure about the final design yet). We can deprecate things in between 2.1 and 2.1 without a major version (e.g. 2.1.3). Also - this is also absolutely not getting into 2.1, so no worries there.
A few thoughts here
${eval:}
or #{expression}
syntax mentioned a lot. That seems to be very flexible, and by using the former I think we get nested interpolations for free, right?${eval:}
make use of something like ast.literal_eval()
to support more complex expressions? Likely this should be modified to not allow any arbitrary expression since that'd be a security risk. See below for desired example from dataclasses import dataclass
import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import OmegaConf
@dataclass
class Conf:
foo: str = 'bar'
baz: int = "${eval: 1 if ${foo} == 'bar' else 0}" # Magic, yet-to-be-implemented resolver
cs = ConfigStore.instance()
cs.store('config', Conf)
@hydra.main(config_name='config', config_path=None)
def main(cfg):
resolved_cfg = OmegaConf.to_container(cfg, resolve=True, enum_to_str=True)
print(OmegaConf.to_yaml(resolved_cfg))
if __name__ == '__main__':
main()
It'd be cool to have this output
python app.py
foo: bar
baz: 1
python app.py foo=other
foo: other
baz: 0
Currently to get the equivalent behavior, you need to introduce some auxiliary resolvers
def resolve_if(condition, true_value, false_value):
return true_value if condition else false_value
def resolve_eq(first, second):
return first == second
@dataclass
class Conf:
foo: str = 'bar'
baz: int = "${if: ${eq: ${foo}, 'bar'}, 1, 0}"
OmegaConf.register_new_resolver('if', resolve_if)
OmegaConf.register_new_resolver('eq', resolve_eq)
Personally I find the eval
interpolation preferable since the desired logic is immediately clear to anyone who understands Python. In comparison, the workaround would require the user to check the source code registered to each custom resolver in order to clarify what the expected result would be. Lastly, the eval
option doesn't require nested curly-braces
${if: ${eq: ${foo}, 'bar'}, 1, 0}
${eval: 1 if ${foo} == 'bar' else 0}
and by using the former I think we get nested interpolations for free, right?
Hopefully xD
I'd like to remark that using python's builtins.eval
already gets much of the job done:
from typing import Any
from omegaconf import OmegaConf
OmegaConf.register_new_resolver("eval", eval)
cfg = OmegaConf.create('a: "Computed ${eval: 2 * 3 * 5}"')
print(cfg.a) # Prints 'Computed 30'
Using nested interpolations is a little bit tricky, but careful quoting makes it possible:
cfg = OmegaConf.create(
"""
foo: bar
baz: ${eval:'1 if "${foo}" == "bar" else 0'}
"""
)
print(cfg.baz) # Prints '1'
The interpolation "${foo}"
needs to be quoted so that eval
will interpret its value as a string "bar"
and not as a variable bar
(which would cause a NameError).
@Jasha10 That's really cool! Ideally I would want there to be a syntax that works in all cases, but the spaces and quotes really make things quite tricky.
After importing/registering, I ran the following tests:
# Test A: Quote the entire RHS, with a space after `eval:`
def test_1a_eval_outer_with_space():
cfg = OmegaConf.create("""
a: '${eval: 2 * 3 * 5}'
""")
assert cfg.a == 30 # => pass!
def test_2a_nested_outer_with_space():
cfg = OmegaConf.create("""
foo: bar
baz: '${eval: 1 if "${foo}" == "bar" else 0}'
""") # => GrammarParseError: token recognition error at: '='
assert cfg.baz == 1
# --------------------------------------------------------------------
# Test B: Quote the insides only, with a space after `eval:`
def test_1b_eval_inner_with_space():
cfg = OmegaConf.create("""
a: ${eval: '2 * 3 * 5'}
""") # ^ => ScannerError: mapping values are not allowed here
assert cfg.a == 30
def test_2b_nested_inner_with_space():
cfg = OmegaConf.create("""
foo: bar
baz: ${eval: '1 if "${foo}" == "bar" else 0'}
""") # ^ => ScannerError: mapping values are not allowed here
assert cfg.baz == 1
# --------------------------------------------------------------------
# Test C: Quote entire RHS, with no space after `eval:`
def test_1c_eval_outer_no_space():
cfg = OmegaConf.create("""
a: '${eval:2 * 3 * 5}'
""")
assert cfg.a == 30 # => pass!
def test_2c_nested_outer_no_space():
cfg = OmegaConf.create("""
foo: bar
baz: '${eval:1 if "${foo}" == "bar" else 0}'
""") # => GrammarParseError: mismatched input '"' expecting BRACE_CLOSE
assert cfg.baz == 1
# --------------------------------------------------------------------
# Test D: Quote the insides only, with no space after `eval:`
def test_1d_eval_inner_no_space():
cfg = OmegaConf.create("""
a: ${eval:'2 * 3 * 5'}
""")
assert cfg.a == 30 # => pass!
def test_2d_nested_inner_no_space():
cfg = OmegaConf.create("""
foo: bar
baz: ${eval:'1 if "${foo}" == "bar" else 0'}
""")
assert cfg.baz == 1 # => pass!
# --------------------------------------------------------------------
The best option seems to be method D, immediately following eval:
with a quoted string, without putting a space in between.
But I don't have a clue as to why...
The best option seems to be method D, immediately following
eval:
with a quoted string, without putting a space in between.But I don't have a clue as to why...
If you don't quote the string then you are restricted to only characters allowed in unquoted strings, so things will break as soon as you use another character in your expression. For instance, quotes are not allowed, which explains some of your errors.
Putting a space after eval
actually works at the OmegaConf level, but it's failing for you because you're using YAML syntax and YAML doesn't like it.
Here are two ways to fix it (besides removing the space):
# Dropping YAML
def test_2b_nested_inner_with_space():
cfg = OmegaConf.create(
{
"foo": "bar",
"baz": """${eval: '1 if "${foo}" == "bar" else 0'}"""
}
)
assert cfg.baz == 1
# Quoting the YAML string (which requires escaping the inner quotes for the YAML parser)
# (note that in a .yaml file you would use a single \, here it is doubled because it is in a Python string)
def test_2b_nested_inner_with_space():
cfg = OmegaConf.create("""
foo: bar
baz: "${eval: '1 if \\"${foo}\\" == \\"bar\\" else 0'}"
""")
assert cfg.baz == 1
Thanks, that's helpful. That also explains why the ScannerError
s are YAML errors, but the GrammarParseError
s are OmegaConf errors.
But I have not understood how in this syntax how you can write things like:
a:
aa: 64
b:
bb: ${eval: '2 * ${a.aa}'}
The above should work but it doesn't in my case :(
a: aa: 64 b: bb: ${eval: '2 * ${a.aa}'}
The above should work but it doesn't in my case :(
It's probably just a YAML parsing error, you can fix it by removing the space after eval:
or enclosing bb's definition within double quotes.
I did this and now I get this: omegaconf.errors.UnsupportedInterpolationType: Unsupported interpolation type eval
Hi @ysig, see the OmegaConf docs on How to Perform Arithmetic Using eval as a Resolver.
I have seen them but I don't see where my error is. After all I just use multiplication.
On Tue, Dec 20, 2022, 14:15 Jasha Sommer-Simpson @.***> wrote:
Hi @ysig https://github.com/ysig, see the OmegaConf docs on How to Perform Arithmetic Using eval as a Resolver https://omegaconf.readthedocs.io/en/2.3_branch/how_to_guides.html#id1
— Reply to this email directly, view it on GitHub https://github.com/omry/omegaconf/issues/91#issuecomment-1359274045, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGY7H2OEOBIXVHNUU2UP25LWOGPPJANCNFSM4JTY26UA . You are receiving this because you were mentioned.Message ID: @.***>
I have seen them but I don't see where my error is. After all I just use multiplication.
You are most likely forgetting to register the eval
resolver:
OmegaConf.register_new_resolver("eval", eval)
Am I the only one concerned about the fact that registering eval
as a resolver opens a can of worms from a security perspective?
It opens the door to arbitrary code execution via configuration, so I find it hard to believe that it is the suggested official workaround! Moreover, such a suggestion is IMHO also inconsistent with other Hydra design choices. Take for instance the hydra.utils.instantiate
API: its purpose is evidently to allow to instantiate classes and call functions from configuration without allowing arbitrary code execution. If an eval
resolver was indeed considered a "valid choice" by Hydra documentation, then why should the hydra.utils.instantiate
API even exist, given that its use case is already covered by this resolver? The same applies to other more specific resolvers as well: why would they be needed if an eval
resolver exists? These questions are just to illustrate that suggesting an eval
resolver in the official docs is inconsistent with the rest of the API (which takes safe design choices).
Please don't get me wrong: I am a happy user of Hydra and I am thankful to all the contributors that work on this framework, I use the hydra.utils.instantiate
API and several specific resolvers, and I will welcome the feature described in this ticket, but I will definitely NOT include an eval
resolver in my application: it's a security risk and there are several better workarounds. Please consider removing the eval
suggestion from the docs.
Please consider removing the
eval
suggestion from the docs.
I agree it'd be good to add a warning, but IMO it's a useful tip that's worth keeping in the docs (there are many situations where people don't care about such a security risk -- I know I'm a happy user of the eval
resolver and I'm well aware that it allows arbitrary code execution through the config).
Regarding your comment on eval
making instantiate
useless, note that (i) eval
is not available by default (and shouldn't be, for the exact reason you mention here), and (ii) it's not as convenient to use as the instantiate
API to create complex objects.
Opinions can differ of course, but IMO just adding a warning because it can be a useful tip is not enough given that we are talking about the official docs. For one person like you that is using it who is perfectly conscious of the risks, I bet there is another (and probably more) who will use it without thinking a moment about the risks because "it's in the docs". If I didn't know better, I would happily include it because it's so-damn-practical!
I don't consider eval
an absolute bad practice: there are some scenarios where its use can be acceptable. But this is exactly THE scenario where using eval
is a no-go unless you really, really know what you are doing and you are in complete control.
We are talking about a one-liner that makes the entire application unsecure, just its presence means that any parameter that is configurable by the end-user can be abused for code execution. Just imagine a multi-dev scenario where a developer added this resolver, and another one unaware of it exposes an innocuous parameter on the front-end...
Since Hydra is a configuration framework, I really think this is where the docs should draw a line in the sand and absolutely discourage this. IMO, putting it in the docs means there is some blessing from the framework on using this pattern, even with a warning... Unless you make the warning really huge, but then why did you even include it in the docs?
IMO such a suggestion does not belong to official docs, rather to a stackoverflow thread so that it can get up/downvoted and commented, with other safer alternatives discussed.
The choice is down to Hydra developers of course, but personally, I think that a better solution would be either
add
, subtract
etc (with a couple examples, it's going to be a couple of lines); orRegarding your comment on eval making instantiate useless
My point was rather that this suggestion in the docs is inconsistent with the presence of other safer alternatives, not that instantiate
becomes absolutely useless. Anyway, the main point is the security aspect.
@teocasse what if there existed a oc.eval_num
resolver that would call eval()
under the hood, but would refuse to do so if there is any aphabetic character within the string? ([a-zA-Z_]
) Off the top of my head I don't think you can abuse eval()
without such a character (it would unfortunately prevent numbers like 10_000
unless we make the parsing smarter, but that's a small price to pay for safety).
I cannot comment whether the specific implementation you suggest would be secure, but on a more general level I would not be against a "safer" eval implementation. With eval
the common recommendation is to make sure that your input is sanitized before passing it, and your suggestion goes exactly in that direction. What was concerning me wasn't the usage of eval
per-se, rather its "naïve" usage in the suggested resolver.
This post suggests a few alternatives to "plain eval" in order to parse mathematical expressions in a secure way: https://stackoverflow.com/questions/2371436/evaluating-a-mathematical-expression-in-a-string
eval
is atrociously unsafe. This would be recreating the very famous historically gigantic yaml security vulnerabilities of arbitrary code execution from the configs.
People don't expect to have to monitor configs for arbitrary code execution injection, and don't do it.
desired behavior:
This should work only if both operands are numbers (int or float). result type should be int if both are ints or float otherwise.
EDIT (2022-12-20):
✨✨✨ See the OmegaConf docs on How to Perform Arithmetic Using eval as a Resolver ✨✨✨